CN106294542B - Method and system for mining and scoring letters and calls data - Google Patents
Method and system for mining and scoring letters and calls data
- Publication number
- CN106294542B (application CN201610585288.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- letters
- calls
- mining
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Fuzzy Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a method and system for mining and scoring letters and calls data. The method comprises: Step 1: extracting qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, processing the extracted data into mining data suitable for data mining, and storing the mining data in a mining database; Step 2: extracting at least one keyword from the mining data in the mining database and performing feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword; Step 3: performing statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establishing a comprehensive scoring standard based on the weight values of the different keywords. The invention integrates all letters and calls data that are scattered across separate, mutually isolated systems. By establishing a scoring standard, the letters and calls data can be classified and counted, which facilitates their subsequent handling.
Description
Technical field
The present invention relates to a method and system for mining and scoring letters and calls data, and belongs to the field of computer technology.
Background
Letters and calls (petitioning) refers to the activity in which citizens, legal persons, or other organizations use letters, e-mail, fax, telephone, visits, and similar forms to report situations or to submit suggestions, opinions, or complaints to people's governments at all levels or to departments of people's governments at or above the county level, which the relevant administrative organs then handle according to law.
Letters and calls are a way of resolving problems outside legal proceedings and a relatively direct form of expressing interests. The surge in letters and calls volume in recent years has produced a large accumulation of letters and calls data. How to turn these data into multi-level, multi-dimensional information and knowledge and reveal the logical associations behind the data, so that the government can effectively resolve prominent letters and calls conflicts at the policy level, is a major issue facing the letters and calls research field. In-depth analysis of letters and calls data is a prerequisite for solving this problem.
At present, the use of letters and calls data remains at the surface level of entry, query, and simple statistical summaries, and the deeper logical associations hidden in the data cannot be discovered. Yet these associations are precisely the crux of social conflicts and an important basis for guiding policy formulation.
Summary of the invention
The technical problem to be solved by the present invention is that the prior art has no unified large database, so letters and calls data cannot be retrieved as needed and problems reflected in the data cannot be resolved in time. To address these deficiencies, the invention provides a method and system for mining and scoring letters and calls data.
The technical solution of the present invention to the above technical problem is as follows: a method for mining and scoring letters and calls data, comprising the following steps:
Step 1: extract qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, process the extracted data into mining data suitable for data mining, and store the mining data in a mining database;
Step 2: extract at least one keyword from the mining data in the mining database, and perform feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
Step 3: perform statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establish a comprehensive scoring standard based on the weight values of the different keywords.
The beneficial effects of the present invention are as follows. The invention integrates all letters and calls data that are scattered across separate, mutually isolated systems; automatically extracts patterns, associations, changes, anomalies, and significant structures from the data; and mines valuable knowledge from the ever-growing body of letters and calls data, so that figures reflect regularities and conflicts and those regularities inform scientific decision-making. The comprehensive scoring system for letters and calls matters in the invention can predict aggressive letters and calls matters and aggressive petitioners that may emerge in the near term, drawing the attention of the relevant departments, which is highly beneficial to preventing and defusing social conflicts.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition.
Further, the process of extracting letters and calls data from the large database in Step 1 includes:
when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log; the extracted data are the qualifying letters and calls data.
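The timestamp-condition variant of this extraction can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table name `letters_and_calls`, its columns, and the `updated_at` field are all hypothetical, and an in-memory SQLite database stands in for the large database.

```python
import sqlite3

def extract_changed(conn, last_sync):
    """Return rows whose updated_at is later than the previous extraction time.

    ISO-8601 timestamp strings compare correctly as plain text, so a simple
    '>' condition is enough here (assumed schema, see lead-in).
    """
    cur = conn.execute(
        "SELECT id, channel, content, updated_at FROM letters_and_calls "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    )
    return cur.fetchall()

# Stand-in for the large database holding historical records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE letters_and_calls ("
    "id INTEGER PRIMARY KEY, channel TEXT, content TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO letters_and_calls VALUES (?, ?, ?, ?)",
    [
        (1, "letter", "road repair", "2016-07-01T09:00:00"),
        (2, "email", "noise complaint", "2016-07-20T14:30:00"),
        (3, "visit", "land dispute", "2016-07-24T08:15:00"),
    ],
)

# Only records changed since the last extraction qualify.
changed = extract_changed(conn, "2016-07-15T00:00:00")
```

An update-log variant would read change entries from a log table instead of filtering on a timestamp column; the text names both options without specifying either further.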
Further, the processing of the letters and calls data in Step 1 includes data cleaning and data transformation;
the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates;
the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining.
Further, the data cleaning includes deduplication, data item standardization, and denoising. Deduplication removes data that were entered repeatedly; data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; denoising removes noise data from the letters and calls data.
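The three cleaning operations can be illustrated with a small sketch. The record fields (`name`, `channel`, `content`), the channel mapping, and the rule that a record without a name or content counts as noise are assumptions made for the example; the text does not define them.

```python
def clean(records):
    """Deduplicate, standardize, and denoise raw records (illustrative rules)."""
    # Hypothetical mapping of differently entered channel names to one form.
    channel_map = {"e-mail": "email", "mail": "letter", "phone call": "phone"}
    seen = set()
    cleaned = []
    for rec in records:
        # Standardization: trim whitespace and unify spelling of the channel.
        name = rec.get("name", "").strip()
        content = " ".join(rec.get("content", "").split())
        channel = rec.get("channel", "").strip().lower()
        channel = channel_map.get(channel, channel)
        # Denoising: drop records that lack a name or any content.
        if not name or not content:
            continue
        # Deduplication: skip records already seen after standardization.
        key = (name, channel, content.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "channel": channel, "content": content})
    return cleaned

raw = [
    {"name": "Zhang San", "channel": "E-mail", "content": "Road  repair request"},
    {"name": "Zhang San ", "channel": "email", "content": "road repair request"},
    {"name": "", "channel": "letter", "content": "???"},
    {"name": "Li Si", "channel": "Mail", "content": "Noise complaint"},
]
result = clean(raw)
```

Standardizing before deduplicating matters here: the first two records only become recognizable as duplicates once spacing and channel spelling are unified.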
Further, the data transformation includes operations such as smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization.
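Two of the listed operations are easy to show concretely. The sketch below uses min-max scaling as one common form of standardization and fixed bin edges for discretization; both choices, the bin edges, and the sample values are assumptions, since the text names the operations without fixing a formula.

```python
def min_max(values):
    """Scale values linearly into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `labels` has one more entry than `edges`; values above every edge
    fall into the last label.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]

# Hypothetical handling times in days for four records.
days = [3, 10, 45, 120]
normalized = min_max(days)
levels = [discretize(d, [7, 30], ["short", "medium", "long"]) for d in days]
```

Discretized levels such as these are one way a continuous attribute could be turned into graded categories suitable for counting in an analysis table.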
Further, the keywords in Step 2 include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, the time taken by the letters and calls, and so on.
Further, in Step 3 each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest; the larger the weight value, the larger the percentage.
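The conversion from weight values to percentages can be sketched as a simple proportional share. The proportional formula is an assumption consistent with "the larger the weight value, the larger the percentage"; the keyword names and weight numbers are invented for illustration and do not come from the patent.

```python
def scoring_standard(weights):
    """Turn per-keyword weights into percentages of the overall score,
    sorted from largest to smallest."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    shares = {kw: 100.0 * w / total for kw, w in weights.items()}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical weights, one per keyword.
weights = {
    "aggressive_incidents": 4.0,
    "petitioners": 3.0,
    "occurrences": 2.0,
    "channels": 0.5,
    "time_taken": 0.5,
}
standard = scoring_standard(weights)
```

Because the shares are proportional, they always sum to 100, which matches a scoring scheme with a fixed full score.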
The technical solution of the present invention to the above technical problem further includes a system for mining and scoring letters and calls data, comprising:
an extraction module, which extracts qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, processes the extracted data into mining data suitable for data mining, and stores the mining data in a mining database;
a mining module, which extracts at least one keyword from the mining data in the mining database and performs feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
a standard establishing module, which performs statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establishes a comprehensive scoring standard based on the weight values of the different keywords.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition.
Further, the process by which the extraction module extracts letters and calls data from the large database includes:
when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log; the extracted data are the qualifying letters and calls data.
Further, the processing of the letters and calls data by the extraction module includes data cleaning and data transformation;
the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates;
the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining.
Further, the data cleaning includes deduplication, data item standardization, and denoising. Deduplication removes data that were entered repeatedly; data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; denoising removes noise data from the letters and calls data.
Further, the data transformation includes operations such as smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization.
Further, the keywords in the mining module include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, the time taken by the letters and calls, and so on.
Further, in the standard establishing module each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest; the larger the weight value, the larger the percentage.
Brief description of the drawings
Fig. 1 is a flowchart of the method for mining and scoring letters and calls data according to Embodiment 1 of the present invention;
Fig. 2 is a structural diagram of the system for mining and scoring letters and calls data according to Embodiment 2 of the present invention.
In the drawings, the components denoted by the reference numerals are as follows:
1. extraction module; 2. mining module; 3. standard establishing module.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the invention and are not intended to limit its scope.
As shown in Fig. 1, the method for mining and scoring letters and calls data according to Embodiment 1 of the present invention comprises the following steps:
Step 1: extract qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, process the extracted data into mining data suitable for data mining, and store the mining data in a mining database;
Step 2: extract at least one keyword from the mining data in the mining database, and perform feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
Step 3: perform statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establish a comprehensive scoring standard based on the weight values of the different keywords.
The letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition.
The process of extracting letters and calls data from the large database in Step 1 includes:
when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log; the extracted data are the qualifying letters and calls data.
The processing of the letters and calls data in Step 1 includes data cleaning and data transformation;
the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates;
the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining.
The data cleaning includes deduplication, data item standardization, and denoising. Deduplication removes data that were entered repeatedly; data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; denoising removes noise data from the letters and calls data.
The data transformation includes operations such as smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization.
The keywords in Step 2 include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, the time taken by the letters and calls, and so on.
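One way to picture the per-keyword analysis tables of Step 2 is as a mapping from each letters and calls matter to the value of that keyword. The sketch below is an assumption-laden illustration: the record fields (`matter_id`, `petitioner`, `channel`, `date`, `aggressive`) are hypothetical, and "time taken" is interpreted as the span between the first and last record of a matter, which the text does not define.

```python
from collections import defaultdict
from datetime import date

KEYWORDS = ("aggressive_incidents", "petitioners", "occurrences", "channels", "days_taken")

def build_analysis_tables(records):
    """Build one table per keyword: {keyword: {matter_id: value}}."""
    by_matter = defaultdict(list)
    for rec in records:
        by_matter[rec["matter_id"]].append(rec)

    tables = {kw: {} for kw in KEYWORDS}
    for matter_id, recs in by_matter.items():
        dates = [r["date"] for r in recs]
        tables["aggressive_incidents"][matter_id] = sum(1 for r in recs if r["aggressive"])
        tables["petitioners"][matter_id] = len({r["petitioner"] for r in recs})
        tables["occurrences"][matter_id] = len(recs)
        tables["channels"][matter_id] = len({r["channel"] for r in recs})
        # Assumed definition: days between first and last record of the matter.
        tables["days_taken"][matter_id] = (max(dates) - min(dates)).days
    return tables

# Invented sample records for two matters.
records = [
    {"matter_id": "M1", "petitioner": "A", "channel": "letter", "date": date(2016, 1, 5), "aggressive": False},
    {"matter_id": "M1", "petitioner": "B", "channel": "visit", "date": date(2016, 2, 4), "aggressive": True},
    {"matter_id": "M1", "petitioner": "A", "channel": "visit", "date": date(2016, 3, 5), "aggressive": False},
    {"matter_id": "M2", "petitioner": "C", "channel": "email", "date": date(2016, 4, 1), "aggressive": False},
]
tables = build_analysis_tables(records)
```

Each table can then be analyzed on its own in Step 3, for example to see how strongly that keyword separates matters that need attention from those that do not.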
In Step 3, each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest; the larger the weight value, the larger the percentage.
As shown in Fig. 2, the system for mining and scoring letters and calls data according to Embodiment 2 of the present invention comprises:
an extraction module 1, which extracts qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, processes the extracted data into mining data suitable for data mining, and stores the mining data in a mining database;
a mining module 2, which extracts at least one keyword from the mining data in the mining database and performs feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
a standard establishing module 3, which performs statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establishes a comprehensive scoring standard based on the weight values of the different keywords.
The letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition.
The process by which the extraction module 1 extracts letters and calls data from the large database includes:
when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log; the extracted data are the qualifying letters and calls data.
The processing of the letters and calls data by the extraction module 1 includes data cleaning and data transformation;
the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates;
the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining.
The data cleaning includes deduplication, data item standardization, and denoising. Deduplication removes data that were entered repeatedly; data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; denoising removes noise data from the letters and calls data.
The data transformation includes operations such as smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization.
The keywords in the mining module 2 include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, the time taken by the letters and calls, and so on.
In the standard establishing module 3, each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest; the larger the weight value, the larger the percentage.
By means of the proposed system for mining and scoring letters and calls data, the present invention integrates into the large database all letters and calls data that are scattered across separate systems and lines of business and isolated from one another, including the letters, visits to the city, abnormal visits, and visits to the State Bureau recorded in the Beijing letters and calls integrated office system, and the e-mails of the mayor's mailbox. A data acquisition platform extracts the letter, city-visit, abnormal-visit, State Bureau-visit, and mayor's-mailbox letters and calls item data from the Beijing letters and calls integrated office system and the mayor's mailbox system. The data acquisition platform provides functions for extracting letters and calls data, cleaning them, and loading them into the data mining database.
By integrating all letters and calls data, a series of new letters and calls concepts were extracted, including: letters and calls matter, petitioner, aggressive letters and calls matter, aggressive petitioner, first-time aggressive behavior, repeated aggressive behavior, and so on.
Data mining and intelligent analysis establish associations among all the letters and calls data and identify identical letters and calls matters and identical petitioners in the disordered, complex data of the multiple business systems. The key features for identifying the same petitioner are name, address, and ID card number (which may be absent). The key features for identifying the same letters and calls matter are the duplicate-determination mark of the letters and calls item, the reference identifier of the letters and calls item, the petitioner, and the summary information.
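Identifying the same petitioner from name, address, and an optional ID card number could look like the sketch below. The matching rule (compare ID numbers when both records have one, otherwise fall back to normalized name plus address) is an assumption; the text lists the key features but not how they are combined. All names, addresses, and ID values are placeholders.

```python
def _norm(text):
    """Remove all whitespace and lowercase, so spacing and case do not matter."""
    return "".join(text.split()).lower()

def same_petitioner(a, b):
    """Assumed rule: ID numbers decide when both are present; otherwise
    name and address must both match after normalization."""
    id_a, id_b = a.get("id_number"), b.get("id_number")
    if id_a and id_b:
        return id_a == id_b
    return _norm(a["name"]) == _norm(b["name"]) and _norm(a["address"]) == _norm(b["address"])

def group_petitioners(records):
    """Greedily place each record into the first group containing a match."""
    groups = []
    for rec in records:
        for group in groups:
            if any(same_petitioner(rec, other) for other in group):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups

people = [
    {"name": "Wang Wu", "address": "12 Example Road", "id_number": "ID-001"},
    {"name": "Wang  Wu", "address": "12 example road", "id_number": None},
    {"name": "Zhao Liu", "address": "8 Sample Lane", "id_number": None},
]
groups = group_petitioners(people)
```

A greedy grouping like this is only a starting point; real data with conflicting ID numbers or partial addresses would need a more careful linkage strategy.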
Key features are extracted for each letters and calls matter: the number of petitioners, the average number of letters and calls, the time of the letters and calls, the time at which aggressive behavior occurred, whether aggressive behavior exists, content category, purpose of the letters and calls, region, average age, and so on. Data feature analysis is first performed on the key features of petitioners and letters and calls matters. The analysis mainly combines dimensions such as content category, hot issue, region, average age, income bracket, whether aggressive behavior occurred, whether the matter is a group letters and calls matter, group letters and calls level (graded by the number of petitioners), and repeat letters and calls level (graded by the number of letters and calls). The main analysis indicators are letters and calls volume, timely acceptance rate, timely completion rate, and timely reply rate. Combining multiple dimensions in the analysis reveals data features. Data mining also performs in-depth data feature analysis on the group letters and calls and aggressive-behavior letters and calls events that receive particular attention. Through this feature analysis, we grasped the basic characteristics of the letters and calls data and the related in-depth statistical analysis results.
With a basic understanding of the data features of the letters and calls data, we performed targeted correlation analysis on several categories of closely watched data: total letters and calls volume, group letters and calls volume, repeat letters and calls volume, and aggressive-behavior letters and calls volume. This established the correlations between these volumes and the timely acceptance rate, timely reply rate, timely completion rate, average age, and income bracket (annual income).
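The text does not say which correlation measure was used. As one plausible reading, the sketch below computes the Pearson correlation coefficient between a volume series and a rate series; the numbers are invented solely to show the calculation and are not results from the patent.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)

# Invented illustrative series: repeat volume per period vs. timely reply rate.
repeat_volume = [120, 95, 80, 60, 40]
timely_reply_rate = [0.70, 0.78, 0.82, 0.90, 0.95]
r = pearson(repeat_volume, timely_reply_rate)
```

A coefficient near -1 or +1 indicates a strong negative or positive linear relationship, which is the kind of signal the later weighting step relies on.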
Through repeated comparison, sampling, and experimentation on these letters and calls data, a comprehensive scoring standard system for letters and calls matters was established. It gives a comprehensive score to each letters and calls matter and petitioner, and extracts the matters and petitioners that need particular attention according to features such as the severity and urgency of the matter.
Based on the data mining and intelligent analysis process above and on core letters and calls business requirements, we grasped the data features and correlation statistics of letters and calls matters and petitioners. From the correlation analysis, we identified which features, such as whether aggressive behavior occurred, group letters and calls level, repeat letters and calls level, and hot issue, are positively or negatively correlated, and thereby mined the core features that are highly correlated with the severity and urgency of a letters and calls matter. From the degree of correlation of these features, the weight of each is computed, finally yielding a comprehensive scoring system standard for calculating the score of a letters and calls matter.
Table 1 shows the resulting comprehensive scoring standard with a specific example. The full score for each letters and calls matter is 100 points. An additive (bonus-point) algorithm is used: the base score is 0 points, and points are added for each specific bonus item.
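The additive scheme (base score 0, full score 100) can be sketched as follows. Table 1 is not reproduced in this text, so the bonus items and point values below are hypothetical placeholders rather than the patent's actual standard.

```python
# Hypothetical bonus items and points; the real values are in Table 1,
# which is not available in this text.
BONUS_ITEMS = {
    "aggressive_behavior": 40,
    "group_letters_and_calls": 30,
    "repeat_letters_and_calls": 20,
    "hot_issue": 10,
}

def comprehensive_score(flags):
    """Start from a base of 0 and add points for each bonus item present,
    capping the result at the full score of 100."""
    score = 0
    for item, points in BONUS_ITEMS.items():
        if flags.get(item):
            score += points
    return min(score, 100)

score = comprehensive_score({"aggressive_behavior": True, "repeat_letters_and_calls": True})
```

With these placeholder values, a matter showing every bonus item reaches exactly the full score, and a matter with none stays at the base of 0.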
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (2)
- 1. A method for mining and scoring letters and calls data, characterized by comprising the following steps:
  Step 1: extracting qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, processing the extracted data into mining data suitable for data mining, and storing the mining data in a mining database;
  Step 2: extracting at least one keyword from the mining data in the mining database, and performing feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
  Step 3: performing statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establishing a comprehensive scoring standard based on the weight values of the different keywords;
  the letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition;
  the process of extracting letters and calls data from the large database in Step 1 includes: when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log, the extracted data being the qualifying letters and calls data;
  the processing of the letters and calls data in Step 1 includes data cleaning and data transformation; the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates; the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining;
  the data cleaning includes deduplication, data item standardization, and denoising; the deduplication removes data that were entered repeatedly; the data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; the denoising removes noise data from the letters and calls data;
  the data transformation includes smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization operations;
  the keywords in Step 2 include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, and the time taken by the letters and calls;
  in Step 3, each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest, wherein the larger the weight value, the larger the percentage.
- 2. A system for mining and scoring letters and calls data, characterized by comprising:
  an extraction module, which extracts qualifying letters and calls data from a large database, in which all historical letters and calls data are stored, processes the extracted data into mining data suitable for data mining, and stores the mining data in a mining database;
  a mining module, which extracts at least one keyword from the mining data in the mining database and performs feature extraction on the mining data based on each keyword to obtain an analysis table for each keyword;
  a standard establishing module, which performs statistical analysis on the mining data in the at least one analysis table to obtain a weight value for each keyword, and establishes a comprehensive scoring standard based on the weight values of the different keywords;
  the letters and calls data stored in advance in the large database include letter, e-mail, voice, video, and visit data obtained through data acquisition;
  the process by which the extraction module extracts letters and calls data from the large database includes: when data in the large database change, extracting the changed data from the large database by means of a timestamp condition or an update log, the extracted data being the qualifying letters and calls data;
  the processing of the letters and calls data by the extraction module includes data cleaning and data transformation; the data cleaning cleans the extracted letters and calls data to obtain standard letters and calls data free of duplicates; the data transformation converts the standard letters and calls data from transactional data into mining data suitable for data mining;
  the data cleaning includes deduplication, data item standardization, and denoising; the deduplication removes data that were entered repeatedly; the data item standardization records letters and calls data entered in different forms in a unified standard order, making the processed data easier to count; the denoising removes noise data from the letters and calls data;
  the data transformation includes smoothing, aggregation, data generalization, standardization, concept hierarchy generation, and discretization operations;
  the keywords in the mining module include the number of aggressive incidents, the number of petitioners, the number of letters and calls, the number of letters and calls channels, and the time taken by the letters and calls;
  in the standard establishing module, each keyword obtains its percentage of the overall score according to its weight value, and the comprehensive scoring standard is established after sorting the percentages of all keywords from largest to smallest, wherein the larger the weight value, the larger the percentage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610585288.2A CN106294542B (en) | 2016-07-25 | 2016-07-25 | Method and system for mining and scoring letters and calls data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610585288.2A CN106294542B (en) | 2016-07-25 | 2016-07-25 | Method and system for mining and scoring letters and calls data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106294542A CN106294542A (en) | 2017-01-04 |
CN106294542B CN106294542B (en) | 2018-03-30 |
Family
ID=57652139
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610585288.2A Active CN106294542B (en) | Method and system for mining and scoring letters and calls data | 2016-07-25 | 2016-07-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294542B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107527289B (en) * | 2017-08-25 | 2021-08-06 | 上海优扬新媒信息技术有限公司 | Investment portfolio industry configuration method, device, server and storage medium |
CN110717045A (en) * | 2019-10-15 | 2020-01-21 | 同方知网(北京)技术有限公司 | Letter element automatic extraction method based on letter overview |
CN112819352A (en) * | 2021-02-07 | 2021-05-18 | 神彩科技股份有限公司 | Environment data processing method and device, electronic equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103177009A (en) * | 2011-12-22 | 2013-06-26 | 苏州威世博知识产权服务有限公司 | Method and system of supporting automatic update of patent information |
CN103544255A (en) * | 2013-10-15 | 2014-01-29 | 常州大学 | Text semantic relativity based network public opinion information analysis method |
CN105138558A (en) * | 2015-07-22 | 2015-12-09 | 山东大学 | User access content-based real-time personalized information collection method |
CN105701084A (en) * | 2015-12-28 | 2016-06-22 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Characteristic extraction method of text classification on the basis of mutual information |
Also Published As
Publication number | Publication date |
---|---|
CN106294542A (en) | 2017-01-04 |
Similar Documents
Publication | Title |
---|---|
CN103024746B (en) | System and method for processing spam short messages for telecommunication operator |
Rowe et al. | Automated social hierarchy detection through email network analysis |
CN103744928B (en) | A kind of network video classification method based on history access record |
CN103927398B (en) | The microblogging excavated based on maximum frequent itemsets propagandizes colony's discovery method |
CN106294542B (en) | A kind of letters and calls data mining methods of marking and system |
CN101674264B (en) | Spam detection device and method based on user relationship mining and credit evaluation |
CN107086952A (en) | A kind of Bayesian SPAM Filtering method based on TF IDF Chinese word segmentations |
Katirai et al. | Filtering junk e-mail |
CN106131017A (en) | Cloud computing information security visualization system based on trust computing |
CN103580919B (en) | A kind of method and system that mail user mark is carried out using mail server daily record |
CN109284626A (en) | Random forests algorithm towards difference secret protection |
CN103037339A (en) | Short message filtering method based on user creditworthiness and short message spam degree |
CN102945246B (en) | The disposal route of network information data and device |
CN109919436A (en) | A kind of promise breaking user's probability forecasting method based on sparse features insertion |
CN108647730A (en) | A kind of data partition method and system based on historical behavior co-occurrence |
CN107403007A (en) | A kind of method of network Twitter message reliability discriminant model |
CN107844914A (en) | Risk management and control system and implementation method based on group management |
Leão et al. | Evolutionary patterns in the geographic range size of Atlantic Forest plants |
CN109783805A (en) | A kind of network community user recognition methods and device |
CN108090787A (en) | A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction |
CN110611655A (en) | Blacklist screening method and related product |
CN108920694A (en) | A kind of short text multi-tag classification method and device |
CN106557983B (en) | Microblog junk user detection method based on fuzzy multi-class SVM |
Mishra et al. | Analysis of random forest and Naive Bayes for spam mail using feature selection categorization |
CN104156228B (en) | A kind of embedded feature database of client filtering short message and update method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||