CN107038167A - Big data mining analysis system and analysis method based on model evaluation - Google Patents
- Publication number: CN107038167A
- Application number: CN201610077813.XA
- Authority: CN (China)
- Prior art keywords: model, business model, business, analysis, data
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Fuzzy Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a big data mining analysis system based on model evaluation, comprising: a distributed storage management module, which stores all data of the entire big data mining analysis system in a unified standard format; a business model building module, which selects the appropriate algorithms from the algorithm tool library according to the requirements of different industries to build business models; a model evaluation module, which grades each built business model against model evaluation indices and selects the business model with the best evaluation grade as the business model of the system; an algorithm tool library, which provides algorithm interfaces of a unified standard and includes two algorithm engine sets used for building business models and/or for data mining analysis; and a mining analysis module, which selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements. The invention also discloses a big data mining analysis method based on model evaluation.
Description
Technical field
The present invention relates to the field of computing, and more particularly to a big data mining analysis system based on model evaluation. The present invention further relates to a big data mining analysis method based on model evaluation.
Background
Big data technology is developing rapidly. Data technology has evolved from processing a single type of data on a single machine in its early days to processing multiple types of data on computer clusters today, enabling data analysis applications with loose time requirements. As data volumes grow to the PB and EB scale or even larger and faster processing and analysis times are required, the development trends of big data technology include dedicated big data computers, geographically distributed computer clusters, the processing and analysis of multi-type, multi-source data, the analysis of complex data types such as data networks, general-purpose techniques with second-level response times, and analysis applications oriented to various domains. Big data general-purpose technologies and open source projects represented by HDFS, GFS, MapReduce, Hadoop, Spark, Storm, HBase, and MongoDB are developing quickly, and big data preprocessing is an essential link in the big data processing pipeline.
Data mining is the process of using algorithms to extract implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random real-world application data. It achieves this goal mainly through methods from machine learning, statistics, neural networks, and databases.
At present, in big data mining analysis methods, researchers build a fixed model from business data and then mine and analyze the data according to that model. Such methods do not model according to business requirements, do not incorporate human experience to adapt the business model, and do not evaluate the model, assign it a grade, and select among models intelligently. Because big data mining serves many industries and fields, the quality of the constructed model often affects the accuracy of the data mining analysis results, which makes it difficult to support decision-making in industry fields. A big data mining analysis method is therefore needed that adjusts the business model automatically and evaluates the model.
Summary of the invention
The technical problem to be solved by the present invention is to provide a big data mining analysis system based on model evaluation that serves the business requirements of multiple industries and fields. The present invention also provides a big data mining analysis method based on model evaluation.
To solve the above technical problem, the big data mining analysis system based on model evaluation provided by the present invention comprises: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library, and a mining analysis module.
The distributed storage management module stores all data of the entire big data mining analysis system in a unified standard format. The stored data include at least: sample data, test data, data to be analyzed, analysis results, the human experience base, the algorithm tool library, and the business model library.
The sample library holds the typical historical data that business personnel accumulate during data analysis.
The human experience base holds the data analysis experience that business personnel have gained through long-term work, converted into data that a computer can recognize.
The business model building module extracts data items from the sample data according to the human experience base, combines different data items to form a data set, then selects algorithm parameters and an algorithm for the data set according to business requirements, and builds the corresponding business model.
The model evaluation module grades each built business model against the model evaluation indices to obtain its evaluation grade, and selects the business model with the best evaluation grade as the business model of the system. The model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model.
The algorithm tool library provides algorithm interfaces of a unified standard and includes two algorithm engine sets, used for building business models and/or for data mining analysis. Each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm.
The mining analysis module selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
In the algorithm tool library, the business model building algorithm engine set includes a classification algorithm engine, a social network analysis algorithm engine, and/or a graph algorithm engine; the mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine, and/or an assessment-type analysis algorithm engine.
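As an illustration only, the layering described above (a library holding two engine sets, each engine grouping algorithms behind one calling convention) could be sketched as follows. The class names, the `(data, params)` call signature, and the toy algorithm are assumptions made for this sketch and do not come from the patent.

```python
from typing import Any, Callable, Dict

# Hypothetical unified interface: every algorithm is a callable that takes
# a data set and a parameter dict. This convention is an assumption.
Algorithm = Callable[[list, Dict[str, Any]], Any]


class AlgorithmEngine:
    """An engine groups one or more algorithms behind the same interface."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.algorithms: Dict[str, Algorithm] = {}

    def register(self, algo_name: str, algo: Algorithm) -> None:
        self.algorithms[algo_name] = algo

    def run(self, algo_name: str, data: list, params: Dict[str, Any]) -> Any:
        return self.algorithms[algo_name](data, params)


class AlgorithmToolLibrary:
    """Holds two engine sets: one for model building, one for mining analysis."""

    def __init__(self) -> None:
        self.engine_sets: Dict[str, Dict[str, AlgorithmEngine]] = {
            "model_building": {},
            "mining_analysis": {},
        }

    def add_engine(self, engine_set: str, engine: AlgorithmEngine) -> None:
        self.engine_sets[engine_set][engine.name] = engine

    def get_engine(self, engine_set: str, name: str) -> AlgorithmEngine:
        return self.engine_sets[engine_set][name]


def count_above_threshold(data: list, params: Dict[str, Any]) -> int:
    """Toy algorithm: count values strictly above params['threshold']."""
    return sum(1 for x in data if x > params["threshold"])


library = AlgorithmToolLibrary()
situation = AlgorithmEngine("situation")
situation.register("count_above", count_above_threshold)
library.add_engine("mining_analysis", situation)

result = library.get_engine("mining_analysis", "situation").run(
    "count_above", [1, 5, 9], {"threshold": 4}
)
```

Because every algorithm shares one call signature in this sketch, the mining analysis module can pick an engine and algorithm by name without knowing how each one is implemented.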
When the classification algorithm engine is used for business model building and mining analysis requests, one or several algorithms in the engine cooperate to classify a batch of data together. The analysis accuracy of the model corresponding to each selected algorithm is compared in the model evaluation system, a weight is set for each algorithm accordingly, and the classification results are weighted; analyzing one batch of data with multiple algorithms in this way effectively improves classification accuracy. Common classification algorithms include support vector machine (SVM) classification, Bayesian classification, and artificial neural networks (ANN).
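The weighting of classification results described above can be illustrated with a weighted vote. This is a minimal sketch under assumptions: the patent does not specify how the weights are combined, so a simple sum of weights per label is used here, and the algorithm names, weights, and labels are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def weighted_vote(
    predictions: Dict[str, List[Hashable]],
    weights: Dict[str, float],
) -> List[Hashable]:
    """Combine per-algorithm label predictions by summing algorithm weights.

    predictions maps an algorithm name to its predicted label per sample;
    weights maps the same names to a weight (assumed here to reflect the
    analysis accuracy measured for that algorithm's model).
    """
    n_samples = len(next(iter(predictions.values())))
    combined = []
    for i in range(n_samples):
        scores = defaultdict(float)
        for algo, labels in predictions.items():
            scores[labels[i]] += weights[algo]
        # Pick the label with the highest total weight for this sample.
        combined.append(max(scores, key=scores.get))
    return combined


# Hypothetical outputs of three classifiers on the same three samples.
predictions = {
    "svm": ["spam", "ham", "spam"],
    "bayes": ["ham", "ham", "spam"],
    "ann": ["ham", "spam", "spam"],
}
weights = {"svm": 0.9, "bayes": 0.5, "ann": 0.3}
labels = weighted_vote(predictions, weights)
```

In the first sample the single high-weight classifier outvotes the two lower-weight ones, which is the effect that per-algorithm weights are meant to have.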
When the social network analysis algorithm engine is used for business model building and mining analysis requests, social network analysis is used to analyze the relationships between data. The algorithms in this engine are all based on social network analysis: they analyze the relationships between data items, taking the data items as the nodes of a graph and the relationships between them as its edges to build a social network graph.
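The graph construction described above (data items as nodes, relationships as edges) can be sketched with a plain adjacency structure. The undirected edges, the sample relations, and the degree measure are assumptions chosen for illustration; the patent does not name a specific representation or metric.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def build_social_graph(
    relations: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Set[Hashable]]:
    """Build an undirected graph: each data item is a node, each relation an edge."""
    graph = defaultdict(set)
    for a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree(graph: Dict[Hashable, Set[Hashable]]) -> Dict[Hashable, int]:
    """Number of direct relations per node, a basic social network measure."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Hypothetical relations between four data items.
relations = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("carol", "dave"),
]
graph = build_social_graph(relations)
degrees = degree(graph)
```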
When the graph algorithm engine is used for a mining analysis request, the layout algorithms and parsing algorithms in the engine are used to display and analyze the mining analysis results. The graph algorithm engine mainly presents the relationships between data in the form of a graph, and uses parsing algorithms to analyze the relationships between the nodes of the graph. Common graph algorithms include layout algorithms and clustering algorithms.
The situation-type analysis engine takes two forms: a statistical model and a drill-through analysis model. The statistical model aggregates data along chosen dimensions and, through statistical reports, provides same-period comparison and trend prediction functions. The situation-type analysis engine can quickly and efficiently produce dynamically correlated statistics across business data categories. The drill-through analysis model is a dynamic, three-dimensional, multidimensional way of presenting data: through cross-aggregation and drill-down analysis of the data, it presents information as dynamic multidimensional tables and multidimensional three-dimensional charts, and provides multidimensional comparison and situation positioning functions.
The early-warning-type analysis engine comprehensively analyzes all kinds of business information to determine strategy and tactics and guide the overall situation, thereby providing early warnings and reminders. It consists of two major parts, analysis model management and knowledge base maintenance, and performs probability and ratio calculations on each group of data results in order to draw and publish conclusions.
The assessment-type analysis engine uses historical data, the current state, and surrounding objective conditions to evaluate the current situation accurately and predict the trend for the next period. It is based on association analysis algorithms and statistical principles: it establishes close associations among the various kinds of data and derives the trend curves and calculation formulas that describe those relationships.
The business model building module builds a business model as follows:
S201: According to business requirements, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience base, extract data items from the sample data, combine different data items, and select algorithm parameters and an algorithm from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For the built business model, extract typical data from the analysis results to improve the sample library, and continually adjust and refine the model according to the characteristics of the data, until the evaluation grade of the business model reaches the preset business model evaluation grade threshold.
The algorithm parameters include: the path of the data to be classified, the model path, the result path, basic thresholds, elements, nodes, and weights;
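The iteration in S204 (build, grade, refine the samples, rebuild) can be sketched as a loop that stops once the grade reaches the threshold. Everything named here is an assumption for illustration: the `build`, `evaluate`, and `refine` callables stand in for the modules described in the text, grades are the letters A to D, and `max_rounds` is added only so that the sketch always terminates.

```python
from typing import Any, Callable, List, Tuple

GRADE_ORDER = "DCBA"  # worst to best


def meets_threshold(grade: str, threshold: str) -> bool:
    """True if grade is at least as good as threshold."""
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(threshold)


def build_until_threshold(
    samples: List[Any],
    build: Callable[[List[Any]], Any],
    evaluate: Callable[[Any], str],
    refine: Callable[[List[Any], Any], List[Any]],
    threshold: str = "B",
    max_rounds: int = 10,
) -> Tuple[Any, str, int]:
    """Repeat build -> evaluate -> refine until the grade reaches threshold."""
    model = build(samples)
    grade = evaluate(model)
    rounds = 1
    while not meets_threshold(grade, threshold) and rounds < max_rounds:
        samples = refine(samples, model)  # improve the sample library
        model = build(samples)            # rebuild the business model
        grade = evaluate(model)
        rounds += 1
    return model, grade, rounds


# Toy stand-ins: the "model" is just the number of samples it was built from.
def toy_evaluate(model: int) -> str:
    if model >= 5:
        return "A"
    if model >= 4:
        return "B"
    return "D"


def toy_refine(samples: List[int], model: int) -> List[int]:
    return samples + [0]  # pretend one typical record was added


model, grade, rounds = build_until_threshold([1, 2], len, toy_evaluate, toy_refine)
```

Passing the three stages in as callables keeps the loop independent of any particular algorithm, which matches the idea of choosing algorithms from a shared library.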
The model evaluation module evaluates a business model as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience base to analyze the analysis accuracy of the business model;
Obtain the evaluation grade of the business model according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: Select the best business model according to the evaluation grades and store it in the business model library; if the evaluation grade of a built business model cannot reach the preset evaluation grade threshold, rebuild the business model.
The evaluation grades of a business model are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated b;
C, average: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency is rated c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
The computational efficiency rating of a business model is obtained by running different business models under identical computing conditions on identical test data; the rating can be set according to business requirements, computing conditions, data volume, and/or computation time.
Computing conditions mainly refer to: hardware computing capability (computer hardware configuration), the number of distributed computing nodes, network transmission speed, and the like.
The evaluation grade threshold is B.
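The grade definitions above translate directly into a lookup on the pair (analysis accuracy, efficiency rating). The sketch below encodes only the four combinations that the text lists; returning `None` for any other combination (for example, accuracy of 90% with efficiency c) is an assumption of this sketch, since the text does not say how such cases are graded.

```python
from typing import Optional


def evaluation_grade(accuracy: float, efficiency: str) -> Optional[str]:
    """Map analysis accuracy (0.0-1.0) and efficiency rating (a/b/c) to A-D.

    Only the four combinations listed in the text are covered; any other
    combination returns None because the text leaves it undefined.
    """
    if accuracy >= 0.85 and efficiency == "a":
        return "A"
    if 0.80 <= accuracy < 0.85 and efficiency == "b":
        return "B"
    if 0.80 <= accuracy < 0.85 and efficiency == "c":
        return "C"
    if accuracy < 0.80 and efficiency == "c":
        return "D"
    return None


grade_fast_accurate = evaluation_grade(0.90, "a")
grade_mid = evaluation_grade(0.82, "b")
grade_undefined = evaluation_grade(0.90, "c")
```

With the threshold set to B, only models graded A or B would be accepted; a model graded C or D would be rebuilt.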
The big data mining analysis method based on model evaluation provided by the present invention comprises the following steps:
Step 1: store the original big data in a unified standard format;
Step 2: for the data stored in the unified standard format, select the appropriate algorithms according to the requirements of different industries to build business models;
Step 3: grade each built business model against the model evaluation indices to obtain its evaluation grade, and select the business model with the best evaluation grade as the business model of the system; the model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model;
Step 4: form an algorithm tool library provided with algorithm interfaces of a unified standard; at least two algorithm engine sets are provided in the library, used for building business models and/or for data mining analysis; each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
Step 5: select data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
In the algorithm tool library, the business model building algorithm engine set includes a classification algorithm engine, a social network analysis algorithm engine, and/or a graph algorithm engine; the mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine, and/or an assessment-type analysis algorithm engine.
A business model is built as follows:
S201: According to business requirements, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience base, extract data items from the sample data, combine different data items, and select algorithm parameters and an algorithm from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For the built business model, extract typical data from the analysis results to improve the sample library, and continually adjust and refine the model according to the characteristics of the data, until the evaluation grade of the business model reaches the preset business model evaluation grade threshold.
The algorithm parameters include: the path of the data to be classified, the model path, the result path, basic thresholds, elements, nodes, and weights;
A business model is evaluated as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience base to analyze the analysis accuracy of the business model;
Obtain the evaluation grade of the business model according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: Select the best business model according to the evaluation grades and store it in the business model library; if the evaluation grade of a built business model cannot reach the preset evaluation grade threshold, rebuild the business model.
The evaluation grades of a business model are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated b;
C, average: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency is rated c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
The computational efficiency rating of a business model is obtained by running different business models under identical computing conditions on identical test data; the rating can be set according to business requirements, computing conditions, data volume, and/or computation time.
The evaluation grade threshold is B.
The present invention provides a big data mining analysis method and system based on model evaluation. Addressing the business requirements of multiple industries and fields, it selects at least one algorithm and its algorithm parameters for modeling according to the characteristics of the sample data and, in combination with the human experience base, continually adjusts and refines the business model. The built model is evaluated against criteria such as the computational efficiency and analysis accuracy of the business model, and the best business model is selected automatically and intelligently. At least one corresponding mining analysis engine is then called to perform data mining analysis, yielding a support system oriented to industry fields.
Brief description of the drawings
The present invention is described in further detail below with reference to the accompanying drawings and embodiments:
Fig. 1 is a schematic structural diagram of the big data mining analysis system of the present invention.
Fig. 2 is a schematic flow chart of business model building in the present invention.
Fig. 3 is a schematic flow chart of business model evaluation in the present invention.
Fig. 4 is a schematic structural diagram of the algorithm tool library of the present invention.
Fig. 5 is a schematic structural diagram of the mining analysis engine set.
Detailed description of the embodiments
The big data mining analysis system based on model evaluation provided by the present invention comprises: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library, and a mining analysis module.
The distributed storage management module stores all data of the entire big data mining analysis system in a unified standard format. The stored data include at least: sample data, test data, data to be analyzed, analysis results, the human experience base, the algorithm tool library, and the business model library.
The business model building module extracts data items from the sample data according to the human experience base, combines different data items to form a data set, then selects algorithm parameters and an algorithm for the data set according to business requirements, and builds the corresponding business model. Model building is a process of continual adjustment and refinement: at a later stage, typical data can be extracted from the analysis results to improve the sample library, and the model can be continually adjusted and refined according to the characteristics of the data until it is optimal.
The model evaluation module grades each built business model against the model evaluation indices to obtain its evaluation grade, and selects the business model with the best evaluation grade as the business model of the system. The model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model.
The algorithm tool library provides algorithm interfaces of a unified standard and includes two algorithm engine sets, used for building business models and/or for data mining analysis. Each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm.
The mining analysis module selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
In the algorithm tool library, the business model building algorithm engine set includes a classification algorithm engine, a social network analysis algorithm engine, and/or a graph algorithm engine; the mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine, and/or an assessment-type analysis algorithm engine.
The business model building module builds a business model as follows:
S201: According to business requirements, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience base, extract data items from the sample data, combine different data items, and select algorithm parameters and an algorithm from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For the built business model, extract typical data from the analysis results to improve the sample library, and continually adjust and refine the model according to the characteristics of the data, until the evaluation grade of the business model reaches the preset business model evaluation grade threshold.
The algorithm parameters include: the path of the data to be classified, the model path, the result path, basic thresholds, elements, nodes, and weights;
The algorithms include classification algorithms, social network analysis algorithms, and graph algorithms.
The model evaluation module evaluates a business model as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience base to analyze the analysis accuracy of the business model;
Obtain the evaluation grade of the business model according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: Select the best business model according to the evaluation grades and store it in the business model library; if the evaluation grade of a built business model cannot reach the preset evaluation grade threshold, rebuild the business model.
The evaluation grades of a business model are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated b;
C, average: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency is rated c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
The computational efficiency rating of a business model is obtained by running different business models under identical computing conditions on identical test data; the rating can be set according to business requirements, computing conditions, data volume, and/or computation time.
One possible embodiment is given here to illustrate how the computational efficiency rating of a business model may be set.
Different business models are tested under identical computing conditions on identical test data, and their computational efficiency ratings are assessed, for example:
a, with a data volume of 10^4 to 10^5, the computation takes 1-5 hours;
b, with a data volume of 10^4 to 10^5, the computation takes 5-10 hours;
c, with a data volume of 10^4 to 10^5, the computation takes more than 10 hours.
That is, with identical computing conditions, identical test data, and identical data volume, the computational efficiency of a business model is rated by the time the computation takes. Equally, with identical computing conditions and identical test time, the computational efficiency can be rated by the data volume that each model is able to process.
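The example rating above can be written as a small function of the elapsed time for a fixed data volume. Two details are assumptions of this sketch because the example leaves them open: a run of exactly 5 hours is rated a and exactly 10 hours is rated b, and runs shorter than 1 hour are also rated a.

```python
def efficiency_rating(elapsed_hours: float) -> str:
    """Rate computational efficiency for a fixed data volume of 10^4 to 10^5.

    Boundary handling (<= 5 -> a, <= 10 -> b) and treating runs under one
    hour as 'a' are assumptions; the example only gives the ranges
    1-5 hours, 5-10 hours, and more than 10 hours.
    """
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must be non-negative")
    if elapsed_hours <= 5:
        return "a"
    if elapsed_hours <= 10:
        return "b"
    return "c"


ratings = [efficiency_rating(h) for h in (3, 7.5, 12)]
```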
The evaluation grade threshold is B.
The big data mining analysis method based on model evaluation provided by the present invention comprises the following steps:
Step 1: store the original big data in a unified standard format;
Step 2: for the data stored in the unified standard format, select the appropriate algorithms according to the requirements of different industries to build business models;
Step 3: grade each built business model against the model evaluation indices to obtain its evaluation grade, and select the business model with the best evaluation grade as the business model of the system; the model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model;
Step 4: form an algorithm tool library provided with algorithm interfaces of a unified standard; at least two algorithm engine sets are provided in the library, used for building business models and/or for data mining analysis; each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
Step 5: select data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
In the algorithm tool library, the business model building algorithm engine set includes a classification algorithm engine, a social network analysis algorithm engine, and/or a graph algorithm engine; the mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine, and/or an assessment-type analysis algorithm engine.
A business model is built as follows:
S201: According to business requirements, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience base, extract data items from the sample data, combine different data items, and select algorithm parameters and an algorithm from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For the built business model, extract typical data from the analysis results to improve the sample library, and continually adjust and refine the model according to the characteristics of the data, until the evaluation grade of the business model reaches the preset business model evaluation grade threshold.
The algorithm parameters include: the path of the data to be classified, the model path, the result path, basic thresholds, elements, nodes, and weights;
The algorithms include classification algorithms, social network analysis algorithms, and graph algorithms.
Wherein, business model is assessed in the following ways:
S301:Incoming traffic model, test data is extracted from the data of unified standard form storage;
S302:The computational efficiency of performance test data test business model, and compared with artificial experience storehouse, analyze business mould
The accuracy of analysis of type;
Model evaluation index according to having set comments acquisition business model evaluation grade, according to the computational efficiency of business model and
Accuracy of analysis is that business model assesses its grade;
S303:In the business model storehouse for selecting optimal business model to be stored according to business model evaluation grade;If institute's structure
The evaluation grade for the business model built can not reach default evaluation grade threshold value, then rebuild business model.
The business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and the computational efficiency of the business model is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated b;
C, fair: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and the computational efficiency of the business model is rated c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel.
The computational efficiency rating of a business model is obtained by running different business models under identical computing conditions and on identical test data; the rating scale is set according to business requirements, computing conditions, data volume, computation time, and so on.
The evaluation grade threshold is B.
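The grade table and the threshold check can be written down directly. The sketch below is one possible encoding, with hypothetical function names `evaluation_grade` and `meets_threshold`. The table in the text does not cover every combination of accuracy and efficiency rating (for example, accuracy of 85% or more with efficiency rated c), so this sketch returns `None` for unlisted combinations rather than guessing a grade; that choice is an assumption.

```python
from typing import Optional


def evaluation_grade(accuracy: float, efficiency: str) -> Optional[str]:
    """Map (analysis accuracy, efficiency rating) to a grade A-D.

    accuracy is a fraction in [0, 1]; efficiency is 'a', 'b' or 'c'.
    Combinations not listed in the text yield None (assumption).
    """
    if accuracy >= 0.85 and efficiency == "a":
        return "A"
    if 0.80 <= accuracy < 0.85 and efficiency == "b":
        return "B"
    if 0.80 <= accuracy < 0.85 and efficiency == "c":
        return "C"
    if accuracy < 0.80 and efficiency == "c":
        return "D"
    return None


def meets_threshold(grade: Optional[str], threshold: str = "B") -> bool:
    """True if grade is at least the threshold (B per the text)."""
    order = "DCBA"  # worst to best
    return grade is not None and order.index(grade) >= order.index(threshold)
```

With the threshold at B, only models graded A or B would be stored in the business model library; any other model would be rebuilt.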
The present invention has been described in detail above through specific implementations and embodiments, but these do not limit the invention. Without departing from the principles of the invention, those skilled in the art can make many variations and improvements, which should also be regarded as falling within the scope of protection of the invention.
Claims (10)
1. A big data mining analysis system based on model evaluation, characterized by comprising: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library, and a mining analysis module;
the distributed storage management module stores all data of the entire big data mining analysis system in a unified standard format;
the business model building module, based on the manual experience base, extracts the data items from the sample data, performs combination operations on different data items to form a data set, then selects algorithm parameters and an algorithm for the data set according to business requirements, and builds the corresponding business model;
the model evaluation module grades each built business model by the model evaluation indices to obtain its evaluation grade, and selects the business model with the best evaluation grade as the business model of the system; wherein the model evaluation indices include: the computational efficiency of the business model and the analysis accuracy of the business model;
the algorithm tool library is provided with a unified standard algorithm interface and includes two algorithm engine sets, used for building business models and/or for data mining analysis; wherein each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
the mining analysis module selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
2. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that: the algorithm tool library includes a business model building algorithm engine set, which includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine; and a mining analysis engine set, which includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
3. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that: the business model building module builds a business model as follows:
S201: According to business requirements, extract the required sample data from the distributed storage management module;
S202: Based on the manual experience base, extract the data items from the sample data, perform combination operations on different data items, and select the algorithm parameters and the algorithm from the algorithm tool library;
S203: Using the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For each business model built, extract the typical data for which the system's analysis agrees with the manual analysis results and store it in the sample library, thereby improving the sample library; repeat the business model building steps to adjust and refine the business model, until the business model evaluation grade reaches the preset business model evaluation grade threshold.
4. The big data mining analysis system based on model evaluation as claimed in claim 3, characterized in that: the algorithm parameters include: the path of the data to be classified, the model path, the result path, a threshold, basic elements, nodes, and weights.
5. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that: the model evaluation module evaluates a business model as follows:
S301: Input the business model and extract test data from the distributed storage management module;
S302: Use the test data to test the computational efficiency of the business model, and compare its results with the manual experience base to determine the analysis accuracy of the business model;
Obtain the business model evaluation grade according to the preset model evaluation indices, that is, grade the business model by its computational efficiency and analysis accuracy;
S303: Select the optimal business model according to the business model evaluation grade and store it in the business model library of the distributed storage management module; if the evaluation grade of a constructed business model cannot reach the preset evaluation grade threshold, rebuild the business model.
6. The big data mining analysis system based on model evaluation as claimed in claim 5, characterized in that: the business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and the computational efficiency of the business model is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated b;
C, fair: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and the computational efficiency of the business model is rated c;
the analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
the computational efficiency rating of a business model is obtained by running different business models under identical computing conditions and on identical test data; the rating scale can be set according to business requirements and computing conditions.
7. The big data mining analysis system based on model evaluation as claimed in claim 6, characterized in that: the business model evaluation grade threshold is B.
8. A big data mining analysis method based on model evaluation, characterized by comprising the following steps:
Step 1: store the original big data in a unified standard format;
Step 2: for the data stored in the unified standard format, select the corresponding algorithms according to the requirements of different industries and build business models;
Step 3: grade each built business model by the model evaluation indices to obtain its evaluation grade, and select the business model with the best evaluation grade as the business model of the system; the model evaluation indices include: the computational efficiency of the business model and the analysis accuracy of the business model;
Step 4: form an algorithm tool library provided with a unified standard algorithm interface, with at least two algorithm engine sets in the algorithm tool library, used for building business models and/or for data mining analysis; wherein each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
Step 5: select data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
9. The big data mining analysis method based on model evaluation as claimed in claim 8, characterized in that: a business model is built as follows:
S201: According to business requirements, extract the required sample data from the data stored in the unified standard format;
S202: Based on the manual experience base, extract the data items from the sample data, perform combination operations on different data items, and select the algorithm parameters and the algorithm from the algorithm tool library;
S203: Using the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: For each business model built, extract typical data from the analysis results to improve the sample library, and keep adjusting and refining the model according to the characteristics of the data, until the business model evaluation grade reaches the preset business model evaluation grade threshold.
10. The big data mining analysis method based on model evaluation as claimed in claim 8, characterized in that: a business model is evaluated as follows:
S301: Input the business model and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare its results with the manual experience base to determine the analysis accuracy of the business model;
Obtain the business model evaluation grade according to the preset model evaluation indices, that is, grade the business model by its computational efficiency and analysis accuracy;
S303: Select the optimal business model according to the business model evaluation grade and store it in the business model library; if the evaluation grade of a constructed business model cannot reach the preset evaluation grade threshold, rebuild the business model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610077813.XA CN107038167A (en) | 2016-02-03 | 2016-02-03 | Big data excavating analysis system and its analysis method based on model evaluation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610077813.XA CN107038167A (en) | 2016-02-03 | 2016-02-03 | Big data excavating analysis system and its analysis method based on model evaluation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107038167A (en) | 2017-08-11 |
Family
ID=59532739
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610077813.XA Withdrawn CN107038167A (en) | 2016-02-03 | 2016-02-03 | Big data excavating analysis system and its analysis method based on model evaluation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107038167A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101110089A (en) * | 2007-09-04 | 2008-01-23 | 华为技术有限公司 | Method and system for data digging and model building |
CN101620691A (en) * | 2008-06-30 | 2010-01-06 | 上海全成通信技术有限公司 | Automatic data mining platform in telecommunications industry |
CN101621823A (en) * | 2008-06-30 | 2010-01-06 | 上海全成通信技术有限公司 | Method for accurately building customer portrait of mobile communication data service |
CN102693317A (en) * | 2012-05-29 | 2012-09-26 | 华为软件技术有限公司 | Method and device for data mining process generating |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019052339A1 (en) * | 2017-09-13 | 2019-03-21 | 深圳市宇数科技有限公司 | Data exploration management method and system, electronic device, and storage medium |
CN107919983B (en) * | 2017-11-01 | 2020-07-10 | 中国科学院软件研究所 | A system and method for evaluating the effectiveness of space-based information network based on data mining |
CN107919983A (en) * | 2017-11-01 | 2018-04-17 | 中国科学院软件研究所 | A kind of space information network Effectiveness Evaluation System and method based on data mining |
CN107832429A (en) * | 2017-11-14 | 2018-03-23 | 广州供电局有限公司 | audit data processing method and system |
CN107832440B (en) * | 2017-11-17 | 2020-10-13 | 北京锐安科技有限公司 | Data mining method, device, server and computer readable storage medium |
CN107832440A (en) * | 2017-11-17 | 2018-03-23 | 北京锐安科技有限公司 | A kind of data digging method, device, server and computer-readable recording medium |
CN107943986A (en) * | 2017-11-30 | 2018-04-20 | 睿视智觉(深圳)算法技术有限公司 | A kind of big data analysis digging system |
CN107943986B (en) * | 2017-11-30 | 2022-05-17 | 睿视智觉(深圳)算法技术有限公司 | Big data analysis mining system |
CN108417270A (en) * | 2018-03-01 | 2018-08-17 | 刘恩 | Data model method for building up, data analysing method and data model establish device |
CN108509644A (en) * | 2018-04-12 | 2018-09-07 | 成都优易数据有限公司 | A kind of data digging method having model pre-warning update mechanism |
CN110427398A (en) * | 2018-04-28 | 2019-11-08 | 北京资采信息技术有限公司 | A kind of model management tool based on data mining and analysis |
CN108664605A (en) * | 2018-05-09 | 2018-10-16 | 北京三快在线科技有限公司 | A kind of model evaluation method and system |
CN109389143A (en) * | 2018-06-19 | 2019-02-26 | 北京九章云极科技有限公司 | A kind of Data Analysis Services system and method for automatic modeling |
CN113935434A (en) * | 2018-06-19 | 2022-01-14 | 北京九章云极科技有限公司 | Data analysis processing system and automatic modeling method |
CN109523316A (en) * | 2018-11-16 | 2019-03-26 | 杭州珞珈数据科技有限公司 | The automation modeling method of commerce services model |
CN109783062A (en) * | 2019-01-14 | 2019-05-21 | 中国科学院软件研究所 | A kind of machine learning application and development method and system of people in circuit |
CN109783062B (en) * | 2019-01-14 | 2020-10-09 | 中国科学院软件研究所 | A human-in-the-loop machine learning application development method and system |
CN109960699A (en) * | 2019-03-19 | 2019-07-02 | 广州供电局有限公司 | Data statistics method for building up, device, computer equipment and storage medium |
CN109960699B (en) * | 2019-03-19 | 2021-11-02 | 广东电网有限责任公司广州供电局 | Data statistics establishing method and device, computer equipment and storage medium |
CN111796840A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Algorithm model updating method and device, storage medium and electronic equipment |
CN110378564A (en) * | 2019-06-18 | 2019-10-25 | 中国平安财产保险股份有限公司 | Monitoring model generation method, device, terminal device and storage medium |
CN110322143A (en) * | 2019-06-28 | 2019-10-11 | 深圳前海微众银行股份有限公司 | Model entity management method, device, equipment and computer storage medium |
CN110322143B (en) * | 2019-06-28 | 2023-03-24 | 深圳前海微众银行股份有限公司 | Model materialization management method, device, equipment and computer storage medium |
CN111523084A (en) * | 2020-04-09 | 2020-08-11 | 京东方科技集团股份有限公司 | Service data prediction method and device, electronic equipment and computer readable storage medium |
CN111724028A (en) * | 2020-05-08 | 2020-09-29 | 中海创科技(福建)集团有限公司 | Machine equipment operation analysis and mining system based on big data technology |
CN111949436A (en) * | 2020-08-10 | 2020-11-17 | 星辰天合(北京)数据科技有限公司 | Test data verification method, verification device and computer readable storage medium |
CN112085396A (en) * | 2020-09-14 | 2020-12-15 | 洛阳众智软件科技股份有限公司 | Algorithm model configuration method based on state-of-the-earth space planning current situation evaluation index |
CN114360045A (en) * | 2020-10-14 | 2022-04-15 | 百度(美国)有限责任公司 | Method, storage medium and detection device for operating detection device |
WO2022225481A1 (en) * | 2021-04-19 | 2022-10-27 | İzmi̇r Ekonomi̇ Üni̇versi̇tesi̇ | Creativity readiness level (crrl) determination method and business model assessment system |
CN113139759A (en) * | 2021-05-19 | 2021-07-20 | 杭州市电力设计院有限公司余杭分公司 | Power grid data asset management method and system |
CN113139759B (en) * | 2021-05-19 | 2024-06-04 | 杭州市电力设计院有限公司余杭分公司 | Power grid data asset management method and system |
CN113516514A (en) * | 2021-07-21 | 2021-10-19 | 福建天晴数码有限公司 | Method and system for paying data mining value by user |
CN114091253A (en) * | 2021-11-22 | 2022-02-25 | 国网宁夏电力有限公司电力科学研究院 | Electromagnetic environment intelligent analysis method based on big data |
CN114596061A (en) * | 2022-03-02 | 2022-06-07 | 穗保(广州)科技有限公司 | Project data management method and system based on big data |
CN118098476A (en) * | 2024-04-28 | 2024-05-28 | 山东新时代药业有限公司 | Electronic medical record qualification degree evaluation method and system based on big data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107038167A (en) | Big data excavating analysis system and its analysis method based on model evaluation | |
CN115578015A (en) | The whole process supervision method, system and storage medium of sewage treatment based on Internet of Things | |
CN107274105B (en) | Linear discriminant analysis-based multi-attribute decision tree power grid stability margin evaluation method | |
CN105740975B (en) | A Method for Evaluation and Prediction of Equipment Defects Based on Data Correlation | |
CN111340063B (en) | Data anomaly detection method for coal mill | |
CN108985380B (en) | A fault identification method of switch machine based on cluster integration | |
CN105701596A (en) | Method for lean distribution network emergency maintenance and management system based on big data technology | |
CN106372799B (en) | A Grid Security Risk Prediction Method | |
CN109636171A (en) | A kind of comprehensive diagnos and risk evaluating method that regional vegetation restores | |
CN104865827B (en) | Oil pumping unit oil extraction optimization method based on multi-working-condition model | |
CN101826090A (en) | WEB public opinion trend forecasting method based on optimal model | |
CN108876163A (en) | The transient rotor angle stability fast evaluation method of comprehensive causality analysis and machine learning | |
CN104090985A (en) | Active disconnection optimum fracture surface searching method based on electrical distance | |
CN109934456A (en) | A method and system for intelligent fault detection of an acquisition, operation and maintenance system | |
CN109033178A (en) | A method of excavating Granger causality between visibility multidimensional space-time data | |
CN118647092B (en) | Comprehensive management method and system for power distribution communication network | |
CN109358608A (en) | A kind of transformer state methods of risk assessment and device based on integrated study | |
CN110097141A (en) | A kind of acquisition operational system intelligent trouble detection method | |
CN109492699A (en) | Passway for transmitting electricity method for three-dimensional measurement and device | |
CN112231971A (en) | A blast furnace fault diagnosis method based on relative overall trend diffusion fault sample generation | |
CN116307844A (en) | A Method for Evaluating and Analyzing Line Loss in Low-Voltage Station Area | |
CN108123436B (en) | Prediction Model of Voltage Over-Limit Based on Principal Component Analysis and Multiple Regression Algorithm | |
CN117076454B (en) | Engineering quality acceptance form data structured storage method and system | |
CN111209158A (en) | Mining monitoring method and cluster monitoring system for server cluster | |
CN117290405A (en) | Internet of things system for quickly inquiring large-scale equipment data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20170811 |