CN107038167A - Big data excavating analysis system and its analysis method based on model evaluation - Google Patents

Publication number
CN107038167A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201610077813.XA
Other languages
Chinese (zh)
Inventor
顾青
梁佐泉
谢超
梁艳敏
王宁宁
冯四风
赵艳红
田文晋
王亚红
黄奚芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Waterhouse Integrity Information Technology Co Ltd
Original Assignee
Waterhouse Integrity Information Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data mining analysis system based on model evaluation, comprising: a distributed storage management module, which stores all data of the big data mining analysis system in a unified standard format; a business model building module, which selects the appropriate algorithms from the algorithm tool library according to the needs of different industries and builds business models; a model evaluation module, which grades the built business models against model evaluation indices and selects the business model with the best evaluation grade as the system's business model; an algorithm tool library, which provides an algorithm interface of a unified standard and includes two algorithm engine sets, used for building business models and/or for data mining analysis; and a mining analysis module, which selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements. The invention also discloses a big data mining analysis method based on model evaluation.

Description

Big data mining analysis system based on model evaluation and analysis method thereof
Technical field
The present invention relates to the field of computers, and in particular to a big data mining analysis system based on model evaluation. The invention further relates to a big data mining analysis method based on model evaluation.
Background
Big data technology is developing rapidly. Data technology has evolved from processing a single type of data on a single machine to processing multiple types of data on computer clusters, supporting data analysis applications with loose time constraints. As data volumes grow to the PB and EB scale and beyond, and faster processing and analysis times are required, the development trends of big data technology include general-purpose technologies such as dedicated big data computers, geographically distributed computer clusters, processing and analysis of multi-type, multi-source data, analysis of complex data types such as data networks, and analysis applications with second-level response times, along with various domain-oriented technologies. General-purpose big data technologies and open-source projects, represented by HDFS, GFS, MapReduce, Hadoop, Spark, Storm, HBase and MongoDB, are developing quickly, and big data preprocessing is an essential step in big data processing.
Data mining is the process of extracting, by means of algorithms, information and knowledge that is implicit, previously unknown, yet potentially useful, from large amounts of incomplete, noisy, fuzzy and random real-world application data. It relies mainly on machine learning, statistics, neural networks and database methods to achieve this goal.
At present, in big data mining analysis methods, researchers build a fixed model from business data and then mine and analyze the data with that model. However, the model is not built according to business needs; it does not incorporate human experience or adapt to the business; and the models are not evaluated and graded so that a model can be selected intelligently. Because big data mining serves many industries and fields, the quality of the constructed model often affects the accuracy of the data mining analysis results, making it difficult to support industry-oriented decision-making. A big data mining analysis method is therefore needed that adjusts automatically according to the business model and evaluates the models.
Summary of the invention
The technical problem to be solved by the present invention is to provide a big data mining analysis system based on model evaluation that serves the business needs of multiple industries and fields. The invention also provides a big data mining analysis method based on model evaluation.
To solve the above technical problem, the big data mining analysis system based on model evaluation provided by the present invention comprises: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library, and a mining analysis module.
The distributed storage management module stores all data of the big data mining analysis system in a unified standard format. The stored data include at least: sample data, test data, data to be analyzed, analysis results, the human experience library, the algorithm tool library, and the business model library.
The sample library holds typical historical data accumulated by business personnel during data analysis.
The human experience library holds the data analysis experience that business personnel have gained through long-term work, converted into data that a computer can recognize.
The business model building module extracts data items from the sample data according to the human experience library, combines different data items to form a data set, then selects algorithm parameters and algorithms for the data set according to business needs, and builds the corresponding business model.
The model evaluation module grades each built business model against the model evaluation indices to obtain its evaluation grade, and selects the business model with the best evaluation grade as the system's business model. The model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model.
The algorithm tool library provides an algorithm interface of a unified standard and includes two algorithm engine sets, used for building business models and/or for data mining analysis. Each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm.
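As a non-limiting illustration, the unified algorithm interface and the two engine sets could be sketched as follows. Every name here (Algorithm, AlgorithmEngine, AlgorithmToolLibrary, run, and the toy "threshold" algorithm) is hypothetical; the source does not define a concrete API, only that each engine set holds at least one engine and each engine at least one algorithm.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Assumed unified call signature for every algorithm: (data, params) -> result.
Algorithm = Callable[[Any, dict], Any]


@dataclass
class AlgorithmEngine:
    """One algorithm engine, holding one or more named algorithms."""
    name: str
    algorithms: Dict[str, Algorithm] = field(default_factory=dict)

    def register(self, algo_name: str, fn: Algorithm) -> None:
        self.algorithms[algo_name] = fn


@dataclass
class AlgorithmToolLibrary:
    """Two engine sets: one for model building, one for mining analysis."""
    model_building: Dict[str, AlgorithmEngine] = field(default_factory=dict)
    mining_analysis: Dict[str, AlgorithmEngine] = field(default_factory=dict)

    def run(self, engine_set: str, engine: str, algo: str, data: Any, params: dict) -> Any:
        # Look up the engine set by attribute name, then dispatch to the algorithm.
        engines = getattr(self, engine_set)
        return engines[engine].algorithms[algo](data, params)


# Toy usage: a one-algorithm classification engine in the model-building set.
library = AlgorithmToolLibrary()
classification = AlgorithmEngine("classification")
classification.register("threshold", lambda data, p: [x >= p["threshold"] for x in data])
library.model_building["classification"] = classification

labels = library.run("model_building", "classification", "threshold", [1, 5, 9], {"threshold": 5})
```

Because every algorithm shares one call signature in this sketch, the building and mining modules can select an algorithm by name without knowing its implementation.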
The mining analysis module selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
The algorithm tool library includes a business model building engine set and a mining analysis engine set. The business model building engine set includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine. The mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
The classification algorithm engine is used for business model building and mining analysis requests. One or several algorithms in the engine work together to classify a batch of data. The analysis accuracy of the model results corresponding to each selected algorithm is compared in the model evaluation system, a weight is set for each algorithm accordingly, and the classification results are weighted. Analyzing a batch of data with multiple algorithms in this way effectively improves classification accuracy. Commonly used classification algorithms include the support vector machine (SVM) classification algorithm, the Bayes classification algorithm, and artificial neural networks (ANN).
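One way the weighting of classification results might work is accuracy-weighted voting, sketched below. This is an assumption for illustration: the source says only that each algorithm receives a weight based on its analysis accuracy, without giving a formula. The function name, the normalization of weights, and the sample labels and accuracies are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, accuracies):
    """Combine per-algorithm labels by accuracy-weighted voting.

    predictions: {algorithm_name: [label for each item]}
    accuracies:  {algorithm_name: analysis accuracy in [0, 1]}
    """
    total = sum(accuracies.values())
    # Assumed weighting rule: weight is the algorithm's share of total accuracy.
    weights = {name: acc / total for name, acc in accuracies.items()}
    n_items = len(next(iter(predictions.values())))
    combined = []
    for i in range(n_items):
        scores = defaultdict(float)
        for name, labels in predictions.items():
            scores[labels[i]] += weights[name]
        # The label with the highest total weight wins.
        combined.append(max(scores, key=scores.get))
    return combined


# Hypothetical outputs of three classifiers on the same batch of three items.
predictions = {
    "svm": ["spam", "ham", "spam"],
    "bayes": ["ham", "ham", "spam"],
    "ann": ["ham", "spam", "spam"],
}
accuracies = {"svm": 0.90, "bayes": 0.80, "ann": 0.70}
result = weighted_vote(predictions, accuracies)
```

For the first item, the two less accurate classifiers together outweigh the most accurate one, so the combined label follows them.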
The social network analysis algorithm engine is used for business model building and mining analysis requests, applying social network analysis to analyze the relationships between data. The algorithms in this engine are all based on social network analysis: they analyze the relationships between data items, and build a social network graph with the data items as the nodes of the graph and the relationships between them as its edges.
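A minimal sketch of the graph construction described above, using data items as nodes and relationships as undirected edges. Degree centrality is included only as a familiar example of a social network measure; the source does not name any specific measure. The function names and sample relations are hypothetical.

```python
from collections import defaultdict


def build_network(relations):
    """Build an undirected graph: data items are nodes, relations are edges."""
    graph = defaultdict(set)
    for a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


# Hypothetical relationships between four data items.
relations = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
graph = build_network(relations)
centrality = degree_centrality(graph)
```

Here item A is related to every other item, so it has the highest centrality, while D is connected only to A.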
The pattern algorithm engine is used for mining analysis requests, applying the layout algorithms and parsing algorithms in the engine to present and analyze the mining analysis results. It mainly presents the relationships between data as a graph, and uses parsing algorithms to analyze the relationships between the nodes of the graph. Commonly used pattern algorithms include layout algorithms and clustering algorithms.
The situation-type analysis engine takes two forms: a statistical model and an analysis drill-through model. The statistical model performs directed aggregate statistics on the data and presents them as statistical reports, providing same-period comparison and trend prediction functions. The situation-type analysis engine can quickly and efficiently produce dynamic, associated statistics across business data categories. The analysis drill-through model is a dynamic, three-dimensional, multidimensional way of presenting data: through cross-aggregation drill-down analysis of the data, it presents information as dynamic multidimensional tables and multidimensional three-dimensional charts, and provides multidimensional comparison and situation positioning functions.
The early-warning-type analysis engine comprehensively analyzes all kinds of business information to determine strategy and tactics and guide the overall situation, thereby providing early warnings and alerts. It consists of two major parts: analysis model management and knowledge base maintenance. It computes probabilities and ratios for each group of data results, draws conclusions from them, and publishes those conclusions.
The assessment-type analysis engine accurately assesses the current situation and predicts the trend for the next period from historical data, the present state and surrounding objective conditions. It is based on association analysis algorithms and statistical principles: it establishes close associations among the various types of data, and derives trend curves and formulas describing those associations.
The business model building module builds a business model as follows:
S201: According to business needs, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience library, extract data items from the sample data, combine the different data items, and select algorithm parameters and algorithms from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithms, select different data parameters and build at least one business model;
S204: For the constructed business model, refine the sample library by extracting typical data from the analysis results, and continually adjust and improve the model according to the characteristics of the data, until the business model's evaluation grade reaches the preset evaluation grade threshold.
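The iterative character of steps S201 to S204 can be sketched as a loop that rebuilds and regrades the model until the grade reaches the threshold. This is an illustrative sketch under stated assumptions: build, evaluate and refine are hypothetical callables standing in for the modules described above, the toy implementations exist only to make the loop runnable, and max_rounds is an assumed safety limit that the source does not mention.

```python
# Grades ordered from worst to best; "B" is the threshold named in the source.
GRADE_ORDER = ["D", "C", "B", "A"]


def build_until_threshold(build, evaluate, refine, samples, threshold="B", max_rounds=10):
    """Repeat build -> evaluate -> refine until the grade reaches the threshold."""
    for round_no in range(1, max_rounds + 1):
        model = build(samples)            # S202-S203: build a candidate model
        grade = evaluate(model)           # grade the candidate
        if GRADE_ORDER.index(grade) >= GRADE_ORDER.index(threshold):
            return model, grade, round_no
        samples = refine(samples, model)  # S204: improve the sample library
    raise RuntimeError("threshold not reached within max_rounds")


# Toy stand-ins: the "model" only records how many samples it was built from.
def toy_build(samples):
    return {"n_samples": len(samples)}


def toy_evaluate(model):
    n = model["n_samples"]
    return "A" if n >= 4 else "B" if n >= 3 else "C"


def toy_refine(samples, model):
    # Pretend one more typical record was extracted from the analysis results.
    return samples + [len(samples)]


model, grade, rounds = build_until_threshold(toy_build, toy_evaluate, toy_refine, samples=[0])
```

In this toy run the first two candidates grade C, the sample library grows by one record each round, and the third candidate reaches grade B.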
The algorithm parameters include: path of the data to be classified, model path, result path, basic threshold, elements, nodes, and weights.
The model evaluation module evaluates a business model as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience library to determine the analysis accuracy of the business model;
Obtain the business model's evaluation grade according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: According to the evaluation grades, select the best business model and store it in the business model library; if the evaluation grade of the constructed business model does not reach the preset evaluation grade threshold, rebuild the business model.
The business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency rating is a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is b;
C, fair: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency rating is c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel.
The computational efficiency rating of a business model is obtained by running different business models on the same test data under the same computing conditions. The rating can be set according to business needs, computing conditions, data volume and/or computation time. Computing conditions mainly refer to: hardware computing capability (computer hardware configuration), the number of distributed computing nodes, network transmission speed, and the like.
The evaluation grade threshold is B.
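The four grade definitions can be read as a lookup from an accuracy band and an efficiency rating to a grade, sketched below. The source lists only four combinations; how other combinations (for example accuracy of 85% or more with efficiency rating c) should be graded is not stated, so this sketch returns None for them rather than guessing. The function and table names are hypothetical.

```python
def accuracy_band(accuracy):
    """Map analysis accuracy (0..1) to the three bands used in the grade list."""
    if accuracy >= 0.85:
        return "high"
    if accuracy >= 0.80:
        return "medium"
    return "low"


# Only the four combinations listed in the source are defined.
GRADE_TABLE = {
    ("high", "a"): "A",
    ("medium", "b"): "B",
    ("medium", "c"): "C",
    ("low", "c"): "D",
}


def evaluation_grade(accuracy, efficiency):
    """Return 'A'..'D', or None for a combination the source does not define."""
    return GRADE_TABLE.get((accuracy_band(accuracy), efficiency))


def meets_threshold(grade, threshold="B"):
    """Grades are single letters, so 'A' <= 'B' means A is at least as good as B."""
    return grade is not None and grade <= threshold
```

For example, a model with 82% accuracy and efficiency rating b grades B and meets the threshold, while the same accuracy with rating c grades C and would trigger a rebuild.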
The big data mining analysis method based on model evaluation provided by the present invention comprises the following steps:
Step 1: store the raw big data in a unified standard format;
Step 2: for the data stored in the unified standard format, select the appropriate algorithms according to the needs of different industries and build business models;
Step 3: grade each built business model against the model evaluation indices to obtain its evaluation grade, and select the business model with the best evaluation grade as the system's business model; the model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model;
Step 4: form an algorithm tool library provided with an algorithm interface of a unified standard, the library containing at least two algorithm engine sets, used for building business models and/or for data mining analysis; each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
Step 5: according to the data mining requirements of different industries, select data mining analysis algorithms from the algorithm tool library.
The algorithm tool library includes a business model building engine set and a mining analysis engine set. The business model building engine set includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine. The mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
A business model is built as follows:
S201: According to business needs, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience library, extract data items from the sample data, combine the different data items, and select algorithm parameters and algorithms from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithms, select different data parameters and build at least one business model;
S204: For the constructed business model, refine the sample library by extracting typical data from the analysis results, and continually adjust and improve the model according to the characteristics of the data, until the business model's evaluation grade reaches the preset evaluation grade threshold.
The algorithm parameters include: path of the data to be classified, model path, result path, basic threshold, elements, nodes, and weights.
A business model is evaluated as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience library to determine the analysis accuracy of the business model;
Obtain the business model's evaluation grade according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: According to the evaluation grades, select the best business model and store it in the business model library; if the evaluation grade of the constructed business model does not reach the preset evaluation grade threshold, rebuild the business model.
The business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency rating is a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is b;
C, fair: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency rating is c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel.
The computational efficiency rating of a business model is obtained by running different business models on the same test data under the same computing conditions. The rating can be set according to business needs, computing conditions, data volume and/or computation time. The evaluation grade threshold is B.
The big data mining analysis method and system based on model evaluation of the present invention address the business needs of multiple industries and fields. According to the characteristics of the sample data, at least one algorithm and its algorithm parameters are selected for modeling, and the business model is continually adjusted and improved with reference to the human experience library. The constructed models are evaluated against criteria such as computational efficiency and analysis accuracy, and the best business model is selected automatically and intelligently. At least one corresponding mining analysis engine is then called to perform data mining analysis, providing a decision support system for industry fields.
Brief description of the drawings
The present invention is described in further detail below with reference to the accompanying drawings and embodiments:
Fig. 1 is a schematic structural diagram of the big data mining analysis system of the present invention.
Fig. 2 is a schematic flowchart of business model building in the present invention.
Fig. 3 is a schematic flowchart of business model evaluation in the present invention.
Fig. 4 is a schematic structural diagram of the algorithm tool library of the present invention.
Fig. 5 is a schematic structural diagram of the mining analysis engine set.
Detailed description of the embodiments
The big data mining analysis system based on model evaluation provided by the present invention comprises: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library, and a mining analysis module.
The distributed storage management module stores all data of the big data mining analysis system in a unified standard format. The stored data include at least: sample data, test data, data to be analyzed, analysis results, the human experience library, the algorithm tool library, and the business model library.
The business model building module extracts data items from the sample data according to the human experience library, combines different data items to form a data set, then selects algorithm parameters and algorithms for the data set according to business needs, and builds the corresponding business model. Building the model is a process of continual adjustment and improvement: later, the sample library can be refined by extracting typical data from the analysis results, and the model continually adjusted and improved according to the characteristics of the data, until the model is optimal.
The model evaluation module grades each built business model against the model evaluation indices to obtain its evaluation grade, and selects the business model with the best evaluation grade as the system's business model. The model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model.
The algorithm tool library provides an algorithm interface of a unified standard and includes two algorithm engine sets, used for building business models and/or for data mining analysis. Each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm.
The mining analysis module selects data mining analysis algorithms from the algorithm tool library according to the data mining requirements of different industries.
The algorithm tool library includes a business model building engine set and a mining analysis engine set. The business model building engine set includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine. The mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
The business model building module builds a business model as follows:
S201: According to business needs, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience library, extract data items from the sample data, combine the different data items, and select algorithm parameters and algorithms from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithms, select different data parameters and build at least one business model;
S204: For the constructed business model, refine the sample library by extracting typical data from the analysis results, and continually adjust and improve the model according to the characteristics of the data, until the business model's evaluation grade reaches the preset evaluation grade threshold.
The algorithm parameters include: path of the data to be classified, model path, result path, basic threshold, elements, nodes, and weights.
The algorithms include classification algorithms, social network analysis algorithms, and pattern algorithms.
The model evaluation module evaluates a business model as follows:
S301: Input the business model, and extract test data from the data stored in the unified standard format;
S302: Use the test data to test the computational efficiency of the business model, and compare the results against the human experience library to determine the analysis accuracy of the business model;
Obtain the business model's evaluation grade according to the preset model evaluation indices, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: According to the evaluation grades, select the best business model and store it in the business model library; if the evaluation grade of the constructed business model does not reach the preset evaluation grade threshold, rebuild the business model.
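Step S303 amounts to picking the best-graded candidate and checking it against the threshold. A minimal sketch follows, assuming each candidate has already been graded 'A' to 'D'; the function name, the (name, grade) tuple layout, and the use of None to signal "rebuild" are hypothetical choices for illustration.

```python
def select_best(candidates, threshold="B"):
    """Pick the best-graded model, or return None to signal a rebuild.

    candidates: list of (model_name, grade) pairs with grade in 'A'..'D'.
    Grades are single letters, so the alphabetically smallest grade is the best.
    """
    if not candidates:
        return None
    name, grade = min(candidates, key=lambda c: c[1])
    # The best candidate must still reach the preset threshold.
    return name if grade <= threshold else None


# Hypothetical graded candidates.
chosen = select_best([("model_1", "C"), ("model_2", "B"), ("model_3", "D")])
rebuild_needed = select_best([("model_1", "C"), ("model_3", "D")]) is None
```

When several candidates share the best grade, min returns the first one in the list; the source does not say how ties should be broken.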
The business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and its computational efficiency rating is a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is b;
C, fair: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and its computational efficiency rating is c;
D, poor: the analysis accuracy of the business model is less than 80%, and its computational efficiency rating is c.
The analysis accuracy of a business model is obtained by comparing the system's analysis results with the results of analysis by business personnel.
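One simple reading of this comparison is the fraction of test records on which the system's result matches the business personnel's result. The sketch below assumes that per-record agreement is the intended measure, which the source does not state explicitly; the function name and sample labels are hypothetical.

```python
def analysis_accuracy(system_results, expert_results):
    """Fraction of records where the system agrees with business personnel."""
    if len(system_results) != len(expert_results):
        raise ValueError("both result lists must cover the same records")
    if not expert_results:
        raise ValueError("at least one record is required")
    matches = sum(s == e for s, e in zip(system_results, expert_results))
    return matches / len(expert_results)


# Hypothetical results for four test records.
system = ["fraud", "ok", "ok", "fraud"]
expert = ["fraud", "ok", "fraud", "fraud"]
accuracy = analysis_accuracy(system, expert)
```

Three of the four records agree here, which would place the model below the 80% accuracy band.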
The computational efficiency rating of a business model is obtained by running different business models on the same test data under the same computing conditions. The rating can be set according to business needs, computing conditions, data volume and/or computation time. One possible embodiment is given here to illustrate how the computational efficiency rating of a business model may be set.
Different business models are tested under the same computing conditions and with the same test data, and their computational efficiency is rated, for example:
a: with a data volume of 10^4 to 10^5, the computation takes 1 to 5 hours;
b: with a data volume of 10^4 to 10^5, the computation takes 5 to 10 hours;
c: with a data volume of 10^4 to 10^5, the computation takes more than 10 hours.
That is, with identical computing conditions, test data and data volume, the computational efficiency of a business model is rated by the time required for the computation. Alternatively, with identical computing conditions and identical test time, the computational efficiency can be rated by the volume of data that can be processed.
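The example rating above maps elapsed computation time to a rating for a fixed data volume. A minimal sketch follows; the handling of boundary values is an assumption, since the source's ranges overlap at exactly 5 hours and say nothing about runs shorter than 1 hour. Here runs of up to 5 hours (including those under 1 hour) rate a, and exactly 10 hours rates b because c is defined as more than 10 hours. The function name is hypothetical.

```python
def efficiency_rating(elapsed_hours):
    """Rate computational efficiency from elapsed time on the fixed test data."""
    if elapsed_hours < 0:
        raise ValueError("elapsed time cannot be negative")
    if elapsed_hours <= 5:
        return "a"   # assumed to include runs shorter than 1 hour
    if elapsed_hours <= 10:
        return "b"
    return "c"       # more than 10 hours
```

In practice the elapsed time would be measured by running each candidate model on the same hardware with the same test data, so that the ratings are comparable.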
The evaluation grade threshold is B.
The big data mining analysis method based on model evaluation provided by the present invention comprises the following steps:
Step 1: store the raw big data in a unified standard format;
Step 2: for the data stored in the unified standard format, select the appropriate algorithms according to the needs of different industries and build business models;
Step 3: grade each built business model against the model evaluation indices to obtain its evaluation grade, and select the business model with the best evaluation grade as the system's business model; the model evaluation indices include the computational efficiency of the business model and the analysis accuracy of the business model;
Step 4: form an algorithm tool library provided with an algorithm interface of a unified standard, the library containing at least two algorithm engine sets, used for building business models and/or for data mining analysis; each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
Step 5: according to the data mining requirements of different industries, select data mining analysis algorithms from the algorithm tool library.
The algorithm tool library includes a business model building engine set and a mining analysis engine set. The business model building engine set includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine. The mining analysis engine set includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
A business model is built as follows:
S201: According to business needs, extract the required sample data from the data stored in the unified standard format;
S202: According to the human experience library, extract data items from the sample data, combine the different data items, and select algorithm parameters and algorithms from the algorithm tool library;
S203: According to the selected algorithm parameters and algorithms, select different data parameters and build at least one business model;
S204: For the constructed business model, refine the sample library by extracting typical data from the analysis results, and continually adjust and improve the model according to the characteristics of the data, until the business model's evaluation grade reaches the preset evaluation grade threshold.
The algorithm parameters include: path of the data to be classified, model path, result path, basic threshold, elements, nodes, and weights.
The algorithms include classification algorithms, social network analysis algorithms, and pattern algorithms.
Wherein the business model is evaluated as follows:
S301: input the business model, and extract test data from the data stored in the unified standard format;
S302: use the test data to test the computational efficiency of the business model, and compare the results with the manual experience library to determine the analysis accuracy of the business model;
Obtain the business model evaluation grade according to the preset model evaluation indicators, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: according to the business model evaluation grades, select the best business model and store it in the business model library; if the evaluation grade of the business model built cannot reach the preset evaluation grade threshold, rebuild the business model.
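Step S302 measures two things on the same test data: how long the model takes and how often it agrees with the reference results. A minimal sketch, assuming the model is a plain Python callable and the manual experience library is reduced to a list of expected labels (both are simplifications, and `evaluate_model` is a hypothetical name):

```python
import time
from typing import Callable, Sequence, Tuple


def evaluate_model(
    model: Callable[[object], object],
    test_inputs: Sequence[object],
    expert_labels: Sequence[object],
) -> Tuple[float, float]:
    """Return (analysis accuracy, elapsed seconds) for one model."""
    if not expert_labels or len(test_inputs) != len(expert_labels):
        raise ValueError("test inputs and expert labels must be non-empty and aligned")
    start = time.perf_counter()
    predictions = [model(x) for x in test_inputs]
    elapsed = time.perf_counter() - start
    # Accuracy: share of predictions that match the expert-provided labels.
    matches = sum(p == e for p, e in zip(predictions, expert_labels))
    return matches / len(expert_labels), elapsed


# Toy usage: a threshold rule compared against four expert judgments.
accuracy, elapsed = evaluate_model(
    model=lambda x: x >= 10,
    test_inputs=[5, 12, 20, 8],
    expert_labels=[False, True, True, True],
)
```

Here the rule agrees with the expert on three of four cases, so the accuracy is 0.75. To compare several models fairly, each would be run through the same function with the same inputs on the same machine.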
Wherein the business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and the computational efficiency of the business model is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated b;
C, average: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and the computational efficiency of the business model is rated c.
The analysis accuracy of the business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
The computational efficiency rating of the business model is obtained by running different business models under identical computing conditions and on identical test data; the rating is set according to business requirements, computing conditions, data volume, computation time and so on. The evaluation grade threshold is B.
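The grade table above can be written as a small lookup, sketched here. The function names are hypothetical, and returning `None` for combinations the text does not list (for example accuracy of at least 85% with efficiency rated b) is an assumption of this sketch; the text does not say how such cases are graded.

```python
from typing import Callable, List, Optional, Tuple

# (grade, accuracy condition, required efficiency rating), as listed in the text.
GRADE_RULES: List[Tuple[str, Callable[[float], bool], str]] = [
    ("A", lambda acc: acc >= 0.85, "a"),
    ("B", lambda acc: 0.80 <= acc < 0.85, "b"),
    ("C", lambda acc: 0.80 <= acc < 0.85, "c"),
    ("D", lambda acc: acc < 0.80, "c"),
]


def grade_model(accuracy: float, efficiency: str) -> Optional[str]:
    """Map (accuracy, efficiency rating) to a grade; None if not covered."""
    for grade, accuracy_ok, required_efficiency in GRADE_RULES:
        if accuracy_ok(accuracy) and efficiency == required_efficiency:
            return grade
    return None  # assumption: combinations not listed in the text stay ungraded


def meets_threshold(grade: Optional[str], threshold: str = "B") -> bool:
    """True if the grade is at least the threshold (B by default)."""
    order = "DCBA"  # worst to best
    return grade is not None and order.index(grade) >= order.index(threshold)


grade = grade_model(0.82, "b")
accepted = meets_threshold(grade)
```

With the threshold at B, only models graded A or B are accepted; a model graded C or D would be rebuilt.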
The present invention has been described in detail above by way of specific implementations and embodiments, but these do not limit the present invention. Without departing from the principles of the present invention, those skilled in the art may make many variations and improvements, which shall also be regarded as falling within the scope of protection of the present invention.

Claims (10)

1. A big data mining analysis system based on model evaluation, characterized by comprising: a distributed storage management module, a business model building module, a model evaluation module, an algorithm tool library and a mining analysis module;
The distributed storage management module stores all data of the entire big data mining analysis system in a unified standard format;
The business model building module, according to the manual experience library, extracts the data items in the sample data, performs combination operations on different data items to form a data set, then selects algorithm parameters and an algorithm for the data set according to business requirements, and builds the corresponding business model;
The model evaluation module grades each built business model using the model evaluation indicators to obtain its evaluation grade, and selects the business model with the best evaluation grade as the business model of the system; wherein the model evaluation indicators include: the computational efficiency of the business model and the analysis accuracy of the business model;
The algorithm tool library is provided with a uniform standard algorithm interface and includes two kinds of algorithm engine sets, for building business models and/or for data mining analysis; wherein each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
The mining analysis module, according to the data mining requirements of different industries, selects a data mining analysis algorithm from the algorithm tool library.
2. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that the algorithm tool library includes: a business-model-building algorithm engine set, which includes a classification algorithm engine, a social network analysis algorithm engine and/or a pattern algorithm engine; and a mining analysis engine set, which includes a situation-type analysis algorithm engine, an early-warning-type analysis algorithm engine and/or an assessment-type analysis algorithm engine.
3. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that the business model building module builds the business model as follows:
S201: according to the business requirements, extract the required sample data from the distributed storage management module;
S202: according to the manual experience library, extract the data items in the sample data, perform combination operations on different data items, and select the algorithm parameters and the algorithm from the algorithm tool library;
S203: according to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: for the business model built, extract the typical data for which the system's analysis is consistent with the manual analysis results, store it in the sample library and thereby improve the sample library; repeat the steps of building the business model to adjust and refine the business model, until the business model evaluation grade reaches the preset business model evaluation grade threshold.
4. The big data mining analysis system based on model evaluation as claimed in claim 3, characterized in that the algorithm parameters include: the path of the data to be classified, the model path, the result path, thresholds, basic elements, nodes and weights.
5. The big data mining analysis system based on model evaluation as claimed in claim 1, characterized in that the model evaluation module evaluates the business model as follows:
S301: input the business model, and extract test data from the distributed storage management module;
S302: use the test data to test the computational efficiency of the business model, and compare the results with the manual experience library to determine the analysis accuracy of the business model;
Obtain the business model evaluation grade according to the preset model evaluation indicators, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: according to the business model evaluation grades, select the best business model and store it in the business model library of the distributed storage management module; if the evaluation grade of the business model built cannot reach the preset evaluation grade threshold, rebuild the business model.
6. The big data mining analysis system based on model evaluation as claimed in claim 5, characterized in that the business model evaluation grades are:
A, very good: the analysis accuracy of the business model is greater than or equal to 85%, and the computational efficiency of the business model is rated a;
B, good: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated b;
C, average: the analysis accuracy of the business model is greater than or equal to 80% and less than 85%, and the computational efficiency of the business model is rated c;
D, poor: the analysis accuracy of the business model is less than 80%, and the computational efficiency of the business model is rated c.
The analysis accuracy of the business model is obtained by comparing the system's analysis results with the results of analysis by business personnel;
The computational efficiency rating of the business model is obtained by running different business models under identical computing conditions and on identical test data; the rating can be set according to business requirements and computing conditions.
7. The big data mining analysis system based on model evaluation as claimed in claim 6, characterized in that the business model evaluation grade threshold is B.
8. A big data mining analysis method based on model evaluation, characterized by comprising the following steps:
1st step, store the original big data in a unified standard format;
2nd step, for the data stored in the unified standard format, select the corresponding algorithms according to the requirements of different industries and build business models;
3rd step, grade each built business model using the model evaluation indicators to obtain its evaluation grade, and select the business model with the best evaluation grade as the business model of the system; the model evaluation indicators include: the computational efficiency of the business model and the analysis accuracy of the business model;
4th step, form an algorithm tool library provided with a uniform standard algorithm interface; at least two algorithm engine sets are provided in the algorithm tool library, for building business models and/or for data mining analysis; wherein each algorithm engine set includes at least one algorithm engine, and each algorithm engine includes at least one algorithm;
5th step, according to the data mining requirements of different industries, select a data mining analysis algorithm from the algorithm tool library.
9. The big data mining analysis method based on model evaluation as claimed in claim 8, characterized in that the business model is built as follows:
S201: according to the business requirements, extract the required sample data from the data stored in the unified standard format;
S202: according to the manual experience library, extract the data items in the sample data, perform combination operations on different data items, and select the algorithm parameters and the algorithm from the algorithm tool library;
S203: according to the selected algorithm parameters and algorithm, select different data parameters and build at least one business model;
S204: for the business model built, improve the sample library by extracting typical data from the analysis results, and continually adjust and refine the model according to the features present in the data, until the business model evaluation grade reaches the preset business model evaluation grade threshold.
10. The big data mining analysis method based on model evaluation as claimed in claim 8, characterized in that the business model is evaluated as follows:
S301: input the business model, and extract test data from the data stored in the unified standard format;
S302: use the test data to test the computational efficiency of the business model, and compare the results with the manual experience library to determine the analysis accuracy of the business model;
Obtain the business model evaluation grade according to the preset model evaluation indicators, that is, grade the business model according to its computational efficiency and analysis accuracy;
S303: according to the business model evaluation grades, select the best business model and store it in the business model library; if the evaluation grade of the business model built cannot reach the preset evaluation grade threshold, rebuild the business model.
CN201610077813.XA 2016-02-03 2016-02-03 Big data excavating analysis system and its analysis method based on model evaluation Withdrawn CN107038167A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610077813.XA CN107038167A (en) 2016-02-03 2016-02-03 Big data excavating analysis system and its analysis method based on model evaluation

Publications (1)

Publication Number Publication Date
CN107038167A true CN107038167A (en) 2017-08-11

Family

ID=59532739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610077813.XA Withdrawn CN107038167A (en) 2016-02-03 2016-02-03 Big data excavating analysis system and its analysis method based on model evaluation

Country Status (1)

Country Link
CN (1) CN107038167A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101110089A (en) * 2007-09-04 2008-01-23 华为技术有限公司 Method and system for data digging and model building
CN101620691A (en) * 2008-06-30 2010-01-06 上海全成通信技术有限公司 Automatic data mining platform in telecommunications industry
CN101621823A (en) * 2008-06-30 2010-01-06 上海全成通信技术有限公司 Method for accurately building customer portrait of mobile communication data service
CN102693317A (en) * 2012-05-29 2012-09-26 华为软件技术有限公司 Method and device for data mining process generating

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019052339A1 (en) * 2017-09-13 2019-03-21 深圳市宇数科技有限公司 Data exploration management method and system, electronic device, and storage medium
CN107919983B (en) * 2017-11-01 2020-07-10 中国科学院软件研究所 A system and method for evaluating the effectiveness of space-based information network based on data mining
CN107919983A (en) * 2017-11-01 2018-04-17 中国科学院软件研究所 A kind of space information network Effectiveness Evaluation System and method based on data mining
CN107832429A (en) * 2017-11-14 2018-03-23 广州供电局有限公司 audit data processing method and system
CN107832440B (en) * 2017-11-17 2020-10-13 北京锐安科技有限公司 Data mining method, device, server and computer readable storage medium
CN107832440A (en) * 2017-11-17 2018-03-23 北京锐安科技有限公司 A kind of data digging method, device, server and computer-readable recording medium
CN107943986A (en) * 2017-11-30 2018-04-20 睿视智觉(深圳)算法技术有限公司 A kind of big data analysis digging system
CN107943986B (en) * 2017-11-30 2022-05-17 睿视智觉(深圳)算法技术有限公司 Big data analysis mining system
CN108417270A (en) * 2018-03-01 2018-08-17 刘恩 Data model method for building up, data analysing method and data model establish device
CN108509644A (en) * 2018-04-12 2018-09-07 成都优易数据有限公司 A kind of data digging method having model pre-warning update mechanism
CN110427398A (en) * 2018-04-28 2019-11-08 北京资采信息技术有限公司 A kind of model management tool based on data mining and analysis
CN108664605A (en) * 2018-05-09 2018-10-16 北京三快在线科技有限公司 A kind of model evaluation method and system
CN109389143A (en) * 2018-06-19 2019-02-26 北京九章云极科技有限公司 A kind of Data Analysis Services system and method for automatic modeling
CN113935434A (en) * 2018-06-19 2022-01-14 北京九章云极科技有限公司 Data analysis processing system and automatic modeling method
CN109523316A (en) * 2018-11-16 2019-03-26 杭州珞珈数据科技有限公司 The automation modeling method of commerce services model
CN109783062A (en) * 2019-01-14 2019-05-21 中国科学院软件研究所 A kind of machine learning application and development method and system of people in circuit
CN109783062B (en) * 2019-01-14 2020-10-09 中国科学院软件研究所 A human-in-the-loop machine learning application development method and system
CN109960699A (en) * 2019-03-19 2019-07-02 广州供电局有限公司 Data statistics method for building up, device, computer equipment and storage medium
CN109960699B (en) * 2019-03-19 2021-11-02 广东电网有限责任公司广州供电局 Data statistics establishing method and device, computer equipment and storage medium
CN111796840A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Algorithm model updating method and device, storage medium and electronic equipment
CN110378564A (en) * 2019-06-18 2019-10-25 中国平安财产保险股份有限公司 Monitoring model generation method, device, terminal device and storage medium
CN110322143A (en) * 2019-06-28 2019-10-11 深圳前海微众银行股份有限公司 Model entity management method, device, equipment and computer storage medium
CN110322143B (en) * 2019-06-28 2023-03-24 深圳前海微众银行股份有限公司 Model materialization management method, device, equipment and computer storage medium
CN111523084A (en) * 2020-04-09 2020-08-11 京东方科技集团股份有限公司 Service data prediction method and device, electronic equipment and computer readable storage medium
CN111724028A (en) * 2020-05-08 2020-09-29 中海创科技(福建)集团有限公司 Machine equipment operation analysis and mining system based on big data technology
CN111949436A (en) * 2020-08-10 2020-11-17 星辰天合(北京)数据科技有限公司 Test data verification method, verification device and computer readable storage medium
CN112085396A (en) * 2020-09-14 2020-12-15 洛阳众智软件科技股份有限公司 Algorithm model configuration method based on state-of-the-earth space planning current situation evaluation index
CN114360045A (en) * 2020-10-14 2022-04-15 百度(美国)有限责任公司 Method, storage medium and detection device for operating detection device
WO2022225481A1 (en) * 2021-04-19 2022-10-27 İzmi̇r Ekonomi̇ Üni̇versi̇tesi̇ Creativity readiness level (crrl) determination method and business model assessment system
CN113139759A (en) * 2021-05-19 2021-07-20 杭州市电力设计院有限公司余杭分公司 Power grid data asset management method and system
CN113139759B (en) * 2021-05-19 2024-06-04 杭州市电力设计院有限公司余杭分公司 Power grid data asset management method and system
CN113516514A (en) * 2021-07-21 2021-10-19 福建天晴数码有限公司 Method and system for paying data mining value by user
CN114091253A (en) * 2021-11-22 2022-02-25 国网宁夏电力有限公司电力科学研究院 Electromagnetic environment intelligent analysis method based on big data
CN114596061A (en) * 2022-03-02 2022-06-07 穗保(广州)科技有限公司 Project data management method and system based on big data
CN118098476A (en) * 2024-04-28 2024-05-28 山东新时代药业有限公司 Electronic medical record qualification degree evaluation method and system based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20170811