CN103838617A - Method for constructing data mining platform in big data environment - Google Patents
Method for constructing data mining platform in big data environment
- Publication number
- CN103838617A (application CN201410055529.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- virtual
- virtualization
- layer
- language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for constructing a data mining platform in a big data environment. The data mining platform is suitable for processing data sets of different scales and various types and analyzing and displaying data through the rich functions of an R language. The system structure of the data mining platform is shown in the following figure and comprises a physical layer, a virtualization layer, a service layer and an application layer from bottom to top. Heterogeneous hardware resources are deployed on the physical layer. On the virtualization layer, a virtual machine group is constructed through CloudStack, and then a Hadoop environment is deployed on the virtual machine group. On the service layer, the R language is integrated, and multiple data mining functions are achieved and packaged into services. On the application layer, a clear operation interface is provided for a user to customize flow paths and configure parameters. According to the method, big data can be effectively processed, the analysis result can be effectively displayed, and high processing efficiency is achieved.
Description
Technical field
The present invention relates to a method for constructing a data mining platform in a big data environment. The platform combines technologies such as cloud computing, virtualization, and Hadoop, integrates the R language, is suited to processing data sets of different scales and varied types, and lets users perform data mining and analysis through a Web interface.
Background technology
With the advance of informatization, enterprises and public institutions generate or already hold massive amounts of business data, which contain a large amount of unknown, latent information. Data mining is a newer business information processing technology that has been widely applied in fields such as banking, telecommunications, insurance, transportation, and retail. By extracting, transforming, analyzing, and otherwise modeling large volumes of business data, it can extract key information that supports correct decision-making. As the data volumes involved keep growing, the mining and analysis of big data attracts increasing attention. However, analysis on a single machine is limited by memory size and computing power, so traditional data mining and analysis methods are no longer effective in a big data environment.
The emergence of cloud computing offers an effective approach to big data problems. Cloud computing and virtualization technology can integrate infrastructure resources effectively and supply the computing and storage capacity needed for mining and analyzing big data. Hadoop is an open-source implementation of the MapReduce programming model and provides a usable framework for computing on and storing big data. The open-source software R is a currently popular language for data analysis and statistical graphics; it has rich analysis modules and utilities and is widely used in industry. To fully mine and analyze the value of big data and give users powerful data mining and analysis functions, designing an easy-to-use big data mining platform that integrates the R language has good application value.
Summary of the invention
Object of the invention: the invention provides a method for constructing a data mining platform in a big data environment. With the R language integrated as the data analysis engine, it designs a data mining platform capable of processing data in a big data environment. Using this platform, users can address typical data mining problems such as customer segmentation, cross-selling, customer churn analysis, and customer credit evaluation.
To achieve this object, the architecture of the constructed system is as follows:
Physical layer: consists of hardware such as servers, PCs, and network equipment, and provides the hardware foundation required for big data processing.
Virtualization layer: the open-source cloud platform solution CloudStack 4.0 is used to build a virtual machine cluster and integrate infrastructure resources, giving the whole system scalable, manageable computing and storage capacity. A Hadoop environment and a MySQL cluster are then deployed on the virtual machines to support reading, writing, and storing big data.
Service layer: an RHadoop environment is deployed so that the R language engine can run on the Hadoop cluster. This exploits R's strengths in statistical computing and graphics while using Hadoop's parallel computing and scalability to compensate for R's weaknesses in handling big data. Services are developed that encapsulate the functions of commonly used data mining methods, covering 10 data mining algorithms in 4 categories: decision tree, support vector machine (SVM), and neural network algorithms for classification and prediction; K-means, PAM, CLARA, AGNES, and DIANA algorithms for cluster analysis; multiple regression for regression analysis; and the ARIMA model for time series analysis.
Application layer: presents the functions implemented by the service layer to users through a Web interface. Users can build an analysis workflow that includes setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and plotting and displaying the results.
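As a hedged illustration of how the service layer's catalogue could be represented, the sketch below lists the ten algorithms under their four categories and looks up the category of a requested algorithm. The dictionary layout and the names `ALGORITHM_CATALOG`, `category_of`, and `total_algorithms` are assumptions made for this example; the patent does not specify a data structure.

```python
# Hypothetical catalogue of the service layer's algorithms, grouped as in the text.
ALGORITHM_CATALOG = {
    "classification and prediction": ["decision tree", "SVM", "neural network"],
    "cluster analysis": ["K-means", "PAM", "CLARA", "AGNES", "DIANA"],
    "regression analysis": ["multiple regression"],
    "time series analysis": ["ARIMA"],
}


def category_of(algorithm: str) -> str:
    """Return the category that contains the given algorithm (case-insensitive)."""
    wanted = algorithm.lower()
    for category, algorithms in ALGORITHM_CATALOG.items():
        if wanted in (name.lower() for name in algorithms):
            return category
    raise KeyError(f"unknown algorithm: {algorithm}")


def total_algorithms() -> int:
    """Count the algorithms across all categories."""
    return sum(len(algorithms) for algorithms in ALGORITHM_CATALOG.values())
```

A Web front end could use such a catalogue to populate its method selector, with each entry mapped to the service that wraps the corresponding R routine.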
Technical solution: a method for constructing a data mining platform in a big data environment, comprising the following steps:
Step 1: infrastructure virtualization. Virtualization technology makes it possible to consolidate, share, and utilize host and storage resources. Infrastructure virtualization covers server virtualization, storage virtualization, and network virtualization. Virtualization is carried out mainly in two respects by setting up two virtualization pools: a computing virtualization pool and a storage virtualization pool. The computing virtualization pool mainly implements application virtualization and, at the computing resource level, comprises server virtualization and application middleware virtualization. The storage virtualization pool mainly implements data storage virtualization and, at the storage level, comprises storage hardware architecture virtualization and storage software virtualization. Following this approach, the invention sets up hardware such as a host, a management node, multiple computing nodes, and network equipment, providing the hardware foundation required for big data processing.
Step 2: deploy virtual appliances, i.e. the virtual machine instantiation stage. The procedure is roughly divided into the following steps:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target machine.
Step 3: install the open-source cloud computing solution CloudStack. With CloudStack as the foundation, users can quickly and easily create a private cloud computing platform on existing infrastructure. The installation process mainly comprises the following steps:
(1) configure the installation source (required on both management and computing nodes);
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the host (HOST);
(5) configure the security policy, network bridge, firewall, NFS shares, etc.
Step 4: service layer. An RHadoop environment is deployed so that the R language engine can run on the Hadoop cluster. To shield users from the complexity of the R language, the JRI dynamic link library needs to be configured; the actual computation is carried out by calling the R language at the bottom layer.
Step 5: processing massive data in relational databases. R and Hadoop are combined to operate on large-scale data in relational databases. The invention adopts a solution that reads and processes massive data records in a relational database more efficiently: the open-source tool Sqoop exports the large amount of data to be analyzed as text data files and uploads them to HDFS, so that the task becomes distributed processing of a text data set.
Step 6: workflow-based operation. The functions implemented by the service layer are presented to users through a Web interface. Users can customize an analysis workflow according to their own needs, including setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and plotting and displaying the results.
By adopting the above technical solution, the invention has the following beneficial effects:
(1) Cloud computing and virtualization technology are used to integrate infrastructure resources, providing the platform with computing and storage capacity that is easy to manage in a unified way and highly scalable.
(2) The most suitable processing mode is chosen for data sets of different scales: when the data is too large to be processed on a single machine, the Hadoop cluster provides support. In addition, Hadoop's multi-replica storage policy, the heartbeat mechanism during task execution, and the database cluster and replication technology ensure that the platform has high fault tolerance.
(3) To make the data mining algorithms extensible, several design patterns are used to optimize the interface design, so that the parameter configuration interface of the presentation layer is loosely coupled from the logic by which R analyzes the data.
(4) Mainstream data mining algorithms are provided, and three major classes of data are supported: structured data (MySQL, SQL Server, and file formats such as txt, csv, and xls), semi-structured data (file formats such as XML and HTML), and unstructured data (image files such as jpg and bmp, and GIS base maps).
(5) The whole platform integrates 8 open-source software packages, giving a high cost-performance ratio.
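Effects (2) and (4) can be illustrated with a minimal sketch, given here under stated assumptions rather than as the platform's actual logic. The extension table is taken from the file formats listed above; the function names and the `single_machine_limit_bytes` threshold are hypothetical, since the patent does not say how the platform decides that a data set is too large for one machine.

```python
from pathlib import PurePosixPath

# Extension-to-class mapping taken from the file formats listed in the text.
DATA_CLASS_BY_EXTENSION = {
    ".txt": "structured",
    ".csv": "structured",
    ".xls": "structured",
    ".xml": "semi-structured",
    ".html": "semi-structured",
    ".jpg": "unstructured",
    ".bmp": "unstructured",
}


def data_class(filename: str) -> str:
    """Classify a file as structured, semi-structured, or unstructured by extension."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in DATA_CLASS_BY_EXTENSION:
        raise ValueError(f"unsupported file type: {filename}")
    return DATA_CLASS_BY_EXTENSION[extension]


def choose_engine(size_bytes: int, single_machine_limit_bytes: int) -> str:
    """Pick the processing mode; the limit is a hypothetical, caller-supplied threshold."""
    if size_bytes <= single_machine_limit_bytes:
        return "single-machine R"
    return "Hadoop cluster"
```

For example, `choose_engine(50 * 2**30, 4 * 2**30)` routes a 50 GiB data set to the Hadoop cluster when the single-machine limit is set to 4 GiB.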
Brief description of the drawings
Fig. 1 is the architecture diagram of the data mining platform in a big data environment.
Fig. 2 is the business flow chart of the application layer.
Detailed description of the embodiments
The invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading the invention, modifications by those skilled in the art to its various equivalent forms all fall within the scope defined by the claims appended to this application.
The architecture of the data mining platform in a big data environment is shown in Fig. 1. Its construction comprises the following steps:
Step 1: infrastructure virtualization. Virtualization technology makes it possible to consolidate, share, and utilize host and storage resources, which improves resource utilization, lowers costs, and reduces management complexity. Infrastructure virtualization covers server virtualization, storage virtualization, and network virtualization. The invention carries out virtualization mainly in two respects by setting up two virtualization pools: a computing virtualization pool and a storage virtualization pool. The computing virtualization pool mainly implements application virtualization and, at the computing resource level, comprises server virtualization and application middleware virtualization. The storage virtualization pool mainly implements data storage virtualization and, at the storage level, comprises storage hardware architecture virtualization and storage software virtualization. Following this approach, the invention sets up hardware such as a host, a management node, multiple computing nodes, and network equipment, providing the hardware foundation required for big data processing.
Step 2: deploy virtual appliances, i.e. the virtual machine instantiation stage. The procedure is roughly divided into the following steps:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target physical server.
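The five deployment steps can be walked through with a small simulation. This is only a sketch under stated assumptions: local directories stand in for the appliance template and the target physical server, the file name `customization.json` is invented for the example, and step (5) merely returns a status record instead of starting a real virtual machine. It does not use any CloudStack API.

```python
import json
import shutil
import tempfile
from pathlib import Path


def deploy_appliance(appliance_dir: Path, target_server_dir: Path,
                     instance_name: str, params: dict) -> dict:
    """Simulate the five deployment steps using local directories."""
    # (1) Select the virtual appliance and customize it (here: validate the selection).
    if not appliance_dir.is_dir():
        raise FileNotFoundError(f"appliance not found: {appliance_dir}")

    with tempfile.TemporaryDirectory() as staging:
        # (2) Save the customization parameters to a file.
        params_file = Path(staging) / "customization.json"
        params_file.write_text(json.dumps(params, indent=2))

        # (3) Select the target physical server (a directory stands in for it).
        instance_dir = target_server_dir / instance_name

        # (4) Copy the appliance's files and the parameter file to the target.
        shutil.copytree(appliance_dir, instance_dir)
        shutil.copy2(params_file, instance_dir / params_file.name)

    # (5) Start the deployed appliance (simulated by returning a status record).
    return {"instance": str(instance_dir), "state": "started"}
```

Keeping the parameter file separate from the appliance template means the same template can be instantiated many times with different settings.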
Step 3: install the open-source cloud computing solution CloudStack. With CloudStack as the foundation, users can quickly and easily create a private cloud computing platform on existing infrastructure. The installation process mainly comprises the following steps:
(1) configure the installation source (required on both management and computing nodes);
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the host (HOST);
(5) configure the security policy, network bridge, firewall, NFS shares, etc.
Step 4: service layer. An RHadoop environment is deployed so that the R language engine can run on the Hadoop cluster. This exploits R's strengths in statistical computing and graphics while using Hadoop's parallel computing and scalability to compensate for R's weaknesses in handling big data. The specific configuration steps are: 1. install the Ubuntu operating system; 2. set up the Java environment; 3. set up the Hadoop environment; 4. install the dependency packages (rmr, rhdfs, rhbase). To shield users from the complexity of the R language, Rserve or the JRI dynamic link library needs to be configured to enable cross-platform calls to R; the actual computation is carried out by calling R at the bottom layer. Rserve is a client/server program based on the TCP/IP protocol that allows R to communicate with other languages. It is used as follows: 1. install the dependency package (Rserve): install.packages("Rserve"); 2. start the service: enter R CMD Rserve on the command line.
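Once Rserve has been started, a client can check that it is reachable before sending any work. The sketch below is an assumption-laden illustration, not part of the patent: it relies on Rserve's default port 6311 and on the QAP1 handshake described in Rserve's documentation, in which the server first sends a 32-byte identification string beginning with `Rsrv`. The function names are invented for this example.

```python
import socket

RSERVE_DEFAULT_PORT = 6311  # Rserve's default TCP port


def parse_rserve_banner(banner: bytes) -> dict:
    """Extract the version and protocol fields from the 32-byte identification string."""
    if len(banner) != 32 or banner[:4] != b"Rsrv":
        raise ValueError("not an Rserve identification string")
    return {
        "version": banner[4:8].decode("ascii"),
        "protocol": banner[8:12].decode("ascii"),
    }


def probe_rserve(host: str = "127.0.0.1", port: int = RSERVE_DEFAULT_PORT,
                 timeout: float = 3.0) -> dict:
    """Connect to an Rserve instance and return its parsed identification string."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        banner = b""
        # Keep reading until the full 32-byte banner has arrived or the peer closes.
        while len(banner) < 32:
            chunk = conn.recv(32 - len(banner))
            if not chunk:
                break
            banner += chunk
    return parse_rserve_banner(banner)
```

Calling `probe_rserve()` requires a running Rserve instance; `parse_rserve_banner` can be exercised on its own with a sample banner.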
Step 5: processing massive data in relational databases. R has several interfaces to relational database management systems, but for massive data records R still suffers from memory limits and low processing efficiency. The invention combines R and Hadoop to operate on large-scale data in relational databases. Hadoop provides interfaces for querying and reading data from relational databases; although these interfaces allow data records read directly from the database to serve as MapReduce input, the processing efficiency is low, and frequent large-volume queries and reads from MapReduce programs may greatly increase the access load on the database. The invention therefore adopts a solution that reads and processes massive data records in a relational database more efficiently: the open-source tool Sqoop exports the large amount of data to be analyzed as text data files and uploads them to HDFS, so that the task becomes distributed processing of a text data set.
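A hedged sketch of the Sqoop step follows. It only assembles a `sqoop import` command line and does not run it. Note one difference from the description: Sqoop's import tool writes the text files directly into HDFS, so export and upload happen in a single command. The JDBC URL, user name, table, and HDFS path in the usage note are hypothetical placeholders, and the helper name `build_sqoop_import` is invented for this example.

```python
import shlex


def build_sqoop_import(jdbc_url: str, username: str, table: str,
                       target_dir: str, num_mappers: int = 4,
                       field_separator: str = ",") -> list:
    """Assemble the argument list for importing one table into HDFS as text files."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC connection string of the source database
        "--username", username,
        "-P",                           # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,     # HDFS directory that receives the files
        "--as-textfile",                # plain-text output, one record per line
        "--fields-terminated-by", field_separator,
        "--num-mappers", str(num_mappers),  # number of parallel import tasks
    ]


def as_shell_command(argv: list) -> str:
    """Render the argument list as a single, safely quoted shell command."""
    return shlex.join(argv)
```

For instance, `as_shell_command(build_sqoop_import("jdbc:mysql://db.example.com/sales", "analyst", "orders", "/data/orders"))` yields a command that could be run on a node where Sqoop is installed; the resulting text files can then be read by RHadoop jobs.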
Step 6: the method for operating of procedure.The various functions that realize to user service layer in the mode at Web interface.User can, according to the self-defined analysis process of self-demand, comprise: Data Source, selection analysis method are set, analytical parameters, data mining and analysis are set, draw analysis result and show, concrete operation flow as shown in Figure 2.
Claims (3)
1. A method for constructing a data mining platform in a big data environment, characterized in that it comprises the following steps:
Step 1: infrastructure virtualization; virtualization technology is used to virtualize the infrastructure, including server virtualization, storage virtualization, and network virtualization of the physical layer, forming the virtualization layer; virtualization is carried out mainly in two respects by setting up two virtualization pools, a computing virtualization pool and a storage virtualization pool; the computing virtualization pool mainly implements application virtualization and, at the computing resource level, comprises server virtualization and application middleware virtualization; the storage virtualization pool mainly implements data storage virtualization and, at the storage level, comprises storage hardware architecture virtualization and storage software virtualization;
Step 2: deploy virtual appliances, i.e. the virtual machine instantiation stage; the procedure is roughly divided into the following steps:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target machine;
Step 3: install the open-source cloud computing solution CloudStack; with CloudStack as the foundation, a virtual machine cluster is built, and users can quickly and easily create a private cloud computing platform on existing infrastructure; the installation process mainly comprises the following steps:
(1) configure the installation source;
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the host (HOST);
(5) configure the security policy, network bridge, firewall, and NFS shares;
Step 4: service layer; an RHadoop environment is deployed so that the R language engine can run on the Hadoop cluster; the JRI dynamic link library is configured so that the actual computation is carried out by calling the R language at the bottom layer;
Step 5: processing massive data in relational databases; R and Hadoop are combined to operate on large-scale data in relational databases: the open-source tool Sqoop exports the large amount of data to be analyzed as text data files, the text data files are uploaded to HDFS, and the task thereby becomes distributed processing of a text data set;
Step 6: workflow-based operation; in the application layer, the functions implemented by the service layer are presented to users through a Web interface; users can customize an analysis workflow according to their own needs, including setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and plotting and displaying the results.
2. The method for constructing a data mining platform in a big data environment according to claim 1, characterized in that: in the service layer, the replication technology of the MySQL database and the open-source tool Sqoop are used to implement a customizable data transfer mechanism between Hadoop and the database.
3. The method for constructing a data mining platform in a big data environment according to claim 1, characterized in that: in the application layer, a B/S (browser/server) user interface is designed; users only need to operate through the graphical interface to carry out data analysis and statistics, without writing R code directly; the actual computation is carried out by calling the R language at the bottom layer, which fundamentally shields users from the complexity of the R language.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410055529.3A CN103838617A (en) | 2014-02-18 | 2014-02-18 | Method for constructing data mining platform in big data environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410055529.3A CN103838617A (en) | 2014-02-18 | 2014-02-18 | Method for constructing data mining platform in big data environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103838617A true CN103838617A (en) | 2014-06-04 |
Family
ID=50802150
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410055529.3A Pending CN103838617A (en) | 2014-02-18 | 2014-02-18 | Method for constructing data mining platform in big data environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103838617A (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102702A (en) * | 2014-07-07 | 2014-10-15 | 浪潮(北京)电子信息产业有限公司 | Software and hardware combined application-oriented big data system and method |
CN104408167A (en) * | 2014-12-09 | 2015-03-11 | 浪潮电子信息产业股份有限公司 | Method for expanding sqoop function in Hue based on django |
CN104536959A (en) * | 2014-10-16 | 2015-04-22 | 南京邮电大学 | Optimized method for accessing lots of small files for Hadoop |
CN104572118A (en) * | 2015-01-26 | 2015-04-29 | 武汉邮电科学研究院 | Big data platform constructing method based on S-PLUS |
CN104615701A (en) * | 2015-01-27 | 2015-05-13 | 深圳市融创天下科技有限公司 | Smart city embedded big data visualization engine cluster based on video cloud platform |
CN104680434A (en) * | 2015-03-05 | 2015-06-03 | 北京交通大学 | Bridge structure reliability evaluation system based on big data idea |
CN104899073A (en) * | 2015-05-28 | 2015-09-09 | 北京邮电大学 | Distributed data processing method and system |
CN105787064A (en) * | 2016-03-01 | 2016-07-20 | 广州铭诚计算机科技有限公司 | Mining platform establishment method based on big data |
CN105975574A (en) * | 2016-05-04 | 2016-09-28 | 北京思特奇信息技术股份有限公司 | R language-based large-data volume data screening method and system |
CN106250987A (en) * | 2016-07-22 | 2016-12-21 | 无锡华云数据技术服务有限公司 | A kind of machine learning method, device and big data platform |
CN106250429A (en) * | 2016-07-26 | 2016-12-21 | 浪潮软件股份有限公司 | A kind of data pick-up method based on sqoop |
CN106487775A (en) * | 2015-09-01 | 2017-03-08 | 阿里巴巴集团控股有限公司 | A kind for the treatment of method and apparatus of the business datum based on cloud platform |
CN106603624A (en) * | 2016-10-27 | 2017-04-26 | 深圳市深信服电子科技有限公司 | Data mining system and realization method thereof |
CN106656551A (en) * | 2016-10-08 | 2017-05-10 | 中国船舶重工集团公司第七〇三研究所 | Network service system |
CN106844777A (en) * | 2017-03-05 | 2017-06-13 | 何钊荣 | One kind fishing information that goes to sea is shared and big data digging system |
CN107025288A (en) * | 2017-04-14 | 2017-08-08 | 四川九鼎瑞信软件开发有限公司 | Distributed data digging method and system |
CN107113231A (en) * | 2015-01-08 | 2017-08-29 | 华为技术有限公司 | Calculating based on figure is unloaded to rear end equipment |
CN107169110A (en) * | 2017-05-19 | 2017-09-15 | 肇庆市智高电机有限公司 | A kind of big data collection method and system based on cloud service |
CN107391688A (en) * | 2017-07-25 | 2017-11-24 | 郑州云海信息技术有限公司 | A kind of construction method and device of the virtualization network management platform based on web |
CN107423823A (en) * | 2017-08-11 | 2017-12-01 | 成都优易数据有限公司 | A kind of machine learning Modeling Platform architecture design method based on R language |
CN107590263A (en) * | 2017-09-22 | 2018-01-16 | 辽宁工程技术大学 | A kind of distributed big data sorting technique based on multi-variable decision tree-model |
CN108123994A (en) * | 2016-11-28 | 2018-06-05 | 中国科学院沈阳自动化研究所 | A kind of cloud platform framework towards industrial circle |
CN108182053A (en) * | 2017-12-08 | 2018-06-19 | 北京云星宇交通科技股份有限公司 | A kind of system and method for developing operation big data business application program |
CN108763583A (en) * | 2018-06-11 | 2018-11-06 | 山东汇贸电子口岸有限公司 | A kind of microblog hot topic extracting method and system based on keyword search |
CN108984718A (en) * | 2018-07-10 | 2018-12-11 | 四川汇源吉迅数码科技有限公司 | A kind of digital content interactive system and exchange method based on big data technology |
CN109218400A (en) * | 2018-08-06 | 2019-01-15 | 深圳宇翊技术股份有限公司 | A kind of PIS center subsystem realized based on virtualization and distributed structure/architecture |
CN109408045A (en) * | 2018-11-08 | 2019-03-01 | 国久大数据有限公司 | Government system integration method and device |
CN106095391B (en) * | 2016-05-31 | 2019-03-26 | 携程计算机技术(上海)有限公司 | Calculation method and system based on big data platform and algorithm model |
CN109583941A (en) * | 2018-11-06 | 2019-04-05 | 汪浩 | A kind of advertisement delivery system |
CN109753226A (en) * | 2017-11-07 | 2019-05-14 | 阿里巴巴集团控股有限公司 | Data processing system, method and electronic equipment |
CN109886023A (en) * | 2017-12-06 | 2019-06-14 | 株洲中车时代电气股份有限公司 | A kind of data processing method, device, equipment and computer readable storage medium |
CN110187869A (en) * | 2019-05-14 | 2019-08-30 | 上海直真君智科技有限公司 | Unified inter-operation system and method between a kind of big data isomery storage computation model |
2014
- 2014-02-18 CN CN201410055529.3A patent/CN103838617A/en active Pending
Non-Patent Citations (4)
Title |
---|
CNBIRD2008: ""安装部署CloudStack 4.0企业私有云平台"", 《HTTP://BLOG.CSDN.NET/CNBIRD2008/ARTICLE/DETAILS/8576680》 * |
YFK: ""Hadoop数据传输工具sqoop"", 《HTTP://BLOG.CSDN.NET/YFKISS/ARTICLE/DETAILS/8700480/》 * |
张丹: ""让Hadoop跑在云端系列文章之创建Hadoop母体虚拟机"", 《HTTP://BLOG.FENS.ME/HADOOP-BASE-KVM/》 * |
高汉松等: ""基于云计算的医疗大数据挖掘平台"", 《医学信息杂志》 * |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102702A (en) * | 2014-07-07 | 2014-10-15 | 浪潮(北京)电子信息产业有限公司 | Software and hardware combined application-oriented big data system and method |
CN104536959B (en) * | 2014-10-16 | 2018-03-06 | 南京邮电大学 | A kind of optimization method of Hadoop accessing small high-volume files |
CN104536959A (en) * | 2014-10-16 | 2015-04-22 | 南京邮电大学 | Optimized method for accessing lots of small files for Hadoop |
CN104408167A (en) * | 2014-12-09 | 2015-03-11 | 浪潮电子信息产业股份有限公司 | Method for expanding sqoop function in Hue based on django |
CN107113231A (en) * | 2015-01-08 | 2017-08-29 | 华为技术有限公司 | Calculating based on figure is unloaded to rear end equipment |
CN107113231B (en) * | 2015-01-08 | 2019-12-17 | 华为技术有限公司 | Offloading graphics-based computing to a backend device |
CN104572118A (en) * | 2015-01-26 | 2015-04-29 | 武汉邮电科学研究院 | Big data platform constructing method based on S-PLUS |
CN104615701A (en) * | 2015-01-27 | 2015-05-13 | 深圳市融创天下科技有限公司 | Smart city embedded big data visualization engine cluster based on video cloud platform |
CN104615701B (en) * | 2015-01-27 | 2018-04-06 | 融创天下(上海)科技发展有限公司 | The embedded big data visualization engine cluster in smart city based on video cloud platform |
CN104680434A (en) * | 2015-03-05 | 2015-06-03 | 北京交通大学 | Bridge structure reliability evaluation system based on big data idea |
CN104899073A (en) * | 2015-05-28 | 2015-09-09 | 北京邮电大学 | Distributed data processing method and system |
CN106487775B (en) * | 2015-09-01 | 2020-01-21 | 阿里巴巴集团控股有限公司 | Service data processing method and device based on cloud platform |
CN106487775A (en) * | 2015-09-01 | 2017-03-08 | 阿里巴巴集团控股有限公司 | A kind for the treatment of method and apparatus of the business datum based on cloud platform |
CN105787064A (en) * | 2016-03-01 | 2016-07-20 | 广州铭诚计算机科技有限公司 | Mining platform establishment method based on big data |
CN105975574A (en) * | 2016-05-04 | 2016-09-28 | 北京思特奇信息技术股份有限公司 | R language-based large-data volume data screening method and system |
CN106095391B (en) * | 2016-05-31 | 2019-03-26 | 携程计算机技术(上海)有限公司 | Calculation method and system based on big data platform and algorithm model |
CN106250987A (en) * | 2016-07-22 | 2016-12-21 | 无锡华云数据技术服务有限公司 | A kind of machine learning method, device and big data platform |
CN106250987B (en) * | 2016-07-22 | 2019-03-01 | 无锡华云数据技术服务有限公司 | A kind of machine learning method, device and big data platform |
CN106250429A (en) * | 2016-07-26 | 2016-12-21 | 浪潮软件股份有限公司 | A kind of data pick-up method based on sqoop |
CN106656551A (en) * | 2016-10-08 | 2017-05-10 | 中国船舶重工集团公司第七〇三研究所 | Network service system |
CN106603624A (en) * | 2016-10-27 | 2017-04-26 | 深圳市深信服电子科技有限公司 | Data mining system and realization method thereof |
CN108123994B (en) * | 2016-11-28 | 2021-01-29 | 中国科学院沈阳自动化研究所 | Industrial-field-oriented cloud platform architecture |
CN108123994A (en) * | 2016-11-28 | 2018-06-05 | 中国科学院沈阳自动化研究所 | A kind of cloud platform framework towards industrial circle |
CN106844777A (en) * | 2017-03-05 | 2017-06-13 | 何钊荣 | One kind fishing information that goes to sea is shared and big data digging system |
CN107025288A (en) * | 2017-04-14 | 2017-08-08 | 四川九鼎瑞信软件开发有限公司 | Distributed data digging method and system |
CN107169110A (en) * | 2017-05-19 | 2017-09-15 | 肇庆市智高电机有限公司 | A kind of big data collection method and system based on cloud service |
CN107391688A (en) * | 2017-07-25 | 2017-11-24 | 郑州云海信息技术有限公司 | A kind of construction method and device of the virtualization network management platform based on web |
CN107423823B (en) * | 2017-08-11 | 2020-11-10 | 成都优易数据有限公司 | R language-based machine learning modeling platform architecture design method |
CN107423823A (en) * | 2017-08-11 | 2017-12-01 | 成都优易数据有限公司 | A kind of machine learning Modeling Platform architecture design method based on R language |
CN107590263B (en) * | 2017-09-22 | 2020-07-07 | 辽宁工程技术大学 | Distributed big data classification method based on multivariate decision tree model |
CN107590263A (en) * | 2017-09-22 | 2018-01-16 | 辽宁工程技术大学 | A kind of distributed big data sorting technique based on multi-variable decision tree-model |
CN109753226A (en) * | 2017-11-07 | 2019-05-14 | 阿里巴巴集团控股有限公司 | Data processing system, method and electronic equipment |
CN109886023B (en) * | 2017-12-06 | 2023-05-09 | 株洲中车时代电气股份有限公司 | Data processing method, device, equipment and computer readable storage medium |
CN109886023A (en) * | 2017-12-06 | 2019-06-14 | 株洲中车时代电气股份有限公司 | A kind of data processing method, device, equipment and computer readable storage medium |
CN108182053A (en) * | 2017-12-08 | 2018-06-19 | 北京云星宇交通科技股份有限公司 | A kind of system and method for developing operation big data business application program |
CN108763583A (en) * | 2018-06-11 | 2018-11-06 | 山东汇贸电子口岸有限公司 | A kind of microblog hot topic extracting method and system based on keyword search |
CN108984718A (en) * | 2018-07-10 | 2018-12-11 | 四川汇源吉迅数码科技有限公司 | A kind of digital content interactive system and exchange method based on big data technology |
CN109218400A (en) * | 2018-08-06 | 2019-01-15 | 深圳宇翊技术股份有限公司 | A kind of PIS center subsystem realized based on virtualization and distributed structure/architecture |
CN109583941A (en) * | 2018-11-06 | 2019-04-05 | 汪浩 | A kind of advertisement delivery system |
CN109408045A (en) * | 2018-11-08 | 2019-03-01 | 国久大数据有限公司 | Government system integration method and device |
CN110187869A (en) * | 2019-05-14 | 2019-08-30 | 上海直真君智科技有限公司 | Unified inter-operation system and method between a kind of big data isomery storage computation model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103838617A (en) | Method for constructing data mining platform in big data environment | |
Strohbach et al. | Towards a big data analytics framework for IoT and smart city applications | |
Mohanty | Big data: An introduction | |
Barik et al. | SOA-FOG: Secure service-oriented edge computing architecture for smart health big data analytics | |
Di Martino et al. | Big data (lost) in the cloud | |
Londhe et al. | Platforms for big data analytics: Trend towards hybrid era | |
CN104915793A (en) | Public information intelligent analysis platform based on big data analysis and mining | |
CN105978704A (en) | Creating new cloud resource instruction set architecture | |
Costa et al. | The SusCity big data warehousing approach for smart cities | |
CN104794150A (en) | Cloud storage model and management method based on space knowledge cloud environment | |
CN104572118A (en) | Big data platform constructing method based on S-PLUS | |
Constante Nicolalde et al. | Big data analytics in IOT: challenges, open research issues and tools | |
CN109063980A (en) | Memory calculation method and system suitable for electrical network analysis | |
CN104299170B (en) | Intermittent energy source mass data processing method | |
CN111951935A (en) | Medical cloud system, method, system and medium for medical big data processing | |
Li et al. | Survey of recent research progress and issues in big data | |
Chen et al. | A decoupled execution paradigm for data-intensive high-end computing | |
Tsinaraki et al. | Big Data–a step change for SDI? | |
Mangla et al. | A comprehensive review: Internet of things (IoT) | |
Yang et al. | On construction of the air pollution monitoring service with a hybrid database converter | |
CN106293949A (en) | Resource dispatching strategy based on baseline analysis under a kind of computing environment | |
US20190220532A1 (en) | Data processing with nullable schema information | |
Biswas et al. | Iot, cloud and BigData integration for Iot analytics | |
Liu et al. | Accelerating large-scale DEVS-based simulation on the cell processor | |
Singh et al. | Big Data Knowledge Discovery as a Service: Recent Trends and Challenges |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20140604 |