CN103838617A - Method for constructing data mining platform in big data environment - Google Patents

Method for constructing data mining platform in big data environment

Info

Publication number: CN103838617A
Authority: CN (China)
Legal status: Pending
Application number: CN201410055529.3A
Other languages: Chinese (zh)
Inventors: 叶枫, 王亚普, 周发超, 周远超
Current assignee: Hohai University (HHU)
Original assignee: Hohai University (HHU)
Priority date: 2014-02-18
Filing date: 2014-02-18
Publication date: 2014-06-04
Application filed by Hohai University (HHU)

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for constructing a data mining platform in a big data environment. The platform is suited to processing data sets of different scales and types and to analyzing and displaying data using the rich functionality of the R language. As shown in the accompanying figure, the platform comprises, from bottom to top, a physical layer, a virtualization layer, a service layer and an application layer. Heterogeneous hardware resources are deployed on the physical layer. On the virtualization layer, a virtual machine cluster is built with CloudStack, and a Hadoop environment is then deployed on that cluster. On the service layer, the R language is integrated, and multiple data mining functions are implemented and packaged as services. On the application layer, a clear operation interface lets users customize workflows and configure parameters. The method processes big data effectively, displays the analysis results effectively, and achieves high processing efficiency.

Description

Method for constructing a data mining platform in a big data environment
Technical field
The present invention relates to a method for constructing a data mining platform in a big data environment. The platform combines cloud computing, virtualization and Hadoop, integrates the R language, handles data sets of different scales and types, and lets users perform data mining and analysis through a Web interface.
Background art
As informatization advances, enterprises and public institutions generate or already hold massive volumes of business data, which contain a great deal of unknown, potentially valuable information. Data mining is a relatively new technique for processing business information and is widely applied in banking, telecommunications, insurance, transportation, retail and other fields. By extracting, transforming, analyzing and otherwise modeling large volumes of business data, it can yield the key information needed to support sound decisions. As data volumes keep growing, the mining and analysis of big data attracts increasing attention. However, analysis on a single machine is limited by memory capacity and computing power, so traditional data mining and analysis methods are no longer effective in a big data environment.
The emergence of cloud computing offers an effective way to address the big data problem. Cloud computing and virtualization technology can integrate infrastructure resources effectively and supply the computing and storage capacity needed to mine and analyze big data. Hadoop is an open-source implementation of the MapReduce programming model and provides a usable framework for computing on and storing big data. The open-source software R is a popular language for data analysis and statistical graphics, has a rich set of analysis modules and utilities, and is widely used in industry. To fully mine and analyze the value of big data and give users powerful data mining and analysis functions, an easy-to-use big data mining platform that integrates the R language has good practical value.
Summary of the invention
Objective of the invention: the invention provides a method for constructing a data mining platform in a big data environment. With the R language integrated as the data analysis engine, the platform is designed to process data in a big data environment. Users can apply it to typical data mining problems such as customer segmentation, cross-selling, customer churn analysis and customer credit evaluation.
To achieve this objective, the constructed system has the following architecture:
Physical layer: consists of hardware such as servers, PCs and network equipment, and provides the hardware foundation required for big data processing.
Virtualization layer: uses the open-source cloud platform CloudStack 4.0 to build a virtual machine cluster and integrate infrastructure resources, giving the whole system scalable, manageable computing and storage capacity. A Hadoop environment and a MySQL cluster are then deployed on the virtual machines to support reading, writing and storing big data.
Service layer: deploys the RHadoop environment so that the R engine can run on the Hadoop cluster. This exploits R's strength in statistical computing and graphics while using Hadoop's parallel computing and scalability to compensate for R's weakness in processing big data. Services are developed that encapsulate commonly used data mining methods: 10 algorithms in 4 categories, namely the decision tree, support vector machine (SVM) and neural network algorithms for classification and prediction; the K-means, PAM, CLARA, AGNES and DIANA algorithms for cluster analysis; multiple regression for regression analysis; and the ARIMA model for time series analysis.
Application layer: presents the functions implemented in the service layer to users through a Web interface. Users can build an analysis flow that covers setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and producing and displaying the results.
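The service-layer catalog of 10 algorithms in 4 categories can be sketched as a lookup table that a service dispatches on. This is a minimal illustration only: the patent does not say which R packages implement each algorithm, so the R function names below (all of which exist in R or in commonly used CRAN packages) are assumptions, as are the Python names.

```python
from typing import Dict

# Hypothetical catalog: category -> {algorithm: R function that could back it}.
# The choice of R packages is an illustrative assumption, not taken from the patent.
ALGORITHM_CATALOG: Dict[str, Dict[str, str]] = {
    "classification_and_prediction": {
        "decision_tree": "rpart::rpart",
        "svm": "e1071::svm",
        "neural_network": "nnet::nnet",
    },
    "cluster_analysis": {
        "kmeans": "stats::kmeans",
        "pam": "cluster::pam",
        "clara": "cluster::clara",
        "agnes": "cluster::agnes",
        "diana": "cluster::diana",
    },
    "regression_analysis": {
        "multiple_regression": "stats::lm",
    },
    "time_series_analysis": {
        "arima": "stats::arima",
    },
}


def resolve_r_function(algorithm: str) -> str:
    """Return the R function registered for an algorithm name."""
    for algorithms in ALGORITHM_CATALOG.values():
        if algorithm in algorithms:
            return algorithms[algorithm]
    raise KeyError(f"unknown algorithm: {algorithm}")


def count_algorithms() -> int:
    """Total number of algorithms across all categories."""
    return sum(len(algorithms) for algorithms in ALGORITHM_CATALOG.values())
```

Keeping the catalog as data rather than code means a new algorithm can be registered without touching the Web layer, which fits the loose coupling the platform aims for.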
Technical solution: a method for constructing a data mining platform in a big data environment, comprising the following steps:
Step 1: Infrastructure virtualization. Virtualization technology allows host and storage resources to be consolidated, shared and utilized. Infrastructure virtualization comprises server virtualization, storage virtualization and network virtualization. Virtualization is carried out mainly in two respects by establishing two virtualized pools: a computing pool and a storage pool. The computing pool mainly virtualizes applications; at the computing-resource level it comprises server virtualization and application middleware virtualization. The storage pool mainly virtualizes data storage; at the storage level it comprises virtualization of the storage hardware architecture and of the storage software. Following this approach, the invention sets up hardware including hosts, a management node, multiple computing nodes and network equipment, providing the hardware foundation required for big data processing.
Step 2: Deploy the virtual appliance, i.e. the virtual machine instantiation stage. The flow is roughly as follows:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target machine.
Step 3: Install the open-source cloud computing solution CloudStack. With CloudStack as the foundation, users can quickly and easily create a private cloud computing platform on existing infrastructure. Installation mainly comprises the following steps:
(1) configure the installation source (required on both the management node and the computing nodes);
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the hosts (HOST);
(5) configure the security policy, network bridge, firewall, NFS share, etc.
Step 4: Service layer. Deploy the RHadoop environment so that the R engine can run on the Hadoop cluster. To hide the complexity of the R language, the JRI dynamic link library is configured, and the actual computation is carried out by calling R at the bottom layer.
Step 5: Process massive data in relational databases. R and Hadoop are combined to operate on large-scale data in relational databases. The invention adopts a solution that reads and processes massive numbers of relational database records more efficiently: the open-source tool Sqoop exports the data to be analyzed as text data files and uploads them to HDFS, so that the task becomes distributed processing of a text data set.
Step 6: Process-oriented operation. The functions implemented in the service layer are presented to users through a Web interface. Users can define their own analysis flow according to their needs, covering: setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and producing and displaying the results.
By adopting the above technical solution, the invention has the following beneficial effects:
(1) Cloud computing and virtualization technology integrate infrastructure resources, giving the platform centrally managed computing and storage capacity with high scalability.
(2) The most suitable processing mode is used for data sets of different scales: when the data exceed what a single machine can process, the Hadoop cluster provides support. In addition, Hadoop's multi-replica storage policy, the heartbeat mechanism during task execution, and database clustering and replication give the platform high fault tolerance.
(3) To make the data mining algorithms extensible, several design patterns are used to optimize the interface design, so that the parameter configuration interface of the presentation layer is loosely coupled to the logic with which R analyzes the data.
(4) Mainstream data mining algorithms are provided, and three major classes of data are supported: structured (MySQL, SQL Server, txt, csv, xls and similar formats), semi-structured (XML, HTML and similar formats) and unstructured (image files such as jpg and bmp, and GIS base maps).
(5) The platform integrates eight open-source software packages and is therefore cost-effective.
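Effect (2) implies a decision between single-machine processing and the Hadoop cluster. The patent does not state the criterion, so the sketch below assumes a simple rule: a data set is processed on one machine only if it fits within a fraction of that machine's available memory. The function name, the `safety_factor` parameter and its default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPlan:
    mode: str    # "standalone" (single-machine R) or "hadoop" (cluster)
    reason: str


def choose_execution_mode(
    dataset_bytes: int,
    available_memory_bytes: int,
    safety_factor: float = 0.5,
) -> ExecutionPlan:
    """Pick a processing mode from the data size (assumed rule, not from the patent).

    R holds its working data in memory, so the data set is only handled on a
    single machine when it fits within `safety_factor` of the available memory.
    """
    if dataset_bytes < 0 or available_memory_bytes <= 0:
        raise ValueError("sizes must be non-negative and memory must be positive")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")

    budget = available_memory_bytes * safety_factor
    if dataset_bytes <= budget:
        return ExecutionPlan("standalone", "data set fits in the memory budget")
    return ExecutionPlan("hadoop", "data set exceeds the single-machine memory budget")
```

A real platform would likely also weigh the algorithm's memory overhead and the cluster's current load; the threshold here only makes the two-mode idea concrete.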
Brief description of the drawings
Fig. 1 is the architecture diagram of the data mining platform in a big data environment.
Fig. 2 is the business flow chart of the application layer.
Detailed description of the embodiments
The invention is further illustrated below with reference to specific embodiments. These embodiments serve only to illustrate the invention and do not limit its scope. After reading this disclosure, modifications by those skilled in the art to the various equivalent forms of the invention all fall within the scope defined by the appended claims of this application.
The architecture of the data mining platform in a big data environment is shown in Fig. 1. Its construction comprises the following steps:
Step 1: Infrastructure virtualization. Virtualization technology allows host and storage resources to be consolidated, shared and utilized, which improves resource utilization, lowers cost and reduces management complexity. Infrastructure virtualization comprises server virtualization, storage virtualization and network virtualization. The invention carries out virtualization mainly in two respects by establishing two virtualized pools: a computing pool and a storage pool. The computing pool mainly virtualizes applications; at the computing-resource level it comprises server virtualization and application middleware virtualization. The storage pool mainly virtualizes data storage; at the storage level it comprises virtualization of the storage hardware architecture and of the storage software. Following this approach, the invention sets up hardware including hosts, a management node, multiple computing nodes and network equipment, providing the hardware foundation required for big data processing.
Step 2: Deploy the virtual appliance, i.e. the virtual machine instantiation stage. The flow is roughly as follows:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target physical server.
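The five deployment sub-steps can be walked through with a small file-based sketch. It is only a stand-in for a real cloud platform: directories play the role of physical servers, the default parameters are invented, and "starting" the appliance merely writes a marker file where a real system would boot the virtual machine through its hypervisor. All names are hypothetical.

```python
import json
import shutil
from pathlib import Path
from typing import Any, Dict

# Hypothetical default customization parameters for an appliance.
DEFAULT_PARAMS: Dict[str, Any] = {"cpu_cores": 2, "memory_mb": 4096}


def deploy_appliance(
    appliance_dir: Path,
    hosts: Dict[str, Path],
    host_name: str,
    overrides: Dict[str, Any],
) -> Path:
    """Walk through sub-steps (1) to (5) using plain files and directories."""
    # (1) Select the appliance and customize it: user overrides win over defaults.
    params = {**DEFAULT_PARAMS, **overrides}

    # (2) Save the customization parameter file next to the appliance files.
    param_file = appliance_dir / "params.json"
    param_file.write_text(json.dumps(params, indent=2, sort_keys=True))

    # (3) Select the target physical server (a directory stands in for it).
    target_root = hosts[host_name]

    # (4) Copy the appliance files, including the parameter file, to the target.
    deployed = target_root / appliance_dir.name
    shutil.copytree(appliance_dir, deployed)

    # (5) "Start" the deployed appliance. A real platform would boot the VM
    # here; this sketch only records that the step was reached.
    (deployed / "STARTED").write_text(host_name)
    return deployed
```

Saving the parameter file before copying means the customization travels with the image, so the same appliance can be instantiated on several hosts with different settings.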
Step 3: Install the open-source cloud computing solution CloudStack. With CloudStack as the foundation, users can quickly and easily create a private cloud computing platform on existing infrastructure. Installation mainly comprises the following steps:
(1) configure the installation source (required on both the management node and the computing nodes);
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the hosts (HOST);
(5) configure the security policy, network bridge, firewall, NFS share, etc.
Step 4: Service layer. Deploy the RHadoop environment so that the R engine can run on the Hadoop cluster. This exploits R's strength in statistical computing and graphics while using Hadoop's parallel computing and scalability to compensate for R's weakness in processing big data. The configuration steps are as follows: 1. install the Ubuntu operating system. 2. set up the Java environment. 3. set up the Hadoop environment. 4. install the dependency libraries (rmr, rhdfs, rhbase). To hide the complexity of the R language, Rserve or the JRI dynamic link library is configured so that R can be called across platforms; the actual computation is carried out by calling R at the bottom layer. Rserve is a client/server program based on TCP/IP that lets R communicate with other languages. It is used as follows: 1. install the dependency library (Rserve): install.packages("Rserve"). 2. start the service: enter R CMD Rserve at the command line.
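To illustrate why running analysis on Hadoop compensates for R's memory limits, the following toy simulates the MapReduce model in a single process: a statistic is split into a map phase that emits key/value pairs and a reduce phase that aggregates each key independently, so the work can be spread across nodes. This is not the rmr API; the function names and the sample record format are assumptions made for the illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

KeyValue = Tuple[str, float]


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[KeyValue]],
    reducer: Callable[[str, List[float]], float],
) -> Dict[str, float]:
    """Run map, shuffle and reduce in-process (Hadoop distributes these phases)."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for record in records:                   # map phase
        for key, value in mapper(record):
            groups[key].append(value)        # shuffle: group values by key
    return {key: reducer(key, values) for key, values in groups.items()}  # reduce


def parse_sale(line: str) -> Iterator[KeyValue]:
    """Mapper: turn a hypothetical 'region,amount' text line into a key/value pair."""
    region, amount = line.split(",")
    yield region.strip(), float(amount)


def mean_reducer(key: str, values: List[float]) -> float:
    """Reducer: average all amounts observed for one region."""
    return sum(values) / len(values)
```

Because each reducer sees only the values for its own key, no single process needs the whole data set in memory, which is the property the platform relies on when a data set outgrows one machine.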
Step 5: Process massive data in relational databases. R has several interfaces to relational database management systems, but for massive numbers of records R is again constrained by memory and processes them inefficiently. The invention combines R and Hadoop to operate on large-scale data in relational databases. Hadoop provides interfaces for querying and reading data from relational databases. Although these interfaces allow records read directly from the database to serve as MapReduce input, processing efficiency is low, and frequent bulk queries and reads from MapReduce programs may greatly increase the access load on the database. The invention therefore adopts a solution that reads and processes massive numbers of relational database records more efficiently: the open-source tool Sqoop exports the data to be analyzed as text data files and uploads them to HDFS, so that the task becomes distributed processing of a text data set.
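The export described in Step 5 could be driven by a Sqoop import command along the following lines. Sqoop's import tool writes delimited text directly into an HDFS directory, so the export and the upload can happen in one command. The flags used are standard Sqoop 1 import arguments; the JDBC URL, user, table and paths in the usage note are hypothetical, and the function is only a sketch that builds the argument list without running anything.

```python
from typing import List


def build_sqoop_import(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    target_dir: str,
    num_mappers: int = 4,
    field_delimiter: str = ",",
) -> List[str]:
    """Build the argument list for a Sqoop import of one table into HDFS as text."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,               # JDBC connection string of the source database
        "--username", username,
        "--password-file", password_file,    # file holding the password, not the password itself
        "--table", table,                    # relational table to export
        "--target-dir", target_dir,          # HDFS directory that receives the text files
        "--as-textfile",                     # plain delimited text output
        "--fields-terminated-by", field_delimiter,
        "--num-mappers", str(num_mappers),   # number of parallel import tasks
    ]
```

For example, `build_sqoop_import("jdbc:mysql://db-host/sales", "analyst", "/user/analyst/.pw", "orders", "/data/orders")` yields a list that can be passed to `subprocess.run(cmd, check=True)` on a machine where Sqoop is installed. Passing a list rather than a shell string avoids quoting problems with delimiters.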
Step 6: the method for operating of procedure.The various functions that realize to user service layer in the mode at Web interface.User can, according to the self-defined analysis process of self-demand, comprise: Data Source, selection analysis method are set, analytical parameters, data mining and analysis are set, draw analysis result and show, concrete operation flow as shown in Figure 2.

Claims (3)

1. A method for constructing a data mining platform in a big data environment, characterized by comprising the following steps:
Step 1: Infrastructure virtualization; virtualization technology is used to virtualize the infrastructure, comprising server virtualization, storage virtualization and network virtualization of the physical layer, thereby forming the virtualization layer; virtualization is carried out mainly in two respects by establishing two virtualized pools, a computing pool and a storage pool; the computing pool mainly virtualizes applications and, at the computing-resource level, comprises server virtualization and application middleware virtualization; the storage pool mainly virtualizes data storage and, at the storage level, comprises virtualization of the storage hardware architecture and of the storage software;
Step 2: Deploy the virtual appliance, i.e. the virtual machine instantiation stage; the flow is roughly as follows:
(1) select a virtual appliance and customize it;
(2) save the customization parameter file;
(3) select the target physical server for deployment;
(4) copy the files associated with the virtual appliance;
(5) start the deployed virtual appliance on the target machine;
Step 3: Install the open-source cloud computing solution CloudStack; with CloudStack as the foundation, a virtual machine cluster is built and users can quickly and easily create a private cloud computing platform on existing infrastructure; installation mainly comprises the following steps:
(1) configure the installation source;
(2) install the CloudStack Management Server;
(3) install the MySQL database;
(4) install the hosts (HOST);
(5) configure the security policy, network bridge, firewall and NFS share;
Step 4: Service layer; deploy the RHadoop environment so that the R engine can run on the Hadoop cluster; configure the JRI dynamic link library so that the actual computation is carried out by calling R at the bottom layer;
Step 5: Process massive data in relational databases; R and Hadoop are combined to operate on large-scale data in relational databases: the open-source tool Sqoop exports the data to be analyzed as text data files, the text data files are uploaded to HDFS, and the task thereby becomes distributed processing of a text data set;
Step 6: Process-oriented operation; in the application layer, the functions implemented in the service layer are presented to users through a Web interface; users can define their own analysis flow according to their needs, covering: setting the data source, selecting the analysis method, setting the analysis parameters, running the data mining and analysis, and producing and displaying the results.
2. The method for constructing a data mining platform in a big data environment according to claim 1, characterized in that: in the service layer, MySQL database replication and the open-source tool Sqoop are used to implement a customizable data transfer mechanism between Hadoop and the database.
3. The method for constructing a data mining platform in a big data environment according to claim 1, characterized in that: in the application layer, a user interface in B/S (browser/server) mode is designed; users only need to operate a graphical interface to perform data analysis and statistics and do not need to write R code directly; the actual computation is carried out by calling R at the bottom layer, which fundamentally hides the complexity of the R language.
CN201410055529.3A 2014-02-18 2014-02-18 Method for constructing data mining platform in big data environment Pending CN103838617A (en)

Publications (1)

Publication Number Publication Date
CN103838617A true CN103838617A (en) 2014-06-04

Family

ID=50802150

Country Status (1)

Country Link
CN (1) CN103838617A (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CNBIRD2008: ""安装部署CloudStack 4.0企业私有云平台"", 《HTTP://BLOG.CSDN.NET/CNBIRD2008/ARTICLE/DETAILS/8576680》 *
YFK: ""Hadoop数据传输工具sqoop"", 《HTTP://BLOG.CSDN.NET/YFKISS/ARTICLE/DETAILS/8700480/》 *
张丹: ""让Hadoop跑在云端系列文章之创建Hadoop母体虚拟机"", 《HTTP://BLOG.FENS.ME/HADOOP-BASE-KVM/》 *
高汉松等: ""基于云计算的医疗大数据挖掘平台"", 《医学信息杂志》 *

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102702A (en) * 2014-07-07 2014-10-15 浪潮(北京)电子信息产业有限公司 Software and hardware combined application-oriented big data system and method
CN104536959B (en) * 2014-10-16 2018-03-06 南京邮电大学 A kind of optimization method of Hadoop accessing small high-volume files
CN104536959A (en) * 2014-10-16 2015-04-22 南京邮电大学 Optimized method for accessing lots of small files for Hadoop
CN104408167A (en) * 2014-12-09 2015-03-11 浪潮电子信息产业股份有限公司 Method for expanding sqoop function in Hue based on django
CN107113231A (en) * 2015-01-08 2017-08-29 华为技术有限公司 Calculating based on figure is unloaded to rear end equipment
CN107113231B (en) * 2015-01-08 2019-12-17 华为技术有限公司 Offloading graphics-based computing to a backend device
CN104572118A (en) * 2015-01-26 2015-04-29 武汉邮电科学研究院 Big data platform constructing method based on S-PLUS
CN104615701A (en) * 2015-01-27 2015-05-13 深圳市融创天下科技有限公司 Smart city embedded big data visualization engine cluster based on video cloud platform
CN104615701B (en) * 2015-01-27 2018-04-06 融创天下(上海)科技发展有限公司 The embedded big data visualization engine cluster in smart city based on video cloud platform
CN104680434A (en) * 2015-03-05 2015-06-03 北京交通大学 Bridge structure reliability evaluation system based on big data idea
CN104899073A (en) * 2015-05-28 2015-09-09 北京邮电大学 Distributed data processing method and system
CN106487775B (en) * 2015-09-01 2020-01-21 阿里巴巴集团控股有限公司 Service data processing method and device based on cloud platform
CN106487775A (en) * 2015-09-01 2017-03-08 阿里巴巴集团控股有限公司 A kind for the treatment of method and apparatus of the business datum based on cloud platform
CN105787064A (en) * 2016-03-01 2016-07-20 广州铭诚计算机科技有限公司 Mining platform establishment method based on big data
CN105975574A (en) * 2016-05-04 2016-09-28 北京思特奇信息技术股份有限公司 R language-based large-data volume data screening method and system
CN106095391B (en) * 2016-05-31 2019-03-26 携程计算机技术(上海)有限公司 Calculation method and system based on big data platform and algorithm model
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106250987B (en) * 2016-07-22 2019-03-01 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106250429A (en) * 2016-07-26 2016-12-21 浪潮软件股份有限公司 A kind of data pick-up method based on sqoop
CN106656551A (en) * 2016-10-08 2017-05-10 中国船舶重工集团公司第七〇三研究所 Network service system
CN106603624A (en) * 2016-10-27 2017-04-26 深圳市深信服电子科技有限公司 Data mining system and realization method thereof
CN108123994B (en) * 2016-11-28 2021-01-29 中国科学院沈阳自动化研究所 Industrial-field-oriented cloud platform architecture
CN108123994A (en) * 2016-11-28 2018-06-05 中国科学院沈阳自动化研究所 A kind of cloud platform framework towards industrial circle
CN106844777A (en) * 2017-03-05 2017-06-13 何钊荣 One kind fishing information that goes to sea is shared and big data digging system
CN107025288A (en) * 2017-04-14 2017-08-08 四川九鼎瑞信软件开发有限公司 Distributed data digging method and system
CN107169110A (en) * 2017-05-19 2017-09-15 肇庆市智高电机有限公司 A kind of big data collection method and system based on cloud service
CN107391688A (en) * 2017-07-25 2017-11-24 郑州云海信息技术有限公司 A kind of construction method and device of the virtualization network management platform based on web
CN107423823B (en) * 2017-08-11 2020-11-10 成都优易数据有限公司 R language-based machine learning modeling platform architecture design method
CN107423823A (en) * 2017-08-11 2017-12-01 成都优易数据有限公司 A kind of machine learning Modeling Platform architecture design method based on R language
CN107590263B (en) * 2017-09-22 2020-07-07 辽宁工程技术大学 Distributed big data classification method based on multivariate decision tree model
CN107590263A (en) * 2017-09-22 2018-01-16 辽宁工程技术大学 A kind of distributed big data sorting technique based on multi-variable decision tree-model
CN109753226A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Data processing system, method and electronic equipment
CN109886023B (en) * 2017-12-06 2023-05-09 株洲中车时代电气股份有限公司 Data processing method, device, equipment and computer readable storage medium
CN109886023A (en) * 2017-12-06 2019-06-14 株洲中车时代电气股份有限公司 A kind of data processing method, device, equipment and computer readable storage medium
CN108182053A (en) * 2017-12-08 2018-06-19 北京云星宇交通科技股份有限公司 A kind of system and method for developing operation big data business application program
CN108763583A (en) * 2018-06-11 2018-11-06 山东汇贸电子口岸有限公司 A kind of microblog hot topic extracting method and system based on keyword search
CN108984718A (en) * 2018-07-10 2018-12-11 四川汇源吉迅数码科技有限公司 A kind of digital content interactive system and exchange method based on big data technology
CN109218400A (en) * 2018-08-06 2019-01-15 深圳宇翊技术股份有限公司 A kind of PIS center subsystem realized based on virtualization and distributed structure/architecture
CN109583941A (en) * 2018-11-06 2019-04-05 汪浩 A kind of advertisement delivery system
CN109408045A (en) * 2018-11-08 2019-03-01 国久大数据有限公司 Government system integration method and device
CN110187869A (en) * 2019-05-14 2019-08-30 上海直真君智科技有限公司 Unified inter-operation system and method between a kind of big data isomery storage computation model


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140604