CN108595473A - Big data application platform based on cloud computing - Google Patents
- Publication number
- CN108595473A (application number CN201810194531.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- layer
- cloud computing
- application platform
- platform based
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
- G06F8/22—Procedural
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a big data application platform based on cloud computing, comprising a data acquisition layer, a data analysis layer, a storage layer, a computation layer, an application layer, and a permission and resource management and control layer. Because the platform is built on cloud computing, its storage layer and computation layer can be extended online. The computation layer provides several different computing engines for data preprocessing, data analysis and data mining, so that during a job a user can select the computing engine best suited to the current job, or the one the user is most familiar with, which reduces the burden of the work and improves efficiency.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a big data application platform based on cloud computing.
Background
With the arrival of the era of massive data, every line of business requires more data storage space and stronger data processing capability. In data processing, a single computing engine cannot meet users' needs, and the expressive power of results computed in a single form is very limited. Restricting a job to one computing engine adds considerable workload to users' data processing. For example, in a job that requires team collaboration, different team members are often responsible for different tasks. Each member would prefer the computing engine best suited to the task at hand or the one he or she knows best, so providing only a single computing engine burdens both individual work and team collaboration.
Summary of the invention
To overcome the deficiencies of the prior art, the purpose of the present invention is to provide a big data application platform based on cloud computing that offers users a variety of different computing engines, allows the computation layer and the storage layer to be extended online, and accommodates users' different job requirements.
The purpose of the present invention is achieved by the following technical solution:
A big data application platform based on cloud computing, comprising:
A data acquisition layer, used to collect data;
A data analysis layer, used to extract the data collected by the data acquisition layer into the storage layer for storage;
A storage layer, used to store the various kinds of collected data for upper-layer applications to call;
A computation layer, used to provide a variety of different computing engines for data preprocessing, data analysis and data mining;
An application layer, which is the entry point for using data and provides application modules for calling, querying and managing data;
A permission and resource management and control layer, which spans the data acquisition layer, data analysis layer, storage layer, computation layer and application layer to manage user permissions and resources in a unified way.
Further, in the data acquisition layer, the collected data include real-time data, structured data and unstructured data.
Further, in the data acquisition layer, the structured data is collected through Oracle technology, the real-time data is collected online through Kafka message queue technology, and the unstructured data is collected through Flume technology.
Further, when data is extracted in the data analysis layer, the extraction process includes preliminary cleaning, conversion and calculation of the data, so that the data takes a format suitable for storage in the storage layer.
Further, the storage layer includes the distributed file storage system HDFS, the column-oriented distributed storage system HBase and the key-value storage system Redis.
Further, the computing engines provided by the computation layer include a parallel computing engine based on in-memory computation, a parallel computing engine for SQL analysis, and a parallel computing engine for running MapReduce tasks.
Further, the permission and resource management and control layer includes the centralized log management system Sentry, the centralized security management system Ranger and the resource management system Yarn.
Further, the application modules include a visualization module and a collaborative programming module; the visualization module visualizes the computation results of the computation layer, and the collaborative programming module enables team collaborative programming.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The big data application platform of the present invention is built on cloud computing, so its storage layer and computation layer can be extended online. The computation layer provides a variety of different computing engines for data preprocessing, data analysis and data mining, so that during a job a user can select the computing engine best suited to the current job, or the one the user is most familiar with, which reduces the burden of the work and improves efficiency.
Description of the drawings
Fig. 1 is a system architecture diagram of the big data application platform based on cloud computing according to a preferred embodiment of the present invention.
Detailed description
The present invention is further described below with reference to the drawing and specific embodiments. It should be noted that, provided they do not conflict, the embodiments or technical features described below may be combined in any way to form new embodiments.
Fig. 1 shows the system architecture of the big data application platform based on cloud computing according to a preferred embodiment of the present invention. The big data application platform comprises:
A data acquisition layer, used to collect data;
A data analysis layer, used to extract the data collected by the data acquisition layer into the storage layer for storage;
A storage layer, used to store the various kinds of collected data for upper-layer applications to call;
A computation layer, used to provide a variety of different computing engines for data preprocessing, data analysis and data mining;
An application layer, which is the entry point for using data and provides application modules for calling, querying and managing data;
A permission and resource management and control layer, which spans the data acquisition layer, data analysis layer, storage layer, computation layer and application layer to manage user permissions and resources in a unified way.
Fig. 1 shows a relatively reasonable way of building each layer. The big data application platform based on cloud computing of this embodiment is a platform whose computing resources can be extended online, which supports collaborative team programming, and which provides a variety of different computing engines.
The platform's computing resources are cloud computing resources, including the resources of the storage layer and the computation layer, so the platform can raise the computing and processing capacity of its computing engines and expand its storage capacity online.
Preferably, the data collected by the data acquisition layer include structured data, real-time data and unstructured data; specifically, the types of data include business data, historical data, log data and behavioral data. In the data acquisition layer, structured data is collected through Oracle technology, real-time data is collected online through Kafka message queue technology, and unstructured data is collected through Flume technology. In addition, Sqoop can be used to transfer data between relational databases and Hadoop. In effect, the data acquisition layer serves as the data source layer of the big data application platform.
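The assignment of collection technologies to kinds of data described above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the `Source` class, the `pick_collector` function and the kind labels are hypothetical names, and only the three technology names come from the text.

```python
from dataclasses import dataclass

# Technology names are taken from the text; the table layout and the
# kind labels ("structured", "real_time", "unstructured") are assumptions.
COLLECTORS = {
    "structured": "Oracle",
    "real_time": "Kafka",
    "unstructured": "Flume",
}


@dataclass(frozen=True)
class Source:
    name: str
    kind: str  # one of the keys of COLLECTORS


def pick_collector(source: Source) -> str:
    """Return the collection technology the text assigns to this kind of data."""
    try:
        return COLLECTORS[source.kind]
    except KeyError:
        raise ValueError(f"unknown data kind: {source.kind!r}") from None
```

A real deployment would call the client library of each technology at this point; the sketch stops at the routing decision.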
Preferably, when data is extracted in the data analysis layer, the extraction process includes preliminary cleaning, conversion and calculation of the data, so that the data takes a format suitable for storage in the storage layer. The data analysis layer includes Oozie, Informatica, Spark, MR (MapReduce) and the like, where Oozie is a workflow engine that assists in managing Hadoop jobs and Informatica is an ETL tool.
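The three extraction steps named above (cleaning, conversion, calculation) can be sketched as a small pipeline. The record fields `user` and `amount`, the derived field `amount_cents` and all function names are hypothetical; the text does not specify record formats or cleaning rules.

```python
from typing import Iterable, List, Optional


def clean(record: dict) -> Optional[dict]:
    """Preliminary cleaning: trim the user field and drop records without one."""
    user = str(record.get("user", "")).strip()
    if not user:
        return None
    return {**record, "user": user}


def convert(record: dict) -> dict:
    """Conversion: cast the amount field from text to a number."""
    return {**record, "amount": float(record["amount"])}


def compute(record: dict) -> dict:
    """Calculation: derive a field to be stored alongside the raw values."""
    return {**record, "amount_cents": round(record["amount"] * 100)}


def extract(records: Iterable[dict]) -> List[dict]:
    """Run clean -> convert -> compute and keep only records that survive cleaning."""
    out = []
    for raw in records:
        cleaned = clean(raw)
        if cleaned is None:
            continue
        out.append(compute(convert(cleaned)))
    return out
```

Keeping each step as a separate function mirrors the way the text lists them and lets a workflow engine schedule or replace steps independently.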
Preferably, the storage layer includes the distributed file storage system HDFS, the column-oriented distributed storage system HBase and the key-value storage system Redis, which store and manage different types of data. The storage layer also includes Kudu, a storage engine suited to fast analysis of rapidly changing data that balances real-time data updates with analysis speed.
Preferably, as shown in Fig. 1, the computation layer includes a variety of different computing engines, which broadly serve three main functions: data preprocessing, data analysis and data mining.
Among them, Spark is an open-source, in-memory big data cluster computing framework implemented in Scala that provides interfaces for Java, Scala, Python and R; Python can be used for data mining;
Hawq is a Hadoop-native massively parallel SQL analysis engine aimed at analytical applications. Like other relational databases, it accepts SQL and returns result sets, but it has many features and functions that traditional MPP databases and other databases lack;
Hive is a data warehouse tool based on Hadoop that can map structured data files to database tables and provides full SQL query functionality by converting SQL statements into MapReduce tasks for execution. Its advantage is a low learning cost: simple MapReduce statistics can be implemented quickly with SQL-like statements, without developing a dedicated MapReduce application, which makes it well suited to statistical analysis in a data warehouse.
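To make the relationship between a SQL-like statement and a MapReduce task concrete, the sketch below models in plain Python what a grouped count such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` amounts to as map, shuffle and reduce phases. The query, the `dept` field and the function names are hypothetical, and this is a conceptual model only, not the plan Hive actually generates.

```python
from collections import defaultdict


def map_phase(rows):
    # Emit one (key, 1) pair per row, keyed by the GROUP BY column.
    for row in rows:
        yield row["dept"], 1


def shuffle(pairs):
    # Collect all values that share a key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Sum the values for each key, which yields COUNT(*) per group.
    return {key: sum(values) for key, values in groups.items()}


def count_by_dept(rows):
    return reduce_phase(shuffle(map_phase(rows)))
```

Writing the three phases by hand for every statistic is the cost that a SQL-like layer removes.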
Preferably, the permission and resource management and control layer includes the centralized log management system Sentry, the centralized security management system Ranger and the resource management system Yarn, which manage user permissions and resources in a centralized, unified way and allocate computing resources. When the platform creates an account, it assigns that account a default allocation of computing resources (including CPU, memory and storage).
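The default allocation at account creation can be sketched as follows. The `Quota` and `ResourceManager` classes and the numeric defaults are hypothetical; the text names the resource kinds (CPU, memory, storage) but gives no concrete values and does not describe an interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    cpu_cores: int
    memory_gb: int
    storage_gb: int


# Hypothetical default; the text does not give concrete numbers.
DEFAULT_QUOTA = Quota(cpu_cores=2, memory_gb=4, storage_gb=50)


class ResourceManager:
    """Tracks the computing resources assigned to each account."""

    def __init__(self, default: Quota = DEFAULT_QUOTA) -> None:
        self._default = default
        self._quotas: dict = {}

    def create_account(self, user: str) -> Quota:
        # A new account receives the default allocation at creation time.
        if user in self._quotas:
            raise ValueError(f"account already exists: {user}")
        self._quotas[user] = self._default
        return self._default

    def resize(self, user: str, quota: Quota) -> None:
        # Change an existing account's allocation without recreating it.
        if user not in self._quotas:
            raise KeyError(user)
        self._quotas[user] = quota

    def quota_of(self, user: str) -> Quota:
        return self._quotas[user]
```

The `resize` method is included only to suggest how an allocation might later be raised while the account stays in use.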
Preferably, the application modules of the application layer include a visualization module and a collaborative programming module; the visualization module visualizes the computation results of the computation layer, and the collaborative programming module enables team collaborative programming. Specifically, HUE and Zeppelin can be deployed in the application layer. HUE and Zeppelin provide interactive editing, which enables team collaborative programming. Zeppelin is a web-based notebook that provides interactive data analysis.
In this embodiment, because of the way the application layer and the computation layer are built, different paragraphs of the same note can be written for different computing engines. When a note is executed, its paragraphs run in order, and the system automatically calls the corresponding computing engine according to the engine declaration (interpreter binding) in each paragraph.
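The per-paragraph dispatch described above can be sketched as below. The sketch assumes that each paragraph begins with a line of the form `%name` declaring its engine, in the style of Zeppelin notes, and it stands in for real engines with plain Python callables. The names `parse_paragraph`, `run_note` and `Engine` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# An engine is modeled as a function from paragraph body to result text.
Engine = Callable[[str], str]


def parse_paragraph(text: str) -> Tuple[str, str]:
    """Split a paragraph into (engine name, body) using its leading %name line."""
    first, _, rest = text.lstrip().partition("\n")
    if not first.startswith("%"):
        raise ValueError("paragraph has no engine declaration")
    return first[1:].strip(), rest


def run_note(paragraphs: List[str], engines: Dict[str, Engine]) -> List[str]:
    """Run paragraphs in order, sending each one to the engine it declares."""
    results = []
    for text in paragraphs:
        name, body = parse_paragraph(text)
        if name not in engines:
            raise KeyError(f"no engine registered for %{name}")
        results.append(engines[name](body))
    return results
```

With two stub engines registered under `spark` and `hive`, a note whose first paragraph starts with `%spark` and whose second starts with `%hive` is sent to each stub in turn, which is the behavior the embodiment attributes to the platform.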
In terms of team collaboration, with the big data application platform based on cloud computing of this embodiment, the author of a note can share permissions on the note with other users, and different users can edit and run the note together, achieving collaborative work. In the process, different users can select a computing engine suited to the current type of job, or one they are familiar with, which reduces the burden of individual work and team collaboration and improves efficiency. Through HUE and Zeppelin, the execution results of a note can be visualized, giving users a more intuitive understanding of the results.
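The sharing of note permissions can be sketched with a small access model. The `Note` class and the right names `edit`, `run` and `share` are hypothetical; the text says only that the author can share permissions so that other users can edit and run the note.

```python
from typing import Dict, Iterable, Set


class Note:
    """A note whose author can grant rights on it to other users."""

    def __init__(self, author: str) -> None:
        self.author = author
        # The author starts with every right, including the right to share.
        self._grants: Dict[str, Set[str]] = {author: {"edit", "run", "share"}}

    def share(self, granter: str, user: str, rights: Iterable[str]) -> None:
        # Only a user who holds the "share" right may grant rights to others.
        if "share" not in self._grants.get(granter, set()):
            raise PermissionError(f"{granter} may not share this note")
        self._grants.setdefault(user, set()).update(rights)

    def can(self, user: str, right: str) -> bool:
        return right in self._grants.get(user, set())
```

In this model a collaborator granted `edit` and `run` can work on the note with the author but cannot pass access on to a third user unless `share` is granted as well.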
The above embodiment is only a preferred embodiment of the present invention and does not limit its scope of protection; any insubstantial variation or substitution made by those skilled in the art on the basis of the present invention falls within the scope claimed by the present invention.
Claims (8)
1. A big data application platform based on cloud computing, characterized by comprising:
A data acquisition layer, used to collect data;
A data analysis layer, used to extract the data collected by the data acquisition layer into the storage layer for storage;
A storage layer, used to store the various kinds of collected data for upper-layer applications to call;
A computation layer, used to provide a variety of different computing engines for data preprocessing, data analysis and data mining;
An application layer, which is the entry point for using data and provides application modules for calling, querying and managing data;
A permission and resource management and control layer, which spans the data acquisition layer, data analysis layer, storage layer, computation layer and application layer to manage user permissions and resources in a unified way.
2. The big data application platform based on cloud computing according to claim 1, characterized in that, in the data acquisition layer, the collected data include real-time data, structured data and unstructured data.
3. The big data application platform based on cloud computing according to claim 2, characterized in that, in the data acquisition layer, the structured data is collected through Oracle technology, the real-time data is collected online through Kafka message queue technology, and the unstructured data is collected through Flume technology.
4. The big data application platform based on cloud computing according to claim 1, characterized in that, when data is extracted in the data analysis layer, the extraction process includes preliminary cleaning, conversion and calculation of the data, so that the data takes a format suitable for storage in the storage layer.
5. The big data application platform based on cloud computing according to claim 1, characterized in that the storage layer includes the distributed file storage system HDFS, the column-oriented distributed storage system HBase and the key-value storage system Redis.
6. The big data application platform based on cloud computing according to claim 1, characterized in that the computing engines provided by the computation layer include a parallel computing engine based on in-memory computation, a parallel computing engine for SQL analysis, and a parallel computing engine for running MapReduce tasks.
7. The big data application platform based on cloud computing according to claim 1, characterized in that the permission and resource management and control layer includes the centralized log management system Sentry, the centralized security management system Ranger and the resource management system Yarn.
8. The big data application platform based on cloud computing according to any one of claims 1-7, characterized in that the application modules include a visualization module and a collaborative programming module; the visualization module visualizes the computation results of the computation layer, and the collaborative programming module enables team collaborative programming.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810194531.7A CN108595473A (en) | 2018-03-09 | 2018-03-09 | A kind of big data application platform based on cloud computing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810194531.7A CN108595473A (en) | 2018-03-09 | 2018-03-09 | A kind of big data application platform based on cloud computing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108595473A true CN108595473A (en) | 2018-09-28 |
Family
ID=63626065
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810194531.7A Pending CN108595473A (en) | 2018-03-09 | 2018-03-09 | A kind of big data application platform based on cloud computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108595473A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1420488A (en) * | 2001-08-07 | 2003-05-28 | 陈涛 | Vedio tape picture and text data generating and coding method and picture and text data playback device |
US20180069888A1 (en) * | 2015-08-31 | 2018-03-08 | Splunk Inc. | Identity resolution in data intake of a distributed data processing system |
CN105608144A (en) * | 2015-12-17 | 2016-05-25 | 山东鲁能软件技术有限公司 | Big data analysis platform device and method based on multilayer model iteration |
CN106815338A (en) * | 2016-12-25 | 2017-06-09 | 北京中海投资管理有限公司 | A kind of real-time storage of big data, treatment and inquiry system |
CN107515927A (en) * | 2017-08-24 | 2017-12-26 | 深圳市云房网络科技有限公司 | A kind of real estate user behavioural analysis platform |
CN107577805A (en) * | 2017-09-26 | 2018-01-12 | 华南理工大学 | A kind of business service system towards the analysis of daily record big data |
Non-Patent Citations (3)
Title |
---|
MIAO君: ""一文读懂大数据平台—写给大数据开发初学者的话!"", 《HTTPS://ZHUANLAN.ZHIHU.COM/P/26545566》 * |
哥不是小萝莉: ""Hadoop生态系统"", 《HTTPS://WWW.CNBLOGS.COM/SMARTLOLI/P/5640587.HTML》 * |
罗树兰: ""基于Hadoop数据处理研究及应用"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109309686A (en) * | 2018-11-01 | 2019-02-05 | 浪潮软件集团有限公司 | Multi-tenant management method and device |
CN109471907A (en) * | 2018-11-15 | 2019-03-15 | 刘长山 | A kind of driving law-analysing system and method based on bayonet data |
CN109471907B (en) * | 2018-11-15 | 2022-04-29 | 刘长山 | Traffic law analysis system and method based on checkpoint data |
CN109739663A (en) * | 2018-12-29 | 2019-05-10 | 深圳前海微众银行股份有限公司 | Job processing method, device, equipment and computer readable storage medium |
CN109740765A (en) * | 2019-01-31 | 2019-05-10 | 成都品果科技有限公司 | A kind of machine learning system building method based on Amazon server |
CN109740765B (en) * | 2019-01-31 | 2023-05-02 | 成都品果科技有限公司 | Machine learning system building method based on Amazon network server |
CN110515603A (en) * | 2019-07-09 | 2019-11-29 | 成都品果科技有限公司 | A method of deployment Spark application |
WO2021047506A1 (en) * | 2019-09-11 | 2021-03-18 | 中兴通讯股份有限公司 | System and method for statistical analysis of data, and computer-readable storage medium |
CN111721355A (en) * | 2020-05-14 | 2020-09-29 | 中铁第一勘察设计院集团有限公司 | Railway contact net monitoring data acquisition system |
CN113347170A (en) * | 2021-05-27 | 2021-09-03 | 北京计算机技术及应用研究所 | Intelligent analysis platform design method based on big data framework |
CN113377877A (en) * | 2021-08-10 | 2021-09-10 | 深圳市爱云信息科技有限公司 | Multi-engine big data platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108595473A (en) | A kind of big data application platform based on cloud computing | |
CN104820670B (en) | A kind of acquisition of power information big data and storage method | |
CN105045820B (en) | Method for processing video image information of high-level data and database system | |
CN104899199B (en) | A kind of data warehouse data processing method and system | |
CN104331435B (en) | A kind of efficient mass data abstracting method of low influence based on Hadoop big data platforms | |
CN104346143B (en) | A kind of data transfer device by EBOM to MBOM | |
CN107945086A (en) | A kind of big data resource management system applied to smart city | |
CN107247799A (en) | Data processing method, system and its modeling method of compatible a variety of big data storages | |
CN106407278A (en) | Architecture design system of big data platform | |
CN104573071A (en) | Intelligent school situation analysis system and method based on megadata technology | |
CN107341205A (en) | A kind of intelligent distribution system based on big data platform | |
CN103399887A (en) | Query and statistical analysis system for mass logs | |
CN107545014A (en) | Stream calculation instant disposal system for treating based on Storm | |
CN101799808A (en) | Data processing method and system thereof | |
CN106951475A (en) | Big data distributed approach and system based on cloud computing | |
CN103699676B (en) | MSSQL SERVER based table partition and automatic maintenance method and system | |
CN106951552A (en) | A kind of user behavior data processing method based on Hadoop | |
CN104899314A (en) | Pedigree analysis method and device of data warehouse | |
CN104361091A (en) | Big data system | |
CN106202566A (en) | A kind of magnanimity electricity consumption data mixing based on big data storage system and method | |
CN107733696A (en) | A kind of machine learning and artificial intelligence application all-in-one dispositions method | |
CN105956932A (en) | Distribution and utilization data fusion method and system | |
CN107784039A (en) | A kind of data load method, apparatus and system | |
CN112948353B (en) | Data analysis method, system and storage medium applied to DAstudio | |
Zhang et al. | A 2-tier clustering algorithm with map-reduce |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180928 |