CN106953910A - A method for separating Hadoop computation and storage - Google Patents
A method for separating Hadoop computation and storage
- Publication number: CN106953910A
- Application number: CN201710161929.6A
- Authority: CN (China)
- Prior art keywords: yarn, clusters, kubernetes, separation method, hadoop
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Abstract
The present invention provides a method for separating computation and storage in Hadoop, comprising: S1: deploying a Kubernetes cluster on the hosts; S2: formatting the data disks of the hosts and mounting them to a fixed directory on the system disk; S3: deploying the containerized HDFS NameNode and DataNode nodes from the command line; S4: writing a YARN deployment file and deploying the YARN cluster using the PetSet and ConfigMap features of the Kubernetes cluster; S5: testing the YARN cluster. The invention separates the big data storage components from the compute components and deploys the compute components in a Kubernetes environment. Administrators can deploy different compute components according to the resource requirements of the business scenario. When a big data task ends, the compute components can be deleted and the unused resources released, which improves resource utilization and saves cost.
Description
Technical field
The present invention relates to the field of Hadoop clusters, and in particular to a method for separating computation and storage in Hadoop.
Background
Hadoop clusters generally emphasize deploying storage and computation together, which reduces network pressure during data transfer and yields better performance. For this reason, building a Hadoop cluster usually requires planning resource requirements in advance so that the business's demand for storage, compute, and other resources can be met. Offline computation in a Hadoop cluster does not place high demands on execution speed, and as network bandwidth increases, the network is gradually ceasing to be the bottleneck of cluster performance, so a scheme that separates computation and storage can be considered.
Summary of the invention
To solve the above problem, the present invention provides a method for separating computation and storage in Hadoop.
The technical solution of the invention is a method for separating Hadoop computation and storage, comprising:
S1: Deploy a Kubernetes cluster on the hosts;
S2: Format the data disks of the hosts and mount them to a fixed directory on the system disk;
S3: Deploy the containerized HDFS NameNode and DataNode nodes from the command line;
S4: Write a YARN deployment file and deploy the YARN cluster using the PetSet and ConfigMap features of the Kubernetes cluster;
S5: Test the YARN cluster.
Further, the hosts run Ubuntu 14.04.
Further, the Kubernetes cluster uses a flannel network for cross-host communication between Pods.
Further, in step S4, the PetSet feature of the Kubernetes cluster assigns fixed domain names to the Pods that run the YARN components, and the ConfigMap feature of the Kubernetes cluster supplies the slave node information of the YARN components to the YARN cluster.
Further, the Kubernetes cluster uses Kubernetes 1.3.
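As an illustration of the fixed domain names described above, the following Python sketch builds the names that the Pods of a PetSet receive through a headless service, assuming the usual Kubernetes pattern `<pod>.<service>.<namespace>.svc.<cluster domain>`. The values `mr`, `bigdata`, and `iopk8s.com` are taken from the reference deployment file in the embodiment; the function name is hypothetical and not part of the patent.

```python
def petset_pod_hostnames(name, service, namespace, cluster_domain, replicas):
    """Return the stable DNS names of the Pods in a PetSet.

    Pods are named <name>-<ordinal>, and the headless service named
    <service> gives each Pod a DNS record under the cluster domain.
    """
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Values taken from the reference deployment file (3 replicas of "mr").
hostnames = petset_pod_hostnames("mr", "mr", "bigdata", "iopk8s.com", 3)

# The deployment file points RESOURCEMANAGE_HOSTNAME at the first Pod.
resource_manager = hostnames[0]

for host in hostnames:
    print(host)
```

Because the names depend only on the PetSet name and the ordinal, they can be written into configuration before the Pods exist, which is what makes a static slave list practical.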
The method provided by the invention is based on Kubernetes and Docker. It separates the big data storage components from the compute components and deploys the compute components in a Kubernetes environment. Administrators can deploy different compute components, such as Yarn+MapReduce or Yarn+Spark, according to the resource requirements of the business scenario. When a big data task needs to run, a Hadoop cluster is quickly created on Kubernetes to execute it. When the task ends, the compute components can be deleted and the unused resources released, which improves resource utilization and saves cost.
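The create, run, delete lifecycle described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the manifest file name `yarn.yaml`, the helper names, and the idea of wrapping the job in a function are all assumptions. It relies only on `kubectl create -f` and `kubectl delete -f`; depending on the Kubernetes version, removing every Pod of a PetSet may need additional cleanup steps.

```python
import subprocess


def kubectl_command(action, manifest_path):
    """Build a kubectl command that creates or deletes the resources in a manifest."""
    if action not in ("create", "delete"):
        raise ValueError(f"unsupported action: {action}")
    return ["kubectl", action, "-f", manifest_path]


def run_with_temporary_cluster(manifest_path, job, runner=subprocess.check_call):
    """Create the compute cluster, run the job, then always release the resources."""
    runner(kubectl_command("create", manifest_path))
    try:
        return job()
    finally:
        # Delete the compute components even if the job fails,
        # so the resources are returned to the Kubernetes cluster.
        runner(kubectl_command("delete", manifest_path))


# Dry run: record the commands instead of executing them.
recorded = []
result = run_with_temporary_cluster(
    "yarn.yaml",  # hypothetical manifest file name
    lambda: "job finished",
    runner=recorded.append,
)
print(recorded)
```

Passing `runner` explicitly keeps the sketch testable without a cluster; the default would call `kubectl` through `subprocess.check_call`.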
Brief description of the drawings
Fig. 1 is a flowchart of the method according to a specific embodiment of the invention.
Fig. 2 is a deployment diagram of the Hadoop components according to a specific embodiment of the invention.
Fig. 3 is an interaction diagram of the Hadoop components according to a specific embodiment of the invention.
Embodiment
The invention is described in detail below with reference to the accompanying drawings and a specific embodiment. The following embodiment illustrates the invention, and the invention is not limited to it.
As shown in Fig. 1, the method for separating Hadoop computation and storage provided by the invention comprises the following steps:
S1: Deploy a Kubernetes cluster on the hosts. The hosts run Ubuntu 14.04, and the Kubernetes cluster uses a flannel network for cross-host communication between Pods.
S2: Format the data disks of the hosts and mount them to a fixed directory on the system disk, for use by the HDFS DataNode nodes.
S3: Deploy the containerized HDFS NameNode and DataNode nodes from the command line. They provide storage to multiple YARN clusters.
S4: Write a YARN deployment file and deploy the YARN cluster using the PetSet and ConfigMap features of the Kubernetes cluster. The PetSet feature assigns fixed domain names to the Pods that run the YARN components, and the ConfigMap feature supplies the slave node information of the YARN components to the YARN cluster.
S5: Test the YARN cluster. Deployment is then complete, and the compute components and storage components of the Hadoop cluster are separated.
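Step S2 can be sketched as the commands an administrator might run on each host. The patent does not name a device, file system, or directory, so `/dev/sdb`, `ext4`, and `/data/hdfs` below are hypothetical placeholders, and the helper names are illustrative only. The sketch only builds the commands and an `/etc/fstab` line; it does not execute anything.

```python
def format_and_mount_commands(device, mount_point, fstype="ext4"):
    """Return the commands that format a data disk and mount it at a fixed directory."""
    return [
        ["mkfs", "-t", fstype, device],   # create the file system on the data disk
        ["mkdir", "-p", mount_point],     # make sure the fixed directory exists
        ["mount", device, mount_point],   # mount the disk at that directory
    ]


def fstab_entry(device, mount_point, fstype="ext4"):
    """Return an /etc/fstab line so the mount survives a reboot."""
    return f"{device} {mount_point} {fstype} defaults 0 2"


# Hypothetical device and directory, chosen only for illustration.
commands = format_and_mount_commands("/dev/sdb", "/data/hdfs")
for command in commands:
    print(" ".join(command))
print(fstab_entry("/dev/sdb", "/data/hdfs"))
```

Mounting at the same directory on every host lets the DataNode containers use one volume path regardless of which host they run on.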
A reference YARN deployment file for step S4 is as follows:
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  name: mr
  namespace: bigdata
  labels:
    app: mr
spec:
  ports:
  - port: 80
    name: mr
  # *.nginx.default.svc.cluster.local
  clusterIP: None
  selector:
    app: mr
---
apiVersion: apps/v1alpha1
kind: PetSet
metadata:
  name: mr
spec:
  serviceName: "mr"
  replicas: 3
  template:
    metadata:
      labels:
        app: mr
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: mr
        image: 10.110.17.138:5000/wangdk/bigdata:v0.6
        imagePullPolicy: Always
        command:
        - /usr/local/bin/start.sh
        securityContext:
          privileged: true
        env:
        - name: RESOURCEMANAGE_HOSTNAME
          value: mr-0.mr.bigdata.svc.iopk8s.com
        - name: NODE_ROLE
          value: yarn
        - name: NAMENODE_HOSTNAME
          value: master.iop.com
        - name: HDFSINFO
          valueFrom:
            configMapKeyRef:
              name: hdfsinfo
              key: hdfsinfo
        - name: SLAVES
          valueFrom:
            configMapKeyRef:
              name: hdfsinfo
              key: yarnslaves
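The deployment file reads two keys, `hdfsinfo` and `yarnslaves`, from a ConfigMap named `hdfsinfo`, but the ConfigMap itself is not listed. The Python sketch below shows one way such an object could be generated as JSON, which `kubectl create -f` also accepts. The value formats are assumptions: the patent does not say what the `hdfsinfo` value contains or how the slave list is delimited, so the HDFS URI and the newline-separated host list are placeholders.

```python
import json


def build_configmap(name, namespace, data):
    """Return a ConfigMap manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": dict(data),
    }


# Stable Pod names from the PetSet "mr" with 3 replicas.
slaves = [f"mr-{ordinal}.mr.bigdata.svc.iopk8s.com" for ordinal in range(3)]

configmap = build_configmap(
    "hdfsinfo",
    "bigdata",
    {
        # Hypothetical value: the real content of this key is not given.
        "hdfsinfo": "hdfs://master.iop.com",
        # Assumed format: one slave host name per line.
        "yarnslaves": "\n".join(slaves),
    },
)

manifest_json = json.dumps(configmap, indent=2)
print(manifest_json)
```

Keeping the slave list in a ConfigMap means the container image stays unchanged when the number of replicas changes; only the ConfigMap and the `replicas` field need updating.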
After deployment completes, the YARN cluster information can be viewed:
root@master:~# kubectl get pods -o wide |grep mr-
mr-0 1/1 Running 1d 172.17.18.4 slave3.iop.com
mr-1 1/1 Running 1d 172.17.17.18 master.iop.com
mr-2 1/1 Running 1d 172.17.60.11 slave1.iop.com
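For step S5, a simple first check is that every expected Pod reports `Running`. The Python sketch below parses output in the shape shown above; the function names are hypothetical. It takes the name, ready count, and status from the first three columns and the IP and node from the last two, so it does not depend on how many columns appear in between.

```python
def parse_pod_lines(text):
    """Parse whitespace-separated `kubectl get pods -o wide` rows into dictionaries."""
    pods = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        pods.append(
            {
                "name": fields[0],
                "ready": fields[1],
                "status": fields[2],
                "ip": fields[-2],
                "node": fields[-1],
            }
        )
    return pods


def all_running(pods, expected_count):
    """Return True if the expected number of Pods exist and all are Running."""
    return len(pods) == expected_count and all(
        pod["status"] == "Running" for pod in pods
    )


sample = """
mr-0 1/1 Running 1d 172.17.18.4 slave3.iop.com
mr-1 1/1 Running 1d 172.17.17.18 master.iop.com
mr-2 1/1 Running 1d 172.17.60.11 slave1.iop.com
"""

pods = parse_pod_lines(sample)
print(all_running(pods, expected_count=3))
```

A status check like this only confirms that the containers started; a fuller test would also submit a small job to the YARN cluster.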
This embodiment supports deploying multiple YARN clusters in the same physical environment, with resource isolation between the clusters provided by Docker. The deployment plan is shown in Figs. 2 and 3.
The above discloses only the preferred embodiment of the invention, but the invention is not limited to it. Any non-inventive variation that a person skilled in the art can conceive of, and any improvements and refinements made without departing from the principles of the invention, shall fall within the scope of protection of the invention.
Claims (5)
1. A method for separating Hadoop computation and storage, characterized by comprising:
S1: Deploy a Kubernetes cluster on the hosts;
S2: Format the data disks of the hosts and mount them to a fixed directory on the system disk;
S3: Deploy the containerized HDFS NameNode and DataNode nodes from the command line;
S4: Write a YARN deployment file and deploy the YARN cluster using the PetSet and ConfigMap features of the Kubernetes cluster;
S5: Test the YARN cluster.
2. The method for separating Hadoop computation and storage according to claim 1, characterized in that the hosts run Ubuntu 14.04.
3. The method for separating Hadoop computation and storage according to claim 1 or 2, characterized in that the Kubernetes cluster uses a flannel network for cross-host communication between Pods.
4. The method for separating Hadoop computation and storage according to claim 3, characterized in that, in step S4, the PetSet feature of the Kubernetes cluster assigns fixed domain names to the Pods that run the YARN components, and the ConfigMap feature of the Kubernetes cluster supplies the slave node information of the YARN components to the YARN cluster.
5. The method for separating Hadoop computation and storage according to claim 1, 2 or 4, characterized in that the Kubernetes cluster uses Kubernetes 1.3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710161929.6A CN106953910A (en) | 2017-03-17 | 2017-03-17 | A method for separating Hadoop computation and storage |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106953910A (en) | 2017-07-14 |
Family
ID=59472687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710161929.6A (Pending) CN106953910A (en) | A method for separating Hadoop computation and storage | 2017-03-17 | 2017-03-17 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106953910A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104462811A (en) * | 2014-12-05 | 2015-03-25 | 云中万维(北京)科技有限公司 | Network game data processing method |
Non-Patent Citations (2)
Title |
---|
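谢丽: "Kubernetes 1.3 released, with support for cross-cluster federated services and stateful services", 《HTTPS://WWW.INFOG/CN/ARTICLE/2016/08/KUBERNETES-1.3-RELEASED》 * |
谢丽: "Separating Hadoop's computation and storage can effectively improve performance", 《HTTPS://WWW.INFOG/CN/ARTICLE/2015/12/HADOOP-HDFS-DAS》 * |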
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107506282A (en) * | 2017-08-28 | 2017-12-22 | 郑州云海信息技术有限公司 | The system monitoring data capture method and system of container in a kind of docker clusters |
CN108039975A (en) * | 2017-12-21 | 2018-05-15 | 北京搜狐新媒体信息技术有限公司 | Container cluster management system and its application process |
CN108039975B (en) * | 2017-12-21 | 2020-08-28 | 北京搜狐新媒体信息技术有限公司 | Container cluster management system and application method thereof |
CN109324892A (en) * | 2018-07-24 | 2019-02-12 | 北京京东尚科信息技术有限公司 | Distribution management method, distributed management system and device |
CN109271233A (en) * | 2018-07-25 | 2019-01-25 | 上海数耕智能科技有限公司 | The implementation method of Hadoop cluster is set up based on Kubernetes |
CN109542791A (en) * | 2018-11-27 | 2019-03-29 | 长沙智擎信息技术有限公司 | A kind of program large-scale concurrent evaluating method based on container technique |
CN109542791B (en) * | 2018-11-27 | 2019-11-29 | 湖南智擎科技有限公司 | A kind of program large-scale concurrent evaluating method based on container technique |
CN111880934A (en) * | 2020-07-29 | 2020-11-03 | 北京浪潮数据技术有限公司 | Resource management method, device, equipment and readable storage medium |
CN112882726A (en) * | 2021-01-26 | 2021-06-01 | 西安建筑科技大学 | Hadoop and Docker-based deployment method of environment system |
CN112882726B (en) * | 2021-01-26 | 2022-11-15 | 西安建筑科技大学 | Hadoop and Docker-based deployment method of environment system |
CN113312165A (en) * | 2021-07-28 | 2021-08-27 | 浙江大华技术股份有限公司 | Task processing method and device |
CN113312165B (en) * | 2021-07-28 | 2021-11-16 | 浙江大华技术股份有限公司 | Task processing method and device |
CN116483473A (en) * | 2023-06-14 | 2023-07-25 | 浩鲸云计算科技股份有限公司 | Storage plug-in method and system based on GlusterFS |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20200518. Address after: Building S01, Inspur Science Park, No. 1036, Inspur Road, high tech Zone, Jinan City, Shandong Province, 250000. Applicant after: Tidal Cloud Information Technology Co.,Ltd. Address before: 450000 Henan province Zheng Dong New District of Zhengzhou City Xinyi Road No. 278 16 floor room 1601. Applicant before: ZHENGZHOU YUNHAI INFORMATION TECHNOLOGY Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170714 |