CN106383845A - Shared storage-based MPP database data redistribution system - Google Patents

Shared storage-based MPP database data redistribution system

Publication number
CN106383845A
Authority
CN
China
Prior art keywords
mpp
node
shared storage
database data
redistribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610777712.3A
Other languages
Chinese (zh)
Inventor
武新
崔维力
李春华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN NANKAI UNIVERSITY GENERAL DATA TECHNOLOGIES Co Ltd
Original Assignee
TIANJIN NANKAI UNIVERSITY GENERAL DATA TECHNOLOGIES Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN NANKAI UNIVERSITY GENERAL DATA TECHNOLOGIES Co Ltd
Priority to CN201610777712.3A
Publication of CN106383845A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/21Design, administration or maintenance of databases

Abstract

The invention provides a shared storage-based MPP database data redistribution system comprising a shared storage system, an MPP cluster management node, and MPP cluster distributed compute nodes. The system addresses the performance problem of data redistribution in existing MPP databases. By relying on a distributed storage system, an MPP database can redistribute data quickly when compute nodes are added, which avoids the performance problem and the business concurrency problem of data redistribution in existing MPP systems and leaves online business almost unaffected.

Description

A shared storage-based MPP database data redistribution system
Technical Field
The present invention relates to the field of databases, and more particularly to a shared storage-based MPP database data redistribution system.
Background
With the development of informatization across industries, data volumes keep growing, which poses great challenges for both computation and storage. Existing MPP database clusters also face new challenges: when storage and computing capacity can no longer meet demand, the cluster urgently needs to be extended with compute nodes or storage nodes. Simply adding compute and storage nodes creates a new problem: existing data must be redistributed to the new nodes to take advantage of the added performance and storage capacity. Under current mechanisms, however, it is difficult to redistribute existing large-scale data in a short time, and the redistribution conflicts with running business.
Summary of the Invention
The present invention addresses the above technical problem by providing a shared storage-based MPP database data redistribution system that redistributes data quickly and solves the redistribution performance and concurrency problems.
To solve the above technical problem, the present invention adopts the following technical solution: a shared storage-based MPP database data redistribution system comprising a shared storage system, an MPP cluster management node, and MPP cluster distributed compute nodes.
The MPP cluster management node is responsible for storage management and compute node management.
The MPP cluster distributed compute nodes are the nodes that accept instructions from the MPP cluster management node and perform distributed computation.
The shared storage system includes, but is not limited to, DAS, NAS, SAN, and file-system-based distributed or non-distributed storage systems.
The node management includes association management and hash bucket management.
The advantages and positive effects of the present invention are as follows. The shared storage-based MPP database data redistribution system solves the performance problem of data redistribution in existing MPP databases. By relying on a distributed storage system, an MPP database can redistribute data quickly when compute nodes are added, which avoids the performance problem and the business concurrency problem of data redistribution in existing MPP systems and leaves online business almost unaffected.
Brief Description of the Drawings
Fig. 1 is an example deployment diagram of a shared storage-based MPP database data redistribution system.
Detailed Description
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawing.
As shown in Fig. 1, a shared storage-based MPP database data redistribution system comprises a shared storage system, an MPP cluster management node, and MPP cluster distributed compute nodes.
The MPP cluster management node is responsible for storage management and compute node management.
The MPP cluster distributed compute nodes are the nodes that accept instructions from the MPP cluster management node and perform distributed computation.
The shared storage system includes, but is not limited to, DAS, NAS, SAN, and file-system-based distributed or non-distributed storage systems.
The node management includes association management and hash bucket management.
The shared storage-based MPP database data redistribution system operates according to the following steps:
1) The shared storage system provides storage and creates 65536 storage units, with storage unit IDs ranging from d0 to d65535. Storage units are attached to compute nodes by mounting.
2) The MPP cluster has N compute nodes with IDs n1 to nN. Each node is assigned 65536/N hash buckets; hash bucket IDs range from h0 to h65535. Each hash bucket corresponds to one storage unit.
3) On the MPP management node, the nodemap records the relationship between node IDs and hash buckets; each node corresponds to multiple hash buckets.
That is, n1: h0, hN, ...;
n2: h1, hN+1, ...;
nN: hN-1, ..., h65535
The storagemap records the relationship between hash buckets and storage IDs; each hash bucket corresponds to one storage unit. For example:
h0: d0
h1: d1
h2: d2
...
h65535: d65535
4) Each node mounts the corresponding storage units of the shared storage according to the nodemap, which gives the node access to the storage.
5) When compute nodes are added, say M nodes so that the node count becomes N+M, only the nodemap needs to be updated to reassign the hash buckets of each node. Nodes mount storage units according to the new nodemap. No data is physically moved, yet existing data is redistributed, and new data is stored according to the new nodemap.
6) If business operations are running while the nodemap is being updated, the old nodemap remains valid until those operations complete and is then released, so the impact on business is negligible.
7) When compute nodes are added, the above method avoids the performance problem and the business concurrency problem of data redistribution in MPP database systems.
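The mapping and update steps can be illustrated with a minimal Python sketch. It assumes the round-robin assignment shown in the nodemap listing (bucket h_i goes to node n_((i mod N)+1)). The names `build_nodemap`, `build_storagemap`, and `NodemapRegistry` are hypothetical and do not come from the patent. The registry shows one possible way to keep an old nodemap valid until running jobs finish (step 6); the patent does not say how this is actually done.

```python
from collections import defaultdict

NUM_BUCKETS = 65536  # number of hash buckets and storage units given in the text


def build_nodemap(num_nodes, num_buckets=NUM_BUCKETS):
    """Assign bucket h_i to node n_((i mod N) + 1), as in the nodemap listing."""
    nodemap = defaultdict(list)
    for bucket in range(num_buckets):
        nodemap[f"n{bucket % num_nodes + 1}"].append(f"h{bucket}")
    return dict(nodemap)


def build_storagemap(num_buckets=NUM_BUCKETS):
    """Map each hash bucket h_i to exactly one storage unit d_i."""
    return {f"h{i}": f"d{i}" for i in range(num_buckets)}


class NodemapRegistry:
    """Hypothetical holder of nodemap versions on the management node.

    A running job pins the version it started with. An old version is
    released only once no job still pins it (an assumed reading of step 6).
    """

    def __init__(self, nodemap):
        self.versions = {0: nodemap}
        self.pins = {0: 0}
        self.current = 0

    def begin_job(self):
        """Pin the current nodemap version for a job and return its number."""
        self.pins[self.current] += 1
        return self.current

    def end_job(self, version):
        """Unpin a version when its job completes, then drop unused versions."""
        self.pins[version] -= 1
        self._release_unused()

    def publish(self, nodemap):
        """Install a new nodemap; new jobs use it, old jobs keep their version."""
        self.current += 1
        self.versions[self.current] = nodemap
        self.pins[self.current] = 0
        self._release_unused()

    def _release_unused(self):
        # Collect first, then delete, so the dict is not mutated while iterating.
        stale = [v for v in self.versions
                 if v != self.current and self.pins[v] == 0]
        for v in stale:
            del self.versions[v]
            del self.pins[v]
```

In this sketch, expanding from N to N+M nodes amounts to calling `registry.publish(build_nodemap(N + M))`. The storagemap is unchanged, which mirrors the statement that no data is physically moved.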
A preferred embodiment of the shared storage-based MPP database data redistribution system comprises the following steps:
1) Assume the shared storage system uses a distributed file system. 65536 directories are created as storage units and exported to all compute nodes.
2) An MPP cluster exists with 32 compute nodes, whose node IDs are n1 to n32; each node has 2048 hash buckets.
3) On the MPP management node, the nodemap records the following:
n1: h0, h32, h64, ...
n2: h1, h33, h65, ...
...
n32: h31, h63, h95, ..., h65535
The storagemap records the following:
h0: d0
h1: d1
h2: d2
...
h65535: d65535
4) Each node mounts the corresponding storage units of the distributed file system according to the nodemap, thereby storing its data.
5) Assume the compute nodes are expanded to 64 nodes. The hash distribution is recalculated to obtain a new nodemap in which each node has 1024 hash buckets:
n1: h0, h64, h128, ...
n2: h1, h65, h129, ...
...
n64: h63, h127, h191, ...
6) Each node remounts the corresponding storage units of the distributed file system according to the new nodemap, thereby redistributing the data from 32 nodes to 64 nodes.
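For this embodiment, which starts with 32 compute nodes and expands to 64, the effect of the nodemap change on individual nodes can be checked with a short Python sketch. It assumes the round-robin assignment from the listings and the one-to-one storagemap (h_i to d_i), so bucket numbers double as storage unit numbers. The helpers `buckets_for` and `remount_plan` are hypothetical names, and the actual mount and unmount commands are omitted because the patent does not specify them.

```python
NUM_BUCKETS = 65536  # number of hash buckets and storage units given in the text


def buckets_for(node_index, num_nodes, num_buckets=NUM_BUCKETS):
    """Bucket numbers owned by node n<node_index> (1-based) under round-robin."""
    return set(range(node_index - 1, num_buckets, num_nodes))


def remount_plan(node_index, old_nodes, new_nodes, num_buckets=NUM_BUCKETS):
    """Return (units_to_unmount, units_to_mount) for one node after expansion.

    A node that did not exist before the expansion owns nothing beforehand.
    """
    if node_index <= old_nodes:
        old = buckets_for(node_index, old_nodes, num_buckets)
    else:
        old = set()
    new = buckets_for(node_index, new_nodes, num_buckets)
    return old - new, new - old


# Existing node n1 and newly added node n33 when going from 32 to 64 nodes.
unmount_n1, mount_n1 = remount_plan(1, 32, 64)
unmount_n33, mount_n33 = remount_plan(33, 32, 64)

print(len(buckets_for(1, 32)), len(buckets_for(1, 64)))  # 2048 1024
print(len(unmount_n1), len(mount_n1))                    # 1024 0
print(len(unmount_n33), len(mount_n33))                  # 0 1024
```

Under these assumptions, n1 keeps h0, h64, h128, ... and releases h32, h96, ..., which are exactly the units the new node n33 mounts. Only mount points change; the contents of the storage units stay where they are.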
The embodiments of the present invention have been described in detail above, but they are only preferred embodiments and shall not be regarded as limiting the scope of the invention. All equivalent changes and improvements made within the scope of the invention shall remain within the coverage of this patent.

Claims (3)

1. A shared storage-based MPP database data redistribution system, characterized in that it comprises a shared storage system, an MPP cluster management node, and MPP cluster distributed compute nodes, wherein
the MPP cluster management node is responsible for storage management and compute node management, and
the MPP cluster distributed compute nodes are the nodes that accept instructions from the MPP cluster management node and perform distributed computation.
2. The shared storage-based MPP database data redistribution system according to claim 1, characterized in that the shared storage system includes, but is not limited to, DAS, NAS, SAN, and file-system-based distributed or non-distributed storage systems.
3. The shared storage-based MPP database data redistribution system according to claim 1, characterized in that the node management includes association management and hash bucket management.
CN201610777712.3A 2016-08-31 2016-08-31 Shared storage-based MPP database data redistribution system Pending CN106383845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610777712.3A CN106383845A (en) 2016-08-31 2016-08-31 Shared storage-based MPP database data redistribution system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610777712.3A CN106383845A (en) 2016-08-31 2016-08-31 Shared storage-based MPP database data redistribution system

Publications (1)

Publication Number Publication Date
CN106383845A true CN106383845A (en) 2017-02-08

Family

ID=57938502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610777712.3A Pending CN106383845A (en) 2016-08-31 2016-08-31 Shared storage-based MPP database data redistribution system

Country Status (1)

Country Link
CN (1) CN106383845A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101311917A (en) * 2007-05-24 2008-11-26 中国科学院过程工程研究所 Particle model faced multi-tier direct-connection cluster paralleling computing system
CN101441616A (en) * 2008-11-24 2009-05-27 中国人民解放军信息工程大学 Rapid data exchange structure based on register document and management method thereof
CN103268261A (en) * 2012-02-24 2013-08-28 苏州蓝海彤翔系统科技有限公司 Hierarchical computing resource management method suitable for large-scale high-performance computer
CN105009110A (en) * 2012-11-30 2015-10-28 华为技术有限公司 Method for automated scaling of massive parallel processing (mpp) database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马双良: ""集群测控系统设计与关键性技术研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960060A (en) * 2017-04-10 2017-07-18 聚好看科技股份有限公司 The management method and device of a kind of data-base cluster
CN106960060B (en) * 2017-04-10 2020-07-31 聚好看科技股份有限公司 Database cluster management method and device
CN109901948A (en) * 2019-02-18 2019-06-18 国家计算机网络与信息安全管理中心 Shared-nothing database cluster strange land dual-active disaster tolerance system
CN110162574A (en) * 2019-05-27 2019-08-23 上海达梦数据库有限公司 Determination method, apparatus, server and the storage medium of fast resampling mode

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170208