CN103034739A - Distributed memory system and updating and querying method thereof - Google Patents

Distributed memory system and updating and querying method thereof

Info

Publication number
CN103034739A
Authority
CN
China
Prior art keywords
node
data
distributed memory
memory system
safegroup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105941055A
Other languages
Chinese (zh)
Inventor
任景彪
孟祥斌
施宁
崔维力
武新
赵伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN NANDA GENERAL DATA TECHNOLOGY Co Ltd
Original Assignee
TIANJIN NANDA GENERAL DATA TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN NANDA GENERAL DATA TECHNOLOGY Co Ltd filed Critical TIANJIN NANDA GENERAL DATA TECHNOLOGY Co Ltd
Priority to CN2012105941055A priority Critical patent/CN103034739A/en
Publication of CN103034739A publication Critical patent/CN103034739A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a distributed storage system comprising a safe group (SafeGroup) made up of at least one node, where each node stores data and can also store copies of the data held by the other nodes in the same safe group. The invention also provides updating and querying methods for the distributed storage system. The system offers high availability, a high degree of query parallelism, and the capacity to manage and scale large data volumes, and it can effectively reduce management and maintenance costs.

Description

Distributed storage system and updating and querying method thereof
Technical field
The invention belongs to the field of data storage, and in particular relates to a distributed storage system with high availability.
Background
Faced with explosive growth in data volume, current database technology still has many technical problems that urgently need solving. For a database, recording correct results is only the most basic requirement; beyond that, a database must also improve processing speed, data availability, data security and the scalability of the data set. Big data places higher demands on all four of these aspects.
Physically, increasing redundancy is the only way to improve data availability. At present this is done mainly through hardware-level redundancy, communication-link redundancy, software redundancy and data redundancy. Data redundancy is the approach most commonly used to achieve high availability.
Traditional data redundancy schemes fall into the following categories, each with its own shortcomings:
1. Data replication. All current data replication techniques (synchronous or asynchronous), such as disk mirroring (the EMC TimeFinder series), database file replication (such as DoubleTake, Veritas and Legato) and the backup tools shipped by database vendors, can only produce a passively replicated data set. Replication typically consumes 5% (asynchronous) to 30% (synchronous) of the primary server's processing capacity. Passively updated data is generally used only for disaster recovery. A passively updated data set also has two fatal problems. First, once a fault on the primary server corrupts the data, the passively updated data set is corrupted as well. Second, this approach tends to lengthen the time the system runs at single-node risk, and it lowers system utilization.
2. Dual-machine hot standby with whole-machine backup. In this mode only one server is running at any given time; although the response speed of the system does not drop, system utilization falls by 50%.
3. Asynchronous active replication of the data set. In this technique, transactions are first processed to completion on the primary server and are then handed serially to the backup server, which performs the same operations to keep the data consistent. The data set produced this way lags the primary data set in time, so it is suitable only for disaster recovery, data mining, report statistics and limited online applications. All commercial databases support asynchronous active replication. The difficulty lies in managing the replication queue, which exists to absorb the speed difference between the primary server and the backup server. The primary server can exploit all available software and hardware parallelism to process as many concurrent transactions as possible, whereas the backup server can only replicate serially, so under heavy transaction load the replication queue may often overflow. Because there is no way to throttle the rate of transaction requests, under heavy load the replication queue can only be rebuilt periodically.
4. Synchronous active replication of the data set. This technique requires all concurrent transactions to complete simultaneously on all database servers. One direct benefit is that the queue management problem disappears; load balancing can also deliver higher performance and higher availability. There are two quite different ways to implement this technique: full serialization and dynamic serialization. Fully serialized transaction processing is driven by the transaction engine of the primary database. RAC, UDB, MSCS (SQL Server 2005) and ASE all implement full serialization in combination with the two-phase commit protocol, a design whose goal is to obtain a data set that can be used for fast disaster recovery. Such systems have two key problems. First, two-phase commit is an "all or nothing" protocol. A close look at the protocol shows that, in order to obtain this backup data set, the availability of transaction processing is cut in half. Second, full serialization reintroduces the speed mismatch between the primary and secondary database servers. Forced synchronization drags the speed of the whole system down to the fully serialized level.
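One way to read the claim about two-phase commit and availability is as simple probability arithmetic. The sketch below assumes two servers that fail independently and an illustrative per-server availability of 0.99; both the independence assumption and the figure are hypothetical and do not come from the patent. Under an all-or-nothing protocol a transaction needs every participant to be up, so the chance of being unable to commit roughly doubles.

```python
def all_or_nothing_availability(per_server, servers):
    """Probability that every participant is up, assuming independent failures."""
    return per_server ** servers


single = 0.99                                       # assumed availability of one server
paired = all_or_nothing_availability(single, 2)     # both servers must be up to commit

# Compare the unavailability of one server with that of the all-or-nothing pair.
print(round(1 - single, 4), round(1 - paired, 4))
```

With these assumed numbers the unavailability of the pair is about 1.99 times that of a single server.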
Summary of the invention
The problem to be solved by the invention is to provide a distributed storage system that is particularly suitable for highly available storage of large volumes of data.
To solve the above technical problem, the invention adopts the following technical solution: a distributed storage system comprising a safe group (SafeGroup) made up of at least one node, the node being used to store data.
Further, the node can store copies of the other nodes in the same safe group (SafeGroup).
Further, the copies are mirror backups.
Further, all nodes in the same safe group (SafeGroup) store identical shard data.
Further, each node selects, according to its own node number, the shard with the same number, and treats that shard's data as its own primary data.
Further, the data on each node carries its own system-wide global transaction number.
According to another aspect, the invention also provides a querying method for the distributed storage system, comprising:
A query request is sent to the system;
The initiating node generates a query plan from the query request and the state of the available nodes in the system;
The initiating node sends the query plan to each available node in the system;
Each available node independently processes the primary data held on that node;
Each available node returns its result to the initiating node, which aggregates the results.
According to another aspect, the invention also provides an updating method for the distributed storage system, comprising:
A distributed lock is acquired for the system;
A system-wide global transaction number is requested, and the update request is then sent to the nodes holding the primary shard and all of its mirror shards;
Each node updates its own shard and tags the newly added data with the global transaction number;
Finally, the distributed lock is released.
With the above technical solution, each node operates on its own data independently, so the processing capacity of every node can be fully used. The system therefore has high availability, a high degree of query parallelism, and the capacity to manage and scale large data volumes. Because the granularity of data replicas matches the natural granularity of the hardware deployment, management and maintenance costs are also effectively reduced.
Description of drawings
Fig. 1 is a schematic diagram of one embodiment of the distributed storage system of the invention.
Fig. 2 is a schematic diagram of the data stored on each node of a safe group (SafeGroup) in one embodiment of the invention.
Embodiment
Fig. 1 is a schematic diagram of one embodiment of the invention. As Fig. 1 shows, this embodiment comprises three safe groups, each containing three nodes. Each node has a certain transaction-processing capability, and the nodes are interconnected by a network or other means and can access one another over that connection. The safe groups (SafeGroup) are likewise interconnected and can access one another.
Fig. 2 is a schematic diagram of what is stored on the three nodes of one safe group (SafeGroup) in the embodiment. As Fig. 2 shows, besides its own data each node also stores mirror data of the other nodes in the same safe group, so the three nodes hold identical content.
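The layout in Fig. 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Node`, the function `build_safegroup` and the round-robin placement of rows are hypothetical and are not taken from the patent. What the sketch does follow from the text is that the data of a SafeGroup is divided into as many shards (data fragments) as the group has nodes, that every node stores all of the shards, and that a node treats the shard whose number equals its own node number as its primary data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a SafeGroup; it holds every shard of the group."""
    number: int                                   # node number inside its SafeGroup
    shards: dict = field(default_factory=dict)    # shard number -> list of rows

    def primary_shard(self):
        # The shard whose number equals the node number is this node's primary data.
        return self.shards[self.number]

    def mirror_shards(self):
        # Every other shard mirrors the primary data of another node in the group.
        return {n: rows for n, rows in self.shards.items() if n != self.number}


def build_safegroup(rows, node_count):
    """Split rows into node_count shards and store all shards on every node."""
    shards = {i: [] for i in range(node_count)}
    for index, row in enumerate(rows):
        shards[index % node_count].append(row)    # round-robin placement is an assumption
    return [
        Node(number=i, shards={n: list(r) for n, r in shards.items()})
        for i in range(node_count)
    ]


group = build_safegroup(list(range(9)), 3)
print(group[1].primary_shard())           # rows of shard 1
print(sorted(group[0].mirror_shards()))   # shard numbers node 0 keeps as mirrors
```

Because every node carries the full set of shards, any surviving node of the group can stand in for a failed peer without consulting a placement table.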
The metadata of this embodiment comprises the list of all available nodes in the system, the mapping table between each node and the safe group it belongs to, and the current maximum global transaction number. Compared with Hadoop HDFS (a distributed file system), data replicas in this embodiment have node granularity, whereas HDFS replicas have block granularity (a data block is typically 64 MB). For that reason, HDFS metadata must track the mapping between every data block, its replicas and the nodes holding them, and when HDFS manages a very large data set its metadata becomes correspondingly large. Managing metadata is therefore costly for HDFS: scaling out, scaling in and redistributing data all require migrating and modifying large amounts of metadata. In this embodiment the replica granularity matches the granularity of the physical deployment (node level), so no mapping between data and nodes needs to be stored and the metadata is very simple. When the system scales out, scales in or redistributes data, only the node list and the node-to-safe-group mapping table need to change, which greatly reduces the cost of metadata maintenance.
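A rough count makes the difference in metadata size concrete. The sketch below is illustrative arithmetic only: the 1 PB data volume, the replication factor of 3 and the nine-node cluster are assumed figures, and the function names are hypothetical. The 64 MB block size and the three metadata items (node list, node-to-group mapping, maximum global transaction number) are the ones named in the text.

```python
def block_level_entries(total_bytes, block_bytes, replicas):
    """Block-to-node mappings that a block-granular scheme has to track."""
    blocks = -(-total_bytes // block_bytes)    # ceiling division
    return blocks * replicas


def node_level_entries(node_count):
    """Node list + node-to-group mapping + one transaction counter."""
    return 2 * node_count + 1


PB = 1024 ** 5
MB = 1024 ** 2

print(block_level_entries(1 * PB, 64 * MB, 3))   # block-granular mappings
print(node_level_entries(9))                     # node-granular entries
```

Under these assumptions the block-granular scheme tracks tens of millions of mappings, while the node-granular scheme tracks a number of entries proportional to the node count.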
In addition, because its metadata is lightweight, the metadata management strategy of this embodiment suits either in-band or out-of-band deployment. For known schemes such as HDFS, the heavyweight metadata all but restricts deployment to out-of-band metadata management.
This embodiment synchronizes the distributed lock and the metadata by means of an open-source implementation of the Paxos protocol (TOTEM: a token-ring-based communication and distributed consensus protocol).
This embodiment can effectively reduce management and maintenance costs. Building a distributed query plan does not require fetching data locations from a metadata server, as Hadoop does; the same query plan is simply sent to all nodes in the safe group and each node processes its own part of the data, so the metadata server is not a bottleneck.
When updating data, this embodiment first acquires the distributed lock and obtains a global transaction number, and then sends the update request to the nodes holding the data shard to be updated and its mirror shards. The nodes execute the update in parallel and tag the updated data with the global transaction number. If the update fails on a node, that node is set to the "unavailable" state. A node in the "unavailable" state synchronizes its data in the background, and returns to the available state once its data matches that of the available nodes in the same group. When every node involved in the current update has either succeeded or been set to "unavailable", the update is complete and the distributed lock is released.
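The update sequence described here can be sketched as a single-process simulation. Several things in it are assumptions made for illustration: a local `threading.Lock` stands in for the distributed lock, the nodes are updated one after another rather than in parallel, and the names `Node`, `Cluster` and the `fail` switch are hypothetical. Background resynchronization of an unavailable node is not modelled.

```python
import threading


class Node:
    def __init__(self, name, fail=False):
        self.name = name
        self.available = True
        self.rows = []        # (global transaction number, value) pairs
        self._fail = fail     # hypothetical switch that simulates a failed update

    def apply(self, txn, value):
        if self._fail:
            raise IOError(f"{self.name}: update failed")
        self.rows.append((txn, value))


class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes
        self.lock = threading.Lock()   # stand-in for the distributed lock
        self.max_txn = 0               # current maximum global transaction number

    def update(self, value, targets):
        with self.lock:                        # acquire the lock; released on exit
            self.max_txn += 1                  # obtain a global transaction number
            txn = self.max_txn
            for node in targets:               # holders of the primary and mirror shards
                if not node.available:
                    continue
                try:
                    node.apply(txn, value)     # update and tag with the transaction number
                except IOError:
                    node.available = False     # a failed node becomes "unavailable"
            return txn


a, b, c = Node("a"), Node("b", fail=True), Node("c")
cluster = Cluster([a, b, c])
txn = cluster.update("x", [a, b, c])
print(txn, a.rows, b.available, c.rows)
```

The lock is released only after every targeted node has either applied the update or been marked unavailable, which mirrors the completion condition stated in the text.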
When querying, this embodiment first obtains the set of all available nodes from the system metadata and then sends the query request to those nodes. For a node in the "unavailable" state, queries against its primary shard data are redirected to other available nodes in the same group.
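A minimal sketch of this query path follows. It assumes that a node's position in its group is both its node number and the number of its primary shard, and it picks the stand-in for an unavailable node with a simple modulo rule; that rule and the names `plan_query`, `run_query` and `scan` are hypothetical choices, since the text only says the query is redirected to other available nodes of the same group.

```python
def plan_query(groups, available):
    """Map each available node to the (group, shard) pairs it should scan.

    groups: {group_name: [node_name, ...]}, where list position is the node number.
    available: set of node names currently marked available in the metadata.
    """
    plan = {}
    for group, members in groups.items():
        live = [n for n in members if n in available]
        if not live:
            raise RuntimeError(f"no available node in {group}")
        for shard, owner in enumerate(members):
            # Use the owner when it is up; otherwise redirect to a live peer of the group.
            target = owner if owner in available else live[shard % len(live)]
            plan.setdefault(target, []).append((group, shard))
    return plan


def run_query(plan, scan):
    """Each node scans its assigned shards; the initiating node sums the partial results."""
    partials = [
        scan(node, group, shard)
        for node, shards in plan.items()
        for group, shard in shards
    ]
    return sum(partials)


groups = {"g1": ["a", "b", "c"]}
plan = plan_query(groups, {"a", "c"})       # node "b" is unavailable
print(plan)
print(run_query(plan, lambda node, group, shard: shard + 1))
```

No placement lookup is needed to build the plan: the group membership list and the availability set are enough to decide which node scans which shard.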
This embodiment partitions data horizontally in advance when storing it as shards: a horizontal split is made for every 2 GB of data generated. Scaling out or in only requires moving these 2 GB shards until the final data distribution is approximately even; data is thereby redistributed with almost no change to the metadata. This embodiment can serve the relational model (horizontal partitioning does not break the field constraints of the relational model) but is not limited to it. Any data model with similar requirements on data distribution is suitable.
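Redistribution by moving whole shards can be sketched as a greedy rebalancing loop. This is an assumed strategy for illustration, not the procedure of the patent: the function `rebalance`, the idea of treating SafeGroups as the storage targets and the "largest to smallest" move rule are all hypothetical. The only property taken from the text is that whole pre-split shards are moved until the distribution is approximately even.

```python
def rebalance(assignment, new_targets):
    """Move whole shards until shard counts differ by at most one.

    assignment: {target_name: [shard_id, ...]}, modified in place.
    Returns the list of moves as (shard_id, source, destination).
    """
    for target in new_targets:
        assignment.setdefault(target, [])
    moves = []
    while True:
        src = max(assignment, key=lambda t: len(assignment[t]))
        dst = min(assignment, key=lambda t: len(assignment[t]))
        if len(assignment[src]) - len(assignment[dst]) <= 1:
            return moves
        shard = assignment[src].pop()      # move one whole shard, never part of one
        assignment[dst].append(shard)
        moves.append((shard, src, dst))


assignment = {"g1": [0, 1, 2, 3], "g2": [4, 5, 6, 7]}
moves = rebalance(assignment, ["g3"])      # scale out by one target
print(moves)
print({t: len(s) for t, s in assignment.items()})
```

Only two of the eight shards move in this example, which reflects the point that expansion copies a limited number of fixed-size shards instead of rewriting a placement map.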
As the above shows, the invention brings significant improvements in the following four respects:
1. Processing speed: increasing the number of nodes in the system increases the number of data replicas; the more replicas the system has in total, the less data each replica holds and the faster concurrent processing becomes.
2. Data availability: during an update, the updated data is available as soon as one node in the safe group succeeds. During a query, under normal conditions multiple nodes compute concurrently for maximum performance; when a node is unavailable, the computation on that node's primary shard is redirected to other nodes.
3. Data security: the same as that of a stand-alone system.
4. Scalability of the data set: when the system is expanded, data is moved by directly copying data files, with no need to maintain large amounts of metadata, which greatly reduces the operating cost of expansion.
One embodiment of the invention has been described in detail above, but it is only a preferred embodiment and must not be taken to limit the scope of the invention. All equivalent changes and improvements made within the scope of this application remain covered by this patent.

Claims (9)

1. A distributed storage system, characterized in that the storage system comprises a safe group (SafeGroup) made up of at least one node, the node being used to store data.
2. The distributed storage system according to claim 1, characterized in that the node can store copies of the other nodes in the same safe group (SafeGroup).
3. The distributed storage system according to claim 2, characterized in that the copies are mirror backups.
4. The distributed storage system according to claim 2, characterized in that the data on the node is divided into a number of shards equal to the number of nodes in the same safe group (SafeGroup).
5. The distributed storage system according to claim 4, characterized in that all nodes in the same safe group (SafeGroup) store identical shard data.
6. The distributed storage system according to claim 5, characterized in that each node selects, according to its own node number, the shard with the same number, and treats that shard's data as its own primary data.
7. The distributed storage system according to claim 6, characterized in that the data on each node carries its own system-wide global transaction number.
8. An updating method for the distributed storage system according to claim 7, comprising:
A distributed lock is acquired for the system;
A system-wide global transaction number is requested, and the update request is then sent to the nodes holding the primary shard and all of its mirror shards;
Each node updates its own shard and tags the newly added data with the global transaction number;
Finally, the distributed lock is released.
9. A querying method for the distributed storage system according to claim 6, comprising:
A query request is sent to the system;
The query request is delivered to one node in the system (the initiating node);
The initiating node generates a query plan from the query request and the state of the available nodes in the system;
The initiating node sends the query plan to each available node in the system;
Each available node independently processes the primary data held on that node;
Each available node returns its result to the initiating node, which aggregates the results.
CN2012105941055A 2012-12-29 2012-12-29 Distributed memory system and updating and querying method thereof Pending CN103034739A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105941055A CN103034739A (en) 2012-12-29 2012-12-29 Distributed memory system and updating and querying method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012105941055A CN103034739A (en) 2012-12-29 2012-12-29 Distributed memory system and updating and querying method thereof

Publications (1)

Publication Number Publication Date
CN103034739A true CN103034739A (en) 2013-04-10

Family

ID=48021633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105941055A Pending CN103034739A (en) 2012-12-29 2012-12-29 Distributed memory system and updating and querying method thereof

Country Status (1)

Country Link
CN (1) CN103034739A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101971168A (en) * 2008-04-17 2011-02-09 美国日本电气实验室公司 Dynamically quantifying and improving the reliability of distributed data storage systems
CN101616184A (en) * 2008-06-27 2009-12-30 阿尔卡特朗讯公司 Method of redundant data storage
US20110296104A1 (en) * 2009-02-17 2011-12-01 Kenji Noda Storage system
CN102307221A (en) * 2011-03-25 2012-01-04 国云科技股份有限公司 Cloud storage system and implementation method thereof

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103345502A (en) * 2013-07-01 2013-10-09 曙光信息产业(北京)有限公司 Transaction processing method and system of distributed type database
CN103345502B (en) * 2013-07-01 2017-04-26 曙光信息产业(北京)有限公司 Transaction processing method and system of distributed type database
CN103686224A (en) * 2013-12-26 2014-03-26 乐视网信息技术(北京)股份有限公司 Method and system for transcoding task obtaining on basis of distributed locks
CN105630626A (en) * 2014-11-03 2016-06-01 中兴通讯股份有限公司 Transaction backup processing method and device
CN105808612A (en) * 2014-12-31 2016-07-27 北京嘀嘀无限科技发展有限公司 Method and equipment used for migrating data of database
CN105808612B (en) * 2014-12-31 2019-08-27 北京嘀嘀无限科技发展有限公司 The method and apparatus of data for migrating data library
CN104639661A (en) * 2015-03-13 2015-05-20 华存数据信息技术有限公司 Distributed storage system and storing and reading method for files
CN105205160A (en) * 2015-09-29 2015-12-30 浙江宇视科技有限公司 Data write-in method and device
CN105426427A (en) * 2015-11-04 2016-03-23 国家计算机网络与信息安全管理中心 MPP database cluster replica realization method based on RAID 0 storage
CN107545005A (en) * 2016-06-28 2018-01-05 华为软件技术有限公司 A kind of data processing method and device
WO2018027466A1 (en) * 2016-08-08 2018-02-15 马岩 Method and system for storing big data in distributed system
CN106354828A (en) * 2016-08-31 2017-01-25 天津南大通用数据技术股份有限公司 Data fragmentation method and device for distributed database
CN106372160A (en) * 2016-08-31 2017-02-01 天津南大通用数据技术股份有限公司 Distributive database and management method
CN107395745A (en) * 2017-08-20 2017-11-24 长沙曙通信息科技有限公司 A kind of distributed memory system data disperse Realization of Storing
CN107633090A (en) * 2017-09-29 2018-01-26 郑州云海信息技术有限公司 A kind of method split based on distributed type file system client side lock
CN110928481A (en) * 2018-09-19 2020-03-27 中国银联股份有限公司 Distributed deep neural network and storage method of parameters thereof
CN109753243A (en) * 2018-12-26 2019-05-14 深圳市网心科技有限公司 Copy dispositions method, Cloud Server and storage medium
CN109815303A (en) * 2018-12-29 2019-05-28 哈尔滨工业大学(深圳) A kind of location-based mobile data storage system
CN110334823A (en) * 2019-06-17 2019-10-15 北京大米科技有限公司 Reserving method, device, electronic equipment and medium
CN114884961A (en) * 2022-04-21 2022-08-09 京东科技信息技术有限公司 Distributed lock handover method, apparatus, electronic device, and computer-readable medium
CN114884961B (en) * 2022-04-21 2024-04-16 京东科技信息技术有限公司 Distributed lock handover method, apparatus, electronic device, and computer readable medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB02 Change of applicant information

Address after: Haitai 300384 in Tianjin Binhai high tech Zone Huayuan Industrial Zone Development six road No. 6 Haitai green industry base J

Applicant after: Tianjin NanKai University General Data Technologies Co., Ltd.

Address before: Haitai 300384 in Tianjin Binhai high tech Zone Huayuan Industrial Zone Development six road No. 6 Haitai green industry base J

Applicant before: Tianjin Nanda General Data Technology Co., Ltd.

CB03 Change of inventor or designer information

Inventor after: Ren Jingbiao

Inventor after: Meng Xiangbin

Inventor after: Shi Ning

Inventor after: Cui Weili

Inventor after: Wu Xin

Inventor after: Zhao Wei

Inventor before: Ren Jingbiao

Inventor before: Meng Xiangbin

Inventor before: Shi Ning

Inventor before: Cui Weili

Inventor before: Wu Xin

Inventor before: Zhao Wei

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: TIANJIN NANDA GENERAL DATA TECHNOLOGY CO., LTD. TO: TIANJIN NANDA CONVENTIONAL DATA TECHNOLOGY CO., LTD.

RJ01 Rejection of invention patent application after publication

Application publication date: 20130410