CN102693312B - Flexible transaction management method in key-value store data storage - Google Patents


Publication number
CN102693312B
Authority
CN
China
Prior art keywords
data
key-value
log
coordination module
store
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210169301.8A
Other languages
Chinese (zh)
Other versions
CN102693312A (en)
Inventor
王建民
丁贵广
朱妤晴
衣国垒
杨义繁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201210169301.8A
Publication of CN102693312A
Application granted
Publication of CN102693312B
Active
Anticipated expiration legal-status Critical


Abstract

The invention belongs to the technical field of computer database management and in particular relates to a flexible transaction management method in key-value store data storage. The method comprises the following steps. When data is written, a coordination module packages the request into a log entry and obtains the current latest log position; the Paxos algorithm is used to write the log entry at the new log position; the position is recorded and a success message is returned; and the data and the log position are written to the data storage. When data is read, the coordination module obtains the latest log position and checks whether the stored data is up to date; if it is, the data is read and returned to the user; if it is not, the log is read, the data is corrected accordingly, and the corrected data is read and returned to the user. The method improves the concurrency, fault tolerance and scalability of key-value store data storage and, provided system consistency is guaranteed, can narrow the scope of a transaction as far as possible, which improves the concurrency of the system. The flexible transaction design does much to improve the flexibility and adaptability of database transactions.

Description

A flexible transaction management method in key-value store data storage
Technical field
The present invention relates to a flexible transaction management method in key-value store data storage, and belongs to the technical field of computer database management.
Background art
In recent years, online interactive service platforms such as social networks and e-mail have developed rapidly. Large user communities and massive amounts of user-generated content (User-Generated Content) have created a demand for system platforms that are highly scalable and highly concurrent. Moreover, because Internet applications require online services to run continuously, system platforms must provide services with high availability and fault tolerance. The concept of "cloud computing" emerged in response to this trend.
In data management, these applications require data management systems to be highly scalable and highly available. Although mature traditional relational databases are widely used by network services, they struggle to guarantee high scalability and high availability. With the introduction of the "cloud computing" concept, a representative class of cloud storage system emerged in the storage field: the key-value store (also called a NoSQL DB). Key-value stores abandon the relational model of traditional databases and adopt a simple data model based on key-value pairs, sacrificing some database features, such as transactional access, in order to improve the scalability and fault tolerance of the data storage system. Such key-value store systems are widely used on the Internet in practice, for example Bigtable for Google services, SimpleDB for Amazon services, PNUTS for Yahoo services, and Cassandra for Facebook and Twitter services.
Cassandra is a typical representative of key-value store data storage. Compared with relational databases, its advantages are a simple data model, high scalability, availability and fault tolerance, and an easy-to-use application development interface. Cassandra is a storage system based on a peer-to-peer (P2P) architecture, in which every computer (a "node" of the storage system, likewise below) has equal status and is responsible for storing and backing up its own portion of the data; no single node controls resource allocation for the whole system. A peer-to-peer architecture helps improve the concurrency, fault tolerance and scalability of the system. Because all nodes have the same function and each can respond to data requests from outside the system, all nodes can respond to external data requests concurrently, which raises the data throughput of the system and provides good concurrency. Fault tolerance is improved in the following sense: when some nodes fail and cannot work normally, the data is still backed up on other nodes, and nodes with identical functions can easily substitute for one another, so normally working nodes can respond to external requests in place of the failed nodes and the system continues to respond normally. Because all nodes have the same function, the flat structure of the system does not change when an existing node leaves the system or a new node joins it; only the data in the system needs to be redistributed, so the system has good scalability.
While the P2P architecture gives Cassandra high concurrency, fault tolerance and scalability, it also brings some shortcomings. The most important is that Cassandra can only achieve eventual consistency of data and does not support database transactions. Data consistency requires that different users accessing the same data in a database system at the same time obtain identical (that is, consistent) data content. In a distributed environment, data usually needs multiple backups to prevent the failure of a single node from causing the system to lose data, but this also raises the problem of how to keep the data consistent across those backups. Ideal full consistency requires that all backups of the data hold identical content at every moment. The eventual consistency that Cassandra implements means that the system cannot guarantee that the data in different backups is consistent at every moment; it can only guarantee that the data is consistent once the system has reached a steady state after a sufficiently long time. This is a weak form of consistency: while the system is running, different users accessing the same data at the same time may obtain different results. A database transaction is a sequence of operations executed in a database as a single logical unit of work; these operations include inserting, updating and deleting data. The transaction mechanism guarantees that either all operations in a transaction unit complete successfully or none of them is carried out. Traditional databases require transactions to have atomicity (the operations either all succeed or all fail), consistency (executing a transaction must move the database from one consistent state to another), isolation (the operations of different transactions do not affect one another) and durability (the changes made by a successfully executed transaction are not lost). These four properties are known as the ACID properties. Cassandra can achieve only eventual consistency and therefore cannot meet the transaction requirements of a database. This inconveniences upper-layer application developers, because they must resolve, at the application layer, the inconsistency caused by concurrent access from multiple users.
Summary of the invention
The object of the present invention is to propose a flexible transaction management method in key-value store data storage, so that the range of data managed by a transaction can be specified and changed by the user, making it easier for users to keep data consistent when using the database.
The flexible transaction management method in key-value store data storage proposed by the present invention comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data to be written carries a key-value store row key. The coordination module packages the user's data and the write operation into a log entry;
(2) The coordination module obtains the current latest log position from the version control module and adds 1 to it, which gives the write position for the log entry of step (1); the log position is recorded as a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the N log storage nodes in the key-value store that will hold the log entry of step (1), where N is greater than or equal to 3;
(4) The coordination module uses the Paxos consensus algorithm to store the log entry of step (1) in the N log storage nodes computed in step (3);
(5) The coordination module writes the write position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the M data storage nodes in the key-value store that will hold the data of step (1), where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1) and the write position of step (2) to the M data storage nodes;
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key, in the key-value store, of the data to be read;
(9) The coordination module obtains from the version control module the latest log position P1 corresponding to the row key of step (8);
(10) Based on the key-value store's data replication rule, the coordination module computes the U data storage nodes that hold the row key of step (8), and obtains from one data storage node S among them the log position P2 corresponding to the row key of step (8);
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from the data storage node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, based on the key-value store's data replication rule, the V log storage nodes that hold the row key of step (8), and obtains from one log storage node T among them the log entries corresponding to the row key of step (8). The coordination module then updates the data in data storage node S according to the log content obtained from T, reads from S the data corresponding to the row key of step (8), and returns it to the requesting user.
The flexible transaction management method in key-value store data storage proposed by the present invention has the following advantages:
1. The method adopts the peer-to-peer (P2P) architecture of the prior art, which improves the concurrency, fault tolerance and scalability of key-value store data storage.
2. The method adopts flexible transaction management and supports dynamic adjustment of the data unit that has transactional properties. A transaction can be confined to a single row, guaranteeing that reads and writes of multiple columns in that row are atomic and that the data stays consistent under concurrency; or it can span multiple rows on the basis of an entity group, guaranteeing the ACID properties of data updates within that entity group. This flexibility in the range of data covered by a transaction is a great convenience to database users: they can set the size of an entity group according to the needs of the application and, provided system consistency is guaranteed, narrow the scope of a transaction as far as possible to improve the concurrency of the system. The flexible transaction design does much to improve the flexibility and adaptability of database transactions, and is a major advantage of the present invention.
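The adjustable transaction scope described above can be pictured with a small sketch. It assumes that each row key is mapped either to its own single-row transaction unit or to a user-defined entity group, and that updates to all rows of a unit are ordered by that unit's log. The class name `TransactionScopes`, its methods and the `row:` / `group:` unit labels are hypothetical names introduced only for illustration; the patent does not specify an interface.

```python
from typing import Dict, Optional


class TransactionScopes:
    """Hypothetical mapping from row keys to transaction units.

    A transaction unit is either a single row or a user-defined entity
    group spanning several rows.
    """

    def __init__(self) -> None:
        self._group_of: Dict[str, str] = {}  # row key -> entity group name

    def assign(self, row_key: str, group: Optional[str]) -> None:
        """Put a row into an entity group, or back into its own single-row unit."""
        if group is None:
            self._group_of.pop(row_key, None)
        else:
            self._group_of[row_key] = group

    def unit_for(self, row_key: str) -> str:
        """Return the transaction unit whose log orders updates to this row."""
        group = self._group_of.get(row_key)
        return f"group:{group}" if group is not None else f"row:{row_key}"


# Two order rows share one entity group; an unassigned row stays on its own.
scopes = TransactionScopes()
scopes.assign("order:1", "customer:42")
scopes.assign("order:2", "customer:42")
```

Rows in the same group resolve to the same unit and would therefore be serialized through one log, while unrelated rows keep separate units and can be updated concurrently. Narrowing a group amounts to calling `assign(row_key, None)`.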
Brief description of the drawings
Fig. 1 is a schematic diagram of the module calls in the method of the present invention.
Detailed description of the embodiments
The module calls of the flexible transaction management method in key-value store data storage proposed by the present invention are shown schematically in Fig. 1. The method comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data to be written carries a key-value store row key. The coordination module packages the user's data and the write operation into a log entry;
(2) The coordination module obtains the current latest log position from the version control module and adds 1 to it, which gives the write position for the log entry of step (1); the log position is recorded as a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the N log storage nodes in the key-value store that will hold the log entry of step (1), where N is greater than or equal to 3. There are two main data replication rules, and the user can specify which to use: one assigns data randomly; the other orders data by row key, with each replica in the key-value store responsible for storing the data in a certain key range. For details of the data replication rule, see the Cassandra configuration documentation;
(4) The coordination module uses the Paxos consensus algorithm to store the log entry of step (1) in the N log storage nodes computed in step (3). Paxos is an algorithm by which multiple processes reach agreement in an unreliable network environment; for a description of the algorithm, see Lamport L, Malkhi D, Zhou L, Vertical Paxos and primary backup replication, MSR-TR-2009-63 [R], Microsoft Research, 2009;
(5) The coordination module writes the write position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the M data storage nodes in the key-value store that will hold the data of step (1), where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1) and the write position of step (2) to the M data storage nodes;
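Write steps (1) to (7) can be sketched in memory as follows. This is a minimal illustration under several assumptions, not the patented implementation: the `Cluster` class, the `place` function and the hash-based placement are hypothetical; step (4) is replaced by plain replication to every chosen node rather than a real Paxos round; and the data write of steps (6) and (7) runs synchronously here, whereas in the text the success message of step (5) is returned before the data storage nodes are updated.

```python
import hashlib
from typing import Dict, List, Tuple

N_LOG_REPLICAS = 3   # N >= 3 in the text
M_DATA_REPLICAS = 3  # M >= 3 in the text


def place(row_key: str, node_count: int, replicas: int) -> List[int]:
    """Hypothetical replication rule: hash the row key, take consecutive nodes."""
    start = int(hashlib.sha256(row_key.encode()).hexdigest(), 16) % node_count
    return [(start + i) % node_count for i in range(replicas)]


class Cluster:
    """In-memory stand-in for the coordinator and the nodes it talks to."""

    def __init__(self, node_count: int = 5) -> None:
        # Version control: row key -> latest log position.
        self.latest: Dict[str, int] = {}
        # One log store per node: (row key, position) -> log entry.
        self.logs: List[Dict[Tuple[str, int], dict]] = [{} for _ in range(node_count)]
        # One data store per node: row key -> (columns, applied log position).
        self.data: List[Dict[str, Tuple[dict, int]]] = [{} for _ in range(node_count)]

    def write(self, row_key: str, columns: dict) -> int:
        # Step 1: package the data and the operation into a log entry.
        entry = {"op": "put", "row_key": row_key, "columns": columns}
        # Step 2: new log position = latest position + 1.
        position = self.latest.get(row_key, 0) + 1
        # Step 3: choose the N log storage nodes for this row key.
        log_nodes = place(row_key, len(self.logs), N_LOG_REPLICAS)
        # Step 4: store the entry on each log node (plain copy, NOT Paxos).
        for n in log_nodes:
            self.logs[n][(row_key, position)] = entry
        # Step 5: record the new position; the write now counts as committed.
        self.latest[row_key] = position
        # Steps 6-7: apply the data and the position to the M data nodes.
        for n in place(row_key, len(self.data), M_DATA_REPLICAS):
            old, _ = self.data[n].get(row_key, ({}, 0))
            self.data[n][row_key] = ({**old, **columns}, position)
        return position


cluster = Cluster()
first = cluster.write("user:1", {"name": "Ann"})
second = cluster.write("user:1", {"city": "Beijing"})
```

Storing the applied log position next to the data on every data storage node is what later lets a reader detect a replica that missed an update.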
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key, in the key-value store, of the data to be read;
(9) The coordination module obtains from the version control module the latest log position P1 corresponding to the row key of step (8);
(10) Based on the key-value store's data replication rule, the coordination module computes the U data storage nodes that hold the row key of step (8), and obtains from one data storage node S among them the log position P2 corresponding to the row key of step (8);
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from the data storage node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, based on the key-value store's data replication rule, the V log storage nodes that hold the row key of step (8), and obtains from one log storage node T among them the log entries corresponding to the row key of step (8); these are all log entries whose position is greater than P2 and less than or equal to P1. The coordination module then updates the data in data storage node S according to the log content obtained from T, reads from S the data corresponding to the row key of step (8), and returns it to the requesting user.
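The read-and-repair logic of steps (9) to (11) can be sketched as a single function. The sketch assumes that a log entry carries the written columns under a `"columns"` key and that replaying an entry means overwriting those columns; the function name `read`, the type aliases and the in-memory dictionaries standing in for the version control module, log storage node T and data storage node S are all hypothetical.

```python
from typing import Dict, Tuple

LogStore = Dict[Tuple[str, int], dict]   # (row key, log position) -> log entry
DataStore = Dict[str, Tuple[dict, int]]  # row key -> (columns, applied position)


def read(row_key: str, latest: Dict[str, int],
         log_t: LogStore, data_s: DataStore) -> dict:
    """Return the row, first replaying any log entries the data node missed."""
    p1 = latest.get(row_key, 0)                 # step 9: latest committed position
    columns, p2 = data_s.get(row_key, ({}, 0))  # step 10: position applied on S
    if p1 > p2:                                 # step 11: S is stale
        columns = dict(columns)
        # Replay every entry with P2 < position <= P1, in log order.
        for position in range(p2 + 1, p1 + 1):
            columns.update(log_t[(row_key, position)]["columns"])
        data_s[row_key] = (columns, p1)         # repair S before answering
    return dict(columns)


# Data node S applied only position 1; positions 2 and 3 exist only in the log.
latest = {"k": 3}
log_t = {
    ("k", 1): {"columns": {"a": 1}},
    ("k", 2): {"columns": {"b": 2}},
    ("k", 3): {"columns": {"a": 9}},
}
data_s = {"k": ({"a": 1}, 1)}
row = read("k", latest, log_t, data_s)
```

After the call, S holds the repaired row together with position P1, so a second read of the same key takes the P1 = P2 branch and touches no log.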
To implement the method of the present invention, the system is divided into the following main modules:
Coordination module (Coordinator): the coordinator is the entry point for transaction requests and the core module of the system logic. It receives transaction requests from the client, packages a transaction execution log entry according to each request, coordinates the distributed storage system to back up the transaction log to multiple machines, and notifies the data storage nodes to execute the transaction's operations. It also obtains and maintains the latest log information. This module is implemented with independently developed code and is the core module of the present invention.
Version control module (Version Controller): each transaction unit has its own independent transaction log, and the log entries of a given transaction unit are ordered by execution sequence. The version controller is the module that manages the current latest log position (the log sequence number) of each transaction unit. A correct latest log position is essential to ensuring that transactions execute in log-number order, so this module must guarantee global consistency. In the implementation, this module is built on Cassandra's Counter, because that feature guarantees atomic and consistent changes to the data.
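The role of the version controller can be sketched with an in-memory, lock-protected table of latest positions. This is an assumption-laden stand-in: the text builds the module on Cassandra's Counter, whereas the class below uses a Python lock, and the check that a new position must equal the latest position plus one is an addition of this sketch rather than something the text states. The names `VersionController`, `latest` and `advance` are hypothetical.

```python
import threading
from typing import Dict


class VersionController:
    """Tracks the latest log position (sequence number) per transaction unit."""

    def __init__(self) -> None:
        self._latest: Dict[str, int] = {}
        self._lock = threading.Lock()

    def latest(self, unit: str) -> int:
        """Return the latest recorded position, or 0 if the unit has no log yet."""
        with self._lock:
            return self._latest.get(unit, 0)

    def advance(self, unit: str, position: int) -> bool:
        """Record `position` only if it is exactly the latest position plus one."""
        with self._lock:
            if position != self._latest.get(unit, 0) + 1:
                return False  # a concurrent writer already took this slot
            self._latest[unit] = position
            return True


vc = VersionController()
ok_first = vc.advance("row:user:1", 1)
ok_repeat = vc.advance("row:user:1", 1)
```

Rejecting a position that is not the immediate successor keeps the log of a transaction unit gap-free, which is what allows a reader to replay entries strictly in log-number order.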
Log storage node (Log Node): the log storage node stores the log of actual operations. To guarantee the durability of operations, so that a node crash does not lose committed data, the system first writes the transaction log when performing a data update and returns a commit-success response only after the log has been written successfully. The actual data is then modified according to the log content; if that modification fails, the data can be repaired from the log during later reads and writes, which guarantees the durability and consistency of transaction operations. To strengthen fault tolerance in a distributed environment, each log entry has multiple backups, usually no fewer than 3. In the implementation, this module combines independently developed code with Cassandra's data storage.
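A single log storage node can be sketched as a map from (transaction unit, position) to log entry. The sketch assumes that a position, once written, is never overwritten with different content, and that repair needs the entries in a half-open position range; both the `LogNode` class and its method names are hypothetical, and replication across several nodes is left out.

```python
from typing import Dict, List, Tuple


class LogNode:
    """Hypothetical in-memory log storage node."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], dict] = {}

    def append(self, unit: str, position: int, entry: dict) -> bool:
        """Store an entry; refuse to replace a position with different content."""
        key = (unit, position)
        existing = self._entries.get(key)
        if existing is not None and existing != entry:
            return False
        self._entries[key] = entry  # re-sending the same entry is harmless
        return True

    def entries_between(self, unit: str, after: int, up_to: int) -> List[dict]:
        """Return entries with after < position <= up_to, in log order.

        Raises KeyError if this node is missing one of the positions.
        """
        return [self._entries[(unit, p)] for p in range(after + 1, up_to + 1)]


node = LogNode()
node.append("row:k", 1, {"columns": {"a": 1}})
node.append("row:k", 2, {"columns": {"b": 2}})
missed = node.entries_between("row:k", 1, 2)
```

Because the entry is stored before the data is modified, a data storage node that crashed mid-update can be brought up to date later by asking for the entries between its applied position and the latest committed one.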
Data storage node (Data Node): the data storage node stores the actual data, all of which is produced by executing the contents of the log. The data storage node is likewise developed on top of Cassandra's data storage. In actual deployments, the log storage node and the data storage node are usually located on the same physical machine, so that log entries can be executed locally.

Claims (1)

1. A flexible transaction management method in key-value store data storage, characterized in that the method comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data to be written carries a key-value store row key. The coordination module packages the user's data and the write operation into a log entry;
(2) The coordination module obtains the current latest log position from the version control module and adds 1 to it, which gives the write position for the log entry of step (1); the log position is recorded as a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the N log storage nodes in the key-value store that will hold the log entry of step (1), where N is greater than or equal to 3;
(4) The coordination module uses the Paxos consensus algorithm to store the log entry of step (1) in the N log storage nodes computed in step (3);
(5) The coordination module writes the write position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the key-value store's data replication rule, the coordination module computes the M data storage nodes in the key-value store that will hold the data of step (1), where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1) and the write position of step (2) to the M data storage nodes;
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key, in the key-value store, of the data to be read;
(9) The coordination module obtains from the version control module the latest log position P1 corresponding to the row key of step (8);
(10) Based on the key-value store's data replication rule, the coordination module computes the U data storage nodes that hold the row key of step (8), and obtains from one data storage node S among them the log position P2 corresponding to the row key of step (8);
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from the data storage node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, based on the key-value store's data replication rule, the V log storage nodes that hold the row key of step (8), and obtains from one log storage node T among them the log entries corresponding to the row key of step (8). The coordination module then updates the data in data storage node S according to the log content obtained from T, reads from S the data corresponding to the row key of step (8), and returns it to the requesting user.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210169301.8A CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210169301.8A CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Publications (2)

Publication Number Publication Date
CN102693312A CN102693312A (en) 2012-09-26
CN102693312B true CN102693312B (en) 2014-05-28

Family

ID=46858745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210169301.8A Active CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Country Status (1)

Country Link
CN (1) CN102693312B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9449038B2 (en) * 2012-11-26 2016-09-20 Amazon Technologies, Inc. Streaming restore of a database from a backup system
CN104238963B (en) * 2014-09-30 2017-08-11 华为技术有限公司 A kind of date storage method, storage device and storage system
CN105704004B (en) * 2014-11-28 2019-10-22 华为技术有限公司 Business data processing method and device
CN106708840A (en) * 2015-11-12 2017-05-24 中国科学院深圳先进技术研究院 Customer information management method and system
CN109522273B (en) * 2018-11-15 2022-02-18 郑州云海信息技术有限公司 Method and device for realizing data writing
CN109739684B (en) * 2018-11-20 2020-03-13 清华大学 Vector clock-based copy repair method and device for distributed key value database
CN113778632A (en) * 2021-09-14 2021-12-10 杭州沃趣科技股份有限公司 Distributed transaction management method based on cassandra database

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101706811B (en) * 2009-11-24 2012-01-25 中国科学院软件研究所 Transaction commit method of distributed database system
US8244686B2 (en) * 2009-12-04 2012-08-14 International Business Machines Corporation High throughput, reliable replication of transformed data in information systems
CN102298641B (en) * 2011-09-14 2013-05-01 清华大学 Method for uniformly storing files and structured data based on key value bank

Also Published As

Publication number Publication date
CN102693312A (en) 2012-09-26


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant