CN102402616B

CN102402616B - Method and system for realizing database cluster

Info

Publication number: CN102402616B
Application number: CN201110436133.XA
Authority: CN
Inventors: 吴炳锡; 高磊; 赵博然
Original assignee: Beijing Feinno Communication Technology Co Ltd
Current assignee: Beijing Feinno Communication Technology Co Ltd
Priority date: 2011-12-22
Filing date: 2011-12-22
Publication date: 2015-01-14
Anticipated expiration: 2031-12-22
Also published as: CN102402616A

Abstract

The invention discloses a method and a system for realizing a database cluster, which can avoid wasting of resources and reduce capacity expansion cost. The method for realizing the database cluster provided by the embodiment of the invention comprises the following steps of: setting partition keys for each user, and dividing user data which is supported on each database node by using the partition keys; and according to the partition keys of a new user, judging whether the user belongs to the existing database nodes in the database cluster or not, if so, storing the user data of the user on the database nodes to which the user belongs, otherwise, adding new database nodes into the database cluster, and storing the user data of the user on the new database nodes.

Description

Method and system for implementing a database cluster

Technical field

The present invention relates to the field of Internet technology, and in particular to a method and system for implementing a database cluster.

Background art

With the development of Internet technology, the types of Internet services keep growing, and so does the number of Internet users. An Internet service usually needs a corresponding database cluster to store the business data it requires and to keep the service running normally.

The scale of the database cluster varies with the service. At present this is generally handled as follows: the scale of the cluster is fixed when it is first built. For example, when planning for users on the order of 100 million, a large number of database nodes is provisioned; when planning for a small user base, only one database node is provisioned. A selected distribution mechanism then assigns the data of different users to different databases, and the access rules for each database node are set according to the cluster as built.

The existing method of implementing a database cluster has at least the following defects:

For a new service, the user volume and user behavior are often hard to estimate, so as the service develops, the initially planned hardware resources rarely end up in a reasonable configuration.

Because the nodes of a database cluster in the existing scheme must be fixed in advance, the user distribution mechanism and the node access rules apply only to the nodes determined at the start. If the early plan is too large and the actual business later falls short of it, resources are seriously wasted. If the early plan is too small, resources are insufficient; as the business grows and users increase, service performance keeps degrading, and the database must later be expanded. Each expansion requires changing the cluster's user distribution mechanism and node access rules, which is cumbersome, labor-intensive, and costly.

Summary of the invention

Embodiments of the present invention provide a method and system for implementing a database cluster, to solve the problems of serious resource waste, cumbersome capacity expansion, and high expansion cost in existing database cluster implementations.

To achieve the above object, the embodiments of the present invention adopt the following technical solutions:

An embodiment of the present invention provides a method for implementing a database cluster, the method comprising:

setting a partition key for each user, and using the partition key to divide the user data supported by each database node;

for a newly added user, determining according to the user's partition key whether the user belongs to an existing database node in the database cluster; if so, storing the user's data on the database node to which the user belongs; if not, adding a new database node to the database cluster and storing the user's data on the new database node.

An embodiment of the present invention also provides a system for implementing a database cluster, the system comprising a database proxy device connected to each database node in the database cluster.

The database proxy device comprises:

a partition setting unit, configured to set a partition key for each user and to use the partition key to divide the user data supported by each database node;

a capacity expansion storage unit, configured to determine, for a newly added user and according to the user's partition key, whether the user belongs to an existing database node in the database cluster; if so, to store the user's data on the database node to which the user belongs; if not, to add a new database node to the database cluster and store the user's data on the new database node.

The beneficial effects of the embodiments of the present invention are as follows:

By setting a partition key and a partitioning scheme, the embodiments place a database proxy layer between the database and the server, providing a dynamic scheme for building a database cluster. The cluster can be expanded automatically as the business develops and the user volume grows, which avoids the resource waste caused by over-planning at an early stage.

Further, with the dynamic expansion mechanism of this technical solution, expanding the database cluster does not require rewriting the cluster's user distribution mechanism or node access rules, which significantly reduces workload and expansion cost.

Brief description of the drawings

Fig. 1 is a flowchart of a method for implementing a database cluster according to Embodiment 1 of the present invention;

Fig. 2 is a flowchart of a method for accessing a database node according to Embodiment 1 of the present invention;

Fig. 3 is a flowchart of a method for accessing a database node according to Embodiment 2 of the present invention;

Fig. 4 is a schematic structural diagram of a system for implementing a database cluster according to Embodiment 3 of the present invention.

Detailed description

To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the accompanying drawings.

Referring to Fig. 1, Embodiment 1 of the present invention provides a method for implementing a database cluster, the method comprising:

11: Set a partition key (Partition Key, PTK) for each user, and use the partition key to divide the user data supported by each database node.

A partition key is information used to ensure the distribution and location of data (such as user data) in the cluster system. Performing a corresponding operation on the value of the partition key determines which database node holds the corresponding user data.

Preferably, in this embodiment the partition key is an auto-increment value, in which case it takes the form of an integer. For example, each user is assigned a globally unique identifier (ID) in the order in which users register in the database cluster, and this ID serves as the user's partition key. The first user registered in the cluster then has partition key 1, the second has partition key 2, and the 10,000th has partition key 10000. It will be appreciated that the partition key may also be other information that uniquely identifies a user; for example, a user identifier (UserId) that is unique within the cluster may serve as the partition key.
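The auto-increment assignment described above can be sketched as follows. This is a minimal illustration only: the class name PartitionKeyAllocator and its in-process counter are assumptions made for the example, and a real cluster would more likely obtain the value from a shared source such as a database sequence.

```python
import itertools
import threading


class PartitionKeyAllocator:
    """Hypothetical helper: hands out integer partition keys in registration order."""

    def __init__(self, first_key: int = 1) -> None:
        self._counter = itertools.count(first_key)
        self._lock = threading.Lock()  # keep allocation safe across threads

    def next_key(self) -> int:
        """Return the next unused partition key."""
        with self._lock:
            return next(self._counter)


allocator = PartitionKeyAllocator()
first_user_key = allocator.next_key()   # key for the first registered user
second_user_key = allocator.next_key()  # key for the second registered user
```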

Each database node, whether already in the cluster or added later, supports the user data of users within a certain partition key range or interval. Depending on the specific partitioning scheme, the maximum number of users supported by each database node may be the same or different.
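For the case where nodes support different numbers of users, one possible lookup is a sorted table of range starts searched by bisection. The sketch below is an assumption for illustration; the ranges, the names RANGE_STARTS and RANGE_END, and the use of bisection are not taken from the patent text.

```python
import bisect
from typing import Optional

# Hypothetical layout with unequal capacities:
#   node 0 serves keys 1 .. 999_999
#   node 1 serves keys 1_000_000 .. 2_999_999
#   node 2 serves keys 3_000_000 .. 3_499_999
# RANGE_STARTS[i] is the smallest partition key served by node i.
RANGE_STARTS = [1, 1_000_000, 3_000_000]
RANGE_END = 3_500_000  # exclusive upper bound of the last node's range


def node_for_key(partition_key: int) -> Optional[int]:
    """Return the number of the node whose range contains the key, else None."""
    if partition_key < RANGE_STARTS[0] or partition_key >= RANGE_END:
        return None  # the key lies outside every configured range
    return bisect.bisect_right(RANGE_STARTS, partition_key) - 1
```

A result of None corresponds to the situation in which no existing node covers the key, so a new node and a new range entry would be needed.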

When every database node supports the same maximum number of users, the partition key is used to divide the user data supported by each database node according to the following formula:

N / M = Nodenum

where N is the value of the user's partition key, M is the number of users supported by each database node, and Nodenum is the number of the database node that stores the user's data. The quotient is taken as an integer with any remainder discarded, which is how the worked example in this embodiment applies the formula.
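The formula can be sketched in a few lines, assuming integer (floor) division and node numbers that start from 0, as the worked example in this embodiment uses. The function name node_number is an assumption made for the illustration.

```python
def node_number(partition_key: int, users_per_node: int) -> int:
    """Compute Nodenum = N / M with integer division (remainder discarded)."""
    if partition_key < 0:
        raise ValueError("partition key must not be negative")
    if users_per_node <= 0:
        raise ValueError("users_per_node must be positive")
    return partition_key // users_per_node
```

With M = 1,000,000, every key below 1,000,000 maps to node 0 and key 1,000,000 is the first key that maps to node 1.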

12: For a newly added user, determine according to the user's partition key whether the user belongs to an existing database node in the database cluster. If so, perform step 13; if not, perform step 14.

Specifically, the number of the database node corresponding to the user is calculated from the user's partition key. When the calculated node number equals the number of an existing database node in the cluster, the user is confirmed to belong to an existing database node. When the calculated node number is greater than the largest node number in the cluster, the user is confirmed not to belong to any existing database node.

13: Store the user's data on the database node to which the user belongs.

14: Add a new database node to the database cluster, and store the user's data on the new database node.
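Steps 12 to 14 can be sketched as below. This is an illustration under stated assumptions: every node supports the same number of users, and each database node is modeled as an in-memory dict, which stands in for a real database server. The class and method names are hypothetical.

```python
from typing import Dict, List


class DatabaseCluster:
    """Toy cluster whose nodes are dicts keyed by partition key."""

    def __init__(self, users_per_node: int, initial_nodes: int = 1) -> None:
        self.users_per_node = users_per_node
        self.nodes: List[Dict[int, str]] = [{} for _ in range(initial_nodes)]

    def store_new_user(self, partition_key: int, user_data: str) -> int:
        """Store the data and return the number of the node that holds it."""
        # Step 12: compute the node number from the partition key.
        node_no = partition_key // self.users_per_node
        # Step 14: the number exceeds the largest existing node number, so
        # add nodes. With auto-increment keys this adds one node at a time.
        while node_no >= len(self.nodes):
            self.nodes.append({})
        # Step 13 (or the second half of step 14): store the user's data.
        self.nodes[node_no][partition_key] = user_data
        return node_no
```

Note that the distribution rule itself never changes when a node is added; only the list of nodes grows.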

The technical solution of this embodiment is described below with a concrete example. In this example, the database cluster initially contains one database node, each database node in the cluster supports at most 1,000,000 registered users, and the partition key is a value that starts from 1 and increases by 1 with each registration.

The user data of registered users whose partition keys fall in the first interval, 1 to (1,000,000 - 1) as listed in Table 1, can then be stored on the existing database node (the first database node). When the number of registered users approaches or reaches 1,000,000, a new node must be added to the cluster, and users whose partition key is 1,000,000 or more are routed automatically to the new node (the second database node). When the number of registered users approaches or reaches 2,000,000, another new node must be added to the cluster, and users whose partition key is 2,000,000 or more are routed automatically to that node (the third database node). As the number of users grows, this operation is repeated, achieving automatic capacity expansion of the database cluster.

As described above, this partitioning scheme ensures that after a new node joins the cluster it can be configured according to the partition definition. If there is only one node at the start and each node supports 1,000,000 registered users, then once the ID assigned to a newly registered user is greater than or equal to 1,000,000, the new data is written to the newly added node, which completes the automatic expansion of the cluster.

Further, the partitioning scheme of this solution also provides an access rule for database nodes. Referring to Fig. 2, it specifically comprises:

21: Parse the received access request packet to obtain the partition key; the access request packet from the requester must contain the partition key.

The received access request packet is decapsulated according to a predetermined protocol (such as the MySQL protocol) to extract the access statement. Lexical analysis is performed on the access statement to judge whether it meets the syntax requirements. If so, the partition key is obtained from the access statement; if not, the access operation is rejected.
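The accept-or-reject behavior of step 21 can be sketched as follows. The patent does not say how the partition key appears inside the access statement, so the sketch assumes a condition of the form "ptk = <integer>"; the column name ptk, the exception name, and the regular expressions are all hypothetical, and the check is deliberately far cruder than a real SQL lexer.

```python
import re

# Assumption: a valid statement starts with one of these verbs.
_VERB = re.compile(r"^\s*(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)
# Assumption: the partition key is carried as "ptk = <integer>".
_PTK = re.compile(r"\bptk\s*=\s*(\d+)\b", re.IGNORECASE)


class RejectedStatement(Exception):
    """Raised when the access operation must be refused."""


def extract_partition_key(statement: str) -> int:
    """Return the partition key carried by the statement, or reject it."""
    if not _VERB.match(statement):
        raise RejectedStatement("statement does not meet the syntax requirements")
    match = _PTK.search(statement)
    if match is None:
        raise RejectedStatement("statement carries no partition key")
    return int(match.group(1))
```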

22: Determine the corresponding database node according to the obtained partition key.

The number of the corresponding database node is calculated from the obtained partition key, and that number is used to determine the corresponding database node; that is, the user with that partition key is matched to the determined database node. Continuing the example above, the matching rule can be expressed as:

N / 1,000,000 = number of the hit node

where N is the user's partition key.

When a user's partition key (or ID) is less than 1,000,000, dividing it by 1,000,000 yields 0 (node numbering starts from 0), so all operations on the interval from 1 to (1,000,000 - 1) are routed to the first database node; that is, any user whose partition key is less than 1,000,000 is confirmed to belong to the first database node. Table 1 shows the correspondence between database nodes and partition key intervals.

Table 1

Partition key interval	1 to (1,000,000 - 1)	1,000,000 to (2,000,000 - 1)	2,000,000 to (3,000,000 - 1)	...
Database node	DB0	DB1	DB2	...

Here DB0 denotes the first database node, DB1 the second database node, and DB2 the third database node.
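The intervals in Table 1 follow directly from the division rule. The sketch below computes the inclusive interval served by a given node, assuming equal node capacity and keys that start from 1; the function name key_range_for_node is an assumption made for the illustration.

```python
from typing import Tuple


def key_range_for_node(node_no: int, users_per_node: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) partition keys routed to a node."""
    if node_no < 0 or users_per_node <= 0:
        raise ValueError("node number must be >= 0 and capacity must be > 0")
    # Keys start from 1 in the example, so node 0 begins at 1 rather than 0.
    first = max(1, node_no * users_per_node)
    last = (node_no + 1) * users_per_node - 1
    return first, last
```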

23: Send the access request packet to the determined database node, and receive the access result returned by that database node.

24: Forward the access result to the requester.

The access result is encapsulated according to the predetermined protocol (such as the MySQL protocol) and then sent to the requester.
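Steps 21 to 24 can be put together in a small sketch. It is an illustration under stated assumptions, not the patented implementation: back-end nodes are modeled as plain callables, protocol encapsulation is omitted, and the names DatabaseProxy, Backend, and make_backend are hypothetical.

```python
from typing import Callable, List

# Hypothetical stand-in for a back-end database node: it takes an access
# statement and returns an access result.
Backend = Callable[[str], str]


class DatabaseProxy:
    """Toy proxy layer sitting between the requester and the database nodes."""

    def __init__(self, backends: List[Backend], users_per_node: int,
                 extract_key: Callable[[str], int]) -> None:
        self.backends = backends
        self.users_per_node = users_per_node
        self.extract_key = extract_key

    def handle(self, statement: str) -> str:
        partition_key = self.extract_key(statement)      # step 21
        node_no = partition_key // self.users_per_node   # step 22
        if not 0 <= node_no < len(self.backends):
            raise LookupError(f"no database node numbered {node_no}")
        result = self.backends[node_no](statement)       # step 23
        return result                                    # step 24


def make_backend(name: str) -> Backend:
    """Build a fake node that reports which node handled the statement."""
    return lambda statement: f"{name} handled: {statement}"


proxy = DatabaseProxy(
    backends=[make_backend("DB0"), make_backend("DB1")],
    users_per_node=1_000_000,
    # Toy extractor: assumes the statement ends with "ptk=<integer>".
    extract_key=lambda statement: int(statement.rsplit("=", 1)[1]),
)
reply = proxy.handle("SELECT * FROM users WHERE ptk=1500000")
```

The requester only ever talks to the proxy, so adding a node changes the list of back ends but not the requester's access rule.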

As described above, by setting a partition key and a partitioning scheme, this embodiment places a database proxy layer between the database and the server, providing a dynamic scheme for building a database cluster. The cluster can be expanded automatically as the business develops and the user volume grows, which avoids the resource waste caused by over-planning at an early stage.

Embodiment 2 of the present invention describes the specific operations for accessing database nodes in the cluster. Referring to Fig. 3, the processing logic of this solution can be divided into four layers: a protocol parsing layer, a data partition definition layer, a data operation layer, and a database cluster layer.

31: The protocol parsing layer parses out the partition key and sends it to the data partition definition layer.

The protocol may be the MySQL protocol. The protocol parsing layer wraps an input/output (I/O) model, for example a simple network I/O model based on epoll, and implements MySQL-based authentication and the like. For a conventional query request, this layer parses out the partition key and passes it to the data partition definition layer.

This layer focuses on parsing and splitting Structured Query Language (SQL) statements. Within the MySQL protocol, processing mainly targets the query packets among the command packets. After the corresponding access statement is read, lexical analysis is performed, followed by a small amount of syntax analysis to confirm that the syntax of the access statement is correct. The corresponding PTK is then obtained for subsequent operations.
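Decapsulating a query packet can be sketched as follows. The framing used here is that of the classic MySQL client/server protocol: a 3-byte little-endian payload length, a 1-byte sequence id, and, for a query command, a first payload byte of 0x03 followed by the statement text. The sketch assumes UTF-8 text and no optional query attributes; the function names are hypothetical, and build_com_query exists only to produce test input.

```python
COM_QUERY = 0x03  # command byte that marks a query packet


def parse_com_query(packet: bytes) -> str:
    """Extract the access statement from a single COM_QUERY packet."""
    if len(packet) < 5:
        raise ValueError("packet too short")
    payload_len = int.from_bytes(packet[0:3], "little")
    # packet[3] is the sequence id; it is not needed to read the statement.
    payload = packet[4:4 + payload_len]
    if payload_len < 1 or len(payload) != payload_len:
        raise ValueError("truncated or empty packet")
    if payload[0] != COM_QUERY:
        raise ValueError("not a query packet")
    return payload[1:].decode("utf-8")


def build_com_query(statement: str, sequence_id: int = 0) -> bytes:
    """Build a query packet; used here only to exercise the parser."""
    payload = bytes([COM_QUERY]) + statement.encode("utf-8")
    return len(payload).to_bytes(3, "little") + bytes([sequence_id]) + payload
```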

32: The data partition definition layer determines the corresponding database node and sends the access request packet to the back-end data operation layer.

The data partition definition layer mainly maintains the configuration table that maps partition keys to back-end database nodes. On receiving a new access request, it performs the computation on the partition key to obtain the corresponding node, and at the same time sends the request to the back-end data operation layer.

33: The data operation layer sends the access request to the database node and returns the access result to the requester.

The main functions of the data operation layer are: keeping and maintaining connections to the database nodes in the back-end database cluster, processing access requests from the front end, and encapsulating the access results generated by the database cluster in the MySQL protocol and returning them to the requester. On receiving a request from the front end, this layer sends the access request to the correct database node, as determined by the data partition definition layer, for the corresponding operation.

In the database cluster layer, multiple master/slave groups are built using MySQL's synchronization (replication) technology, and each group is equivalent to one database node in the cluster. Within a group there are master and slave roles: the database in the master role handles both read and write operations, while the database in the slave role handles read operations only and takes over some processing tasks after the master database fails.
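The role split inside one group can be sketched as below. Two points are assumptions made for the illustration and are not stated in the text: reads are sent to the slave while the master is healthy, and all traffic goes to the slave once the master has failed (the text says only that the slave takes over some processing tasks). The class name and host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicationGroup:
    """One master/slave group, equivalent to one database node."""

    master: str
    slave: str
    master_alive: bool = True

    def target_for(self, is_write: bool) -> str:
        """Pick the database that should serve the operation."""
        if not self.master_alive:
            # Assumption: the slave takes over every task after a failure.
            return self.slave
        # Assumption: offload reads to the slave while the master is healthy.
        return self.master if is_write else self.slave


group = ReplicationGroup(master="db0-master", slave="db0-slave")
```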

As described above, this solution sets up a proxy layer for accessing the database cluster. Access operations pass through the proxy layer and are forwarded to the actual back-end database nodes, which achieves both access to the database nodes and return of the access results.

Embodiment 3 of the present invention also provides a system for implementing a database cluster. Referring to Fig. 4, the system comprises a database proxy device 42 connected to each database node 41 in the database cluster.

The database proxy device 42 comprises:

a partition setting unit 421, configured to set a partition key for each user and to use the partition key to divide the user data supported by each database node;

a capacity expansion storage unit 422, configured to determine, for a newly added user and according to the user's partition key, whether the user belongs to an existing database node in the database cluster; if so, to store the user's data on the database node to which the user belongs; if not, to add a new database node to the database cluster and store the user's data on the new database node.

Further, the access request packet from the requester contains the partition key, and the database proxy device also comprises an access control unit 423, configured to parse the received access request packet to obtain the partition key, determine the corresponding database node according to the obtained partition key, send the access request packet to the determined database node, receive the access result returned by that database node, and forward the access result to the requester.

For the specific working of each unit and device in this system embodiment, refer to the method embodiments of the present invention.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within its scope of protection.

Claims

1. A method for implementing a database cluster, characterized in that the method comprises:

setting a partition key for each user, and using the partition key to divide the user data supported by each database node, wherein the partition key of each user is formed from an auto-increment value according to the order in which users register in the database cluster, each database node already in the cluster or added later supports the user data of users within a certain partition key range, and the maximum numbers of users supported by the database nodes are the same or different;

for a newly added user, calculating the number of the database node corresponding to the user according to the user's partition key; when the calculated node number is the same as the number of an existing database node in the database cluster, confirming that the user belongs to an existing database node in the database cluster, and storing the user's data on the database node to which the user belongs; when the calculated node number is greater than the largest node number in the database cluster, confirming that the user does not belong to any existing database node in the database cluster, adding a new database node to the database cluster, and storing the user's data on the new database node;

wherein an access request packet from a requester contains the partition key, and the method further comprises:

parsing the received access request packet to obtain the partition key;

determining the corresponding database node according to the obtained partition key;

sending the access request packet to the determined database node, and receiving the access result returned by that database node;

forwarding the access result to the requester;

wherein determining the corresponding database node according to the obtained partition key comprises:

calculating the number of the corresponding database node according to the obtained partition key, and using the number of that database node to determine the corresponding database node.

2. The method according to claim 1, characterized in that the partition key is used to divide the user data supported by each database node based on the following formula:

N / M = Nodenum

3. The method according to claim 1, characterized in that parsing the received access request packet to obtain the partition key comprises:

decapsulating the received access request packet according to a predetermined protocol to extract the access statement;

judging whether the access statement meets the syntax requirements; if so, obtaining the partition key from the access statement; if not, rejecting the access operation.

4. The method according to claim 3, characterized in that forwarding the access result to the requester comprises:

encapsulating the access result according to the predetermined protocol and then sending it to the requester.

5. A system for implementing a database cluster, characterized in that the system comprises:

a partition setting unit, configured to set a partition key for each user and to use the partition key to divide the user data supported by each database node, wherein the partition key of each user is formed from an auto-increment value according to the order in which users register in the database cluster, each database node already in the cluster or added later supports the user data of users within a certain partition key range, and the maximum numbers of users supported by the database nodes are the same or different;

a capacity expansion storage unit, configured to calculate, for a newly added user, the number of the database node corresponding to the user according to the user's partition key; when the calculated node number is the same as the number of an existing database node in the database cluster, to confirm that the user belongs to an existing database node in the database cluster and store the user's data on the database node to which the user belongs; when the calculated node number is greater than the largest node number in the database cluster, to confirm that the user does not belong to any existing database node in the database cluster, add a new database node to the database cluster, and store the user's data on the new database node;

wherein an access request packet from a requester contains the partition key, and the system further comprises an access control unit;

the access control unit is configured to parse the received access request packet to obtain the partition key, determine the corresponding database node according to the obtained partition key, send the access request packet to the determined database node, receive the access result returned by that database node, and forward the access result to the requester;

the access control unit is further specifically configured to calculate the number of the corresponding database node according to the obtained partition key, and to use the number of that database node to determine the corresponding database node.