CN102968498A - Method and device for processing data

Info

Publication number: CN102968498A
Application number: CN 201210516613
Authority: CN
Inventors: 张巍; 雷晓松
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2012-12-05
Filing date: 2012-12-05
Publication date: 2013-03-13
Anticipated expiration: 2032-12-05
Also published as: CN102968498B

Abstract

Embodiments of the invention relate to a method and a device for processing data. The method comprises the following steps: acquiring data to be stored and a data identifier of the data to be stored; calculating, according to the data identifier, a first partition in which the data to be stored is to be stored, and acquiring a first node to which the first partition belongs; storing the data to be stored and its data identifier separately on the first node, and recording the storage address of the data; generating index information from the data identifier of the data to be stored, an identifier of the first partition, and the storage address of the data, and adding the index information to an index area of the first node. With this method, a full scan of the hard disk is not needed, the number of hard disk reads during partition migration is reduced, and data reading efficiency is improved.

Description

Data processing method and device

Technical field

The present invention relates to the field of storage system technologies, and in particular to a data processing method and device.

Background technology

Cloud storage, similar to cloud computing, refers to a system in which a large number of storage devices of different types in a network are brought together by application software, through functions such as cluster applications, grid technology, or distributed file systems, so that they work cooperatively and jointly provide data storage and service access functions to the outside.

Cloud storage uses distributed hash table (Distributed Hash Table, DHT) technology to form a distributed file system (that is, a distributed storage cluster, hereinafter referred to as a cluster), in which each storage node is assigned multiple independent partitions (Partition) according to a consistent hashing algorithm. Because cloud storage builds massive storage pools from cheap, low-reliability hardware, storage hardware failures are the norm; at the same time, to satisfy elastic provisioning of resources, storage nodes frequently join or leave the cluster dynamically. When a node fails, or when capacity is dynamically expanded or reduced, the Partition sequence is redistributed across the nodes, the range of Partitions managed by each node changes, and a load-balancing (rebalance) migration operation is carried out: some Partitions need to be migrated to other nodes, the Partitions carried by a failed node are taken over by other nodes, and a newly added node takes over part of the Partitions previously carried by other nodes, thereby ensuring load balance among the storage nodes.

Existing data file layouts usually adopt the organization of the open source software Tokyo Cabinet HDB/BDB/FDB (hash database / multiway tree database / fixed-length record database). The data stored on a storage node comprises a series of key-value pairs. The bucket array (Bucket Array) of the storage node holds, in order, the key-value linked lists corresponding to the sequential bucket identifiers (bucket ID), while the storage addresses of the key-value pairs are scattered across the storage medium of the storage node (such as a hard disk). That is, the key-value pairs corresponding to different bucket IDs, and the different key-value pairs corresponding to the same bucket ID, are spread across the hard disk.

As a result, a storage node has to scan the entire hard disk when migrating data: for each key-value pair scanned, the partition information stored in the scanned value is compared with the partition designated for migration, and the pair is migrated if they are the same. This wastes a great deal of hard disk read/write capacity: the Partitions that actually need to be migrated may account for less than 5% of the total capacity, yet the whole hard disk must be scanned. Moreover, during data migration, key-value pairs are matched and transferred only serially, so the number of hard disk I/O operations is excessive and efficiency is low.

Summary of the invention

In view of this, the object of the invention is to provide a data processing method and device that do not require a full scan of the hard disk, reduce the number of hard disk read operations during Partition migration, and improve data reading efficiency; key-value pairs can be transferred concurrently during data migration, which improves the utilization of inter-node bandwidth and improves data reading performance.

To achieve the above object, a first aspect of the embodiments of the invention provides a data processing method, the method comprising:

obtaining data to be stored and a data identifier of the data to be stored;

calculating, according to the data identifier, a first partition in which the data to be stored is to be stored, and obtaining a first node to which the first partition belongs;

storing the data identifier of the data to be stored and the data to be stored separately on the first node, and recording the data storage address;

generating index information from the data identifier of the data to be stored, the identifier of the first partition in which the data resides, and the data storage address, and adding the index information to the index area of the first node.

With reference to the first aspect, in a first possible implementation of the first aspect, the index area comprises at least one sub-index area; the sub-index area in which the generated index information is to be stored is determined according to the result of a hash calculation on the data identifier in the generated index information, or according to the size order of the data identifiers, and the generated index information is stored in the determined sub-index area.

With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation, generating index information from the data identifier of the data to be stored, the identifier of the first partition in which the data resides, and the data storage address, and adding the index information to the index area of the first node, comprises:

comparing the data identifier in the generated index information with the data identifiers of the existing index information in the index area of the first node, determining, according to a predefined ordering, the storage position of the data identifier of the generated index information among the existing index information, and adding the generated index information at that storage position.

With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation, after obtaining the first node to which the first partition belongs, the method further comprises:

determining whether index information with the same data identifier as the data to be stored exists in the index area of the first node; when no index information with the same data identifier exists in the index area, performing the step of storing the data identifier of the data to be stored and the data to be stored separately on the first node; when index information with the same data identifier exists in the index area, not performing that step.

With reference to the first aspect or the first possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the method further comprises:

obtaining an input data identifier of data to be queried;

calculating, according to the data identifier of the data to be queried, a second partition in which the data to be queried resides, and obtaining a second node to which the second partition belongs;

querying the index area of the second node for index information that matches the data identifier of the data to be queried, where the index information in the index area comprises the data identifier of stored data, the partition in which it resides, and its data storage address;

reading the data to be queried from the second node according to the data storage address in the matched index information.

With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, before reading the data to be queried from the second node, the method further comprises:

sorting the data storage addresses of multiple pieces of data to be queried, and reading the data to be queried sequentially, according to the sorting result, from the second node corresponding to the data to be queried.

With reference to the first aspect or the first possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the method further comprises:

obtaining a partition to be migrated when a preset partition migration condition is met;

obtaining a third node to which the partition to be migrated belongs;

matching, from the index area of the third node, all index information whose partition identifier is the same as that of the partition to be migrated, where the index information in the index area of the third node comprises the data identifier of stored data, the partition in which it resides, and its data storage address;

sorting the data storage addresses of the matched index information, and sending the sorting result to the third node, so that the third node reads the data to be migrated sequentially and migrates it to a destination node.

In a second aspect, the embodiments of the invention further provide a data processing device, the device comprising:

an acquiring unit, configured to obtain data to be stored and a data identifier of the data to be stored;

a computing unit, configured to calculate, according to the data identifier obtained by the acquiring unit, a first partition in which the data to be stored is to be stored, and to obtain a first node to which the first partition belongs;

a storage unit, configured to store the data identifier of the data to be stored and the data to be stored separately on the first node, and to record the data storage address;

an indexing unit, configured to generate index information from the data identifier of the data to be stored obtained by the acquiring unit, the identifier of the first partition determined by the computing unit, and the data storage address recorded by the storage unit, and to add the index information to the index area of the first node determined by the computing unit.

With reference to the second aspect, in a first possible implementation of the second aspect, the index area comprises at least one sub-index area; the sub-index area in which the generated index information is to be stored is determined according to the result of a hash calculation on the data identifier in the generated index information, or according to the size order of the data identifiers, and the generated index information is stored in the determined sub-index area.

With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the indexing unit compares the data identifier in the generated index information with the data identifiers of the existing index information in the index area of the first node, determines, according to a predefined ordering, the storage position of the data identifier of the generated index information among the existing index information, and adds the generated index information at that storage position.

With reference to the second aspect or the first possible implementation of the second aspect, in a third possible implementation of the second aspect, the device further comprises:

a deduplication unit, configured to determine whether index information with the same data identifier as the data to be stored exists in the index area of the first node obtained by the computing unit; when no index information with the same data identifier exists in the index area, the storage unit is triggered; when index information with the same data identifier exists in the index area, the storage unit is not triggered.

With reference to the second aspect or the first possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the acquiring unit is further configured to obtain an input data identifier of data to be queried;

the computing unit is further configured to calculate, according to the data identifier of the data to be queried obtained by the acquiring unit, a second partition in which the data to be queried resides, and to obtain a second node to which the second partition belongs;

the device further comprises:

a matching unit, configured to query the index area of the second node for index information that matches the data identifier of the data to be queried, where the index information in the index area comprises the data identifier of stored data, the partition in which it resides, and its data storage address;

a reading unit, configured to read the data to be queried from the second node according to the data storage address in the index information obtained by the matching unit.

With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the device further comprises:

a sorting unit, configured to sort the data storage addresses matched by the matching unit for multiple pieces of data to be queried; the reading unit reads the data to be queried sequentially, according to the sorting result of the sorting unit, from the second node corresponding to the data to be queried.

With reference to the second aspect or the first possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the acquiring unit is further configured to obtain a partition to be migrated when a preset partition migration condition is met;

the computing unit is further configured to obtain a third node to which the partition to be migrated belongs;

the device further comprises:

a matching unit, configured to match, from the index area of the third node, all index information whose partition identifier is the same as that of the partition to be migrated, where the index information in the index area comprises the data identifier of stored data, the partition in which it resides, and its data storage address;

a sorting unit, configured to sort the data storage addresses of the index information matched by the matching unit, and to send the sorting result to the third node, so that the third node reads the data to be migrated sequentially and migrates it to a destination node.

The data processing method and device provided by the embodiments of the invention store the key and the value separately, and record the key, the partition, and the address offset of the value on the hard disk in an index area. During data migration, only the index area needs to be scanned to find the keys and value addresses that need to be migrated, so a full scan of the hard disk is not required. This reduces the number of hard disk read operations during Partition migration and improves data reading efficiency; key-value pairs can be transferred concurrently during data migration, which improves the utilization of inter-node bandwidth and improves data reading performance.

Description of drawings

Fig. 1 is a flowchart of a data processing method provided by an embodiment of the invention;

Fig. 2 is a schematic diagram of the data stored in an index area provided by an embodiment of the invention;

Fig. 3 is a schematic diagram of the data file layout of an index area provided by an embodiment of the invention;

Fig. 4 is a flowchart of a data query step provided by an embodiment of the invention;

Fig. 5 is a flowchart of a data migration step provided by an embodiment of the invention;

Fig. 6 is a schematic diagram of a data processing device provided by an embodiment of the invention;

Fig. 7 is a schematic diagram of a data storage system provided by an embodiment of the invention;

Fig. 8 is a schematic diagram of a storage management node provided by an embodiment of the invention;

Fig. 9 is a schematic diagram of the program in the memory of the storage management node shown in Fig. 8;

Fig. 10 is a schematic diagram of a data storage node provided by an embodiment of the invention;

Fig. 11 is a schematic diagram of the program in the memory of the data storage node shown in Fig. 10.

Embodiment

The technical solutions of the embodiments of the invention are described in further detail below with reference to the drawings and embodiments.

In the prior art, a distributed storage cluster usually adopts a key-value structured data storage system, in which stored data is usually represented as key-value pairs: the key is the unique identifier used to access the data within the whole cluster, and the value is the accessed data itself.

To guarantee the robustness and balance of the system itself, cloud storage makes very extensive use of data migration within its internal distributed storage cluster. The two most basic cases are as follows. The first is redundant data backup, which guarantees the fault tolerance and data reliability of the cluster itself. The second arises because the cluster consists of many dynamic nodes: some nodes may suddenly go down at some moment, while others may join the cluster at some moment. To keep global storage balanced, the system can automatically or manually trigger a command that balances storage utilization among the nodes, which requires migrating part of the partitions of existing nodes to the newly added node, or redistributing the partitions of a failed node evenly to other nodes.

A Partition is one of the different partitions into which the system divides the storage space of the storage nodes according to a consistent hashing algorithm; each storage node has multiple independent Partitions. For example, suppose the 10 storage nodes in a system need to carry 100 Partitions, identified as Partition 0 to Partition 99, and these 100 Partitions are assigned to the 10 nodes: the first node is assigned Partition 0 to Partition 9, the second node Partition 10 to Partition 19, the third node Partition 20 to Partition 29, and so on, with the tenth node assigned Partition 90 to Partition 99. Of course, in practice the Partitions need not be assigned in order; they may, for example, be assigned by taking a modulus (mod 10).

When a node is added, the system needs to rebalance some partitions from the existing nodes onto the new node. For example, after rebalancing, the first node is assigned Partition 0 to Partition 9, the second node Partition 10 to Partition 18, the third node Partition 19 to Partition 27, and so on, with the tenth node assigned Partition 82 to Partition 90 and the eleventh node Partition 91 to Partition 99. In other words, the system can obtain, through hash calculation, the partitions that need to be migrated: for example, Partition 19 needs to be migrated from the second node to the third node, Partition 28 and Partition 29 from the third node to the fourth node, and so on, and Partition 91 to Partition 99 from the tenth node to the eleventh node. Of course, if the assignment is done by taking a modulus (mod 11), the partitions that need to be migrated will be different.
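
The 10-node and 11-node examples above can be reproduced with a short sketch. It is illustrative only: the function names are hypothetical, and the contiguous-range assignment (with the first nodes absorbing any remainder) is an assumption chosen to match the numbers in the text, not a scheme the embodiment prescribes.

```python
def range_assignment(num_partitions, num_nodes):
    """Assign contiguous partition ranges to nodes numbered from 1.

    Assumption: when the partitions do not divide evenly, the first
    nodes each take one extra partition.
    """
    base, extra = divmod(num_partitions, num_nodes)
    assignment = {}
    start = 0
    for node in range(1, num_nodes + 1):
        size = base + (1 if node <= extra else 0)
        for partition in range(start, start + size):
            assignment[partition] = node
        start += size
    return assignment


def partitions_to_migrate(old, new):
    """Return {partition: (source node, destination node)} for moved partitions."""
    return {p: (old[p], new[p]) for p in sorted(old) if old[p] != new[p]}


old = range_assignment(100, 10)   # before the eleventh node joins
new = range_assignment(100, 11)   # after the eleventh node joins
moves = partitions_to_migrate(old, new)
```

Under these assumptions, `moves` maps Partition 19 to a move from the second node to the third, Partitions 28 and 29 to a move from the third node to the fourth, and Partitions 91 to 99 to a move from the tenth node to the eleventh, as in the example.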

The data processing method provided by the embodiments of the invention may be performed by the controller of any storage node in the key-value storage system that receives a data processing task, or by a third-party processor separate from the storage nodes. Fig. 1 is a flowchart of the data processing method provided by this embodiment. As shown in Fig. 1, the data processing method of the embodiment of the invention comprises:

Step S101, obtain data to be stored and a data identifier of the data to be stored.

Data to be stored is usually represented in key-value form: the key is the unique identifier used to access the data within the whole cluster, and the value is the accessed data itself.

Taking the storage of a microblog post as an example, the key that the system generates for the data to be stored is a character string, which may comprise a time, user information, and a serial number, indicating when a certain user published a certain microblog post; the value is the specific microblog content, for example "Having a meal at *** today".

For data to be stored that is represented in key-value form, this step obtains the data to be stored itself (that is, the value) and its data identifier, the key.

Step S102, calculate, according to the data identifier, the first partition in which the data to be stored is to be stored, and obtain the first node to which the first partition belongs.

A feature value of the key is obtained; the feature value uniquely represents the key. One way to obtain it is to perform a hash calculation on the key and use the resulting hash value as the feature value of the key. The Partition in which the data to be stored resides is then obtained from the feature value of the key; specifically, a modulo operation is performed on the value of the key to determine the partition.

Then, a consistent hash calculation is performed on the identifier of the Partition in which the data to be stored resides, so as to determine the corresponding node from the partition. For example, if the calculated Partition identifier is 70 and the distributed cluster has 10 nodes, then after a modulo (mod 10) operation it can be determined that the data to be stored belongs on the first node.
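
The key-to-partition-to-node calculation of step S102 can be sketched as follows. This is a minimal illustration under stated assumptions: MD5 stands in for the unspecified hash function, the partition and node counts are taken from the earlier example, nodes are numbered from 0 so that index 0 corresponds to the "first node", and all function names are hypothetical.

```python
import hashlib

NUM_PARTITIONS = 100  # assumption: taken from the 100-Partition example
NUM_NODES = 10        # assumption: taken from the 10-node example


def key_feature(key: str) -> int:
    """Hash the key to an integer feature value (MD5 is an illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def partition_of(key: str) -> int:
    """Modulo operation on the feature value gives the partition identifier."""
    return key_feature(key) % NUM_PARTITIONS


def node_of(partition: int) -> int:
    """Map a partition to a node index by modulo; index 0 is the first node."""
    return partition % NUM_NODES


def locate(key: str):
    """Return (partition, node index) for a key."""
    partition = partition_of(key)
    return partition, node_of(partition)
```

With these definitions, `node_of(70)` evaluates to 0, which corresponds to the first node in the example above. A production cluster would more likely look the node up in a partition table, as the following paragraph describes.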

Of course, the node to which a partition belongs can also be obtained by querying the partition segment information table configured for the cluster. This table may be preconfigured, or it may be updated in real time according to the actual storage state of the cluster.

After the node to which the partition belongs is obtained, the method further comprises a step of determining whether index information with the same data identifier as the data to be stored exists in the index area of the first node. When no index information with the same key exists in the index area, step S103 is performed to store the data; when index information with the same key exists in the index area, the data is not stored, because this indicates that the data to be stored is already stored on this storage node.

Step S103, store the data identifier of the data to be stored and the data to be stored separately on the first node, and record the data storage address.

The key and the value of the data to be stored are stored separately on the storage medium of the first node obtained in step S102.

The data storage address is the address offset of the data to be stored on the storage medium (such as a hard disk) of the node, that is, the value address; it indicates the offset of the value on the hard disk of the node and may be expressed, for example, as an LBA address.
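
Recording the storage address when a value is written can be sketched as below. This is only an illustration: an in-memory `io.BytesIO` buffer stands in for the hard disk, byte offsets stand in for LBA addresses, and the length-prefixed record format and the class name `ValueStore` are assumptions that do not come from the embodiment.

```python
import io


class ValueStore:
    """Append-only value area; a BytesIO buffer stands in for the disk."""

    def __init__(self):
        self._medium = io.BytesIO()

    def append(self, value: bytes) -> int:
        """Write a value at the end of the medium and return its offset."""
        offset = self._medium.seek(0, io.SEEK_END)
        # Assumed record format: 4-byte big-endian length, then the payload.
        self._medium.write(len(value).to_bytes(4, "big"))
        self._medium.write(value)
        return offset

    def read(self, offset: int) -> bytes:
        """Read back the value stored at the given offset."""
        self._medium.seek(offset)
        length = int.from_bytes(self._medium.read(4), "big")
        return self._medium.read(length)
```

The offset returned by `append` plays the role of the value address that step S104 places in the index information.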

Step S104, generate index information from the data identifier of the data to be stored, the identifier of the first partition in which the data resides, and the data storage address, and add the index information to the index area of the first node.

The index area of a storage node holds index information for every piece of data stored on that node. Each piece of index information comprises the data identifier of the stored data, the identifier of the partition in which the stored data resides, and the data storage address of the stored data; that is, the key of the stored data, its partition, and its value address.
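
A piece of index information and the duplicate check performed before storage can be modelled roughly as follows. The names `IndexEntry` and `IndexArea` are hypothetical, and the dictionary keyed by data identifier is an assumed in-memory representation rather than the on-disk layout of the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    key: str            # data identifier
    partition: int      # identifier of the partition holding the data
    value_address: int  # offset of the value on the storage medium


class IndexArea:
    """In-memory stand-in for the index area of one storage node."""

    def __init__(self):
        self._entries = {}

    def contains(self, key: str) -> bool:
        """Duplicate check: is index information with this key present?"""
        return key in self._entries

    def add(self, entry: IndexEntry) -> bool:
        """Add the entry unless the key is already indexed; report success."""
        if entry.key in self._entries:
            return False
        self._entries[entry.key] = entry
        return True

    def lookup(self, key: str):
        """Return the entry for a key, or None if it is not indexed."""
        return self._entries.get(key)
```

In this sketch a caller would consult `contains` before writing the value, so that data whose key is already indexed is not stored a second time.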

The index area may comprise multiple sub-index areas, and the generated index information is stored in one of these sub-index areas.

Fig. 2 is a schematic diagram of the data stored in the index area provided by an embodiment of the invention. As shown in Fig. 2, the index area comprises m bucket IDs, and Bucket1 contains N pieces of index information: key11 denotes the key of the first piece of index information in Bucket1, partition_K11 denotes the partition in which key11 resides, and value_LBA_k11 denotes the value address, that is, the address offset on the hard disk of the value corresponding to key11. When querying data, it is only necessary to match the key in the index area to find the corresponding stored data. When migrating data, the partition in which a key resides can be calculated from the key, and by matching the corresponding partition in the index area, the stored data belonging to the partition to be migrated can be found.

Storing the generated index information in a sub-index area comprises: determining the identifier of the sub-index area in which the generated index information is to be stored, either according to the result of a hash calculation on the data identifier in the generated index information or according to the size order of the data identifiers, and storing the generated index information in the determined sub-index area.

When new index information is generated, the key in the generated index information is compared with the keys of the existing index information in the index area of the first node; according to a predefined ordering, the storage position of the key of the generated index information among the existing index information is determined, and the generated index information is added at that storage position.

Alternatively, when new index information is generated, a hash calculation may be performed on the key in the generated index information to determine the bucket ID in which the generated index information is to be stored. For example, a modulo (mod m) calculation may be performed on the key to obtain the value of the bucket ID, and the generated index information is stored in that bucket ID.
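
The two placement options just described, ordered insertion and hash-based bucket selection, can be sketched as follows. The function names are hypothetical, CRC32 is an illustrative stand-in for the unspecified hash, and index information is modelled as a `(key, partition, value_address)` tuple so that tuples sort by key first.

```python
import bisect
import zlib


def bucket_by_hash(key: str, m: int) -> int:
    """Pick a bucket ID in [0, m) by hashing the key and taking mod m."""
    return zlib.crc32(key.encode("utf-8")) % m


def insert_sorted(bucket: list, entry: tuple) -> int:
    """Insert an entry so the bucket stays ordered by key; return its position."""
    position = bisect.bisect_left(bucket, entry)
    bucket.insert(position, entry)
    return position
```

Keeping a bucket ordered by key lets a later lookup use binary search, while hash-based placement spreads index information across the m buckets without comparing keys.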

After the storage of the data to be stored is completed, the data processing method of the embodiment of the invention may further comprise a data query step, so that stored data can be queried or read; the data query step is thus performed on the basis of the embodiment shown in Fig. 1. Fig. 3 is a flowchart of the data query step provided by this embodiment. As shown in Fig. 3, the data query step comprises:

Step S201, obtain an input data identifier of data to be queried.

For data stored in key-value form, the key entered by the user is received when the data to be queried is read. Of course, the user may also enter a query term, which the system converts into the corresponding key.

Step S202, calculate, according to the data identifier of the data to be queried, the second partition in which the data to be queried resides, and obtain the second node to which the second partition belongs.

The Partition in which the data to be queried resides is calculated using the same method as in step S102, and the second node on which the data to be queried resides is determined from the calculated Partition.

Step S203, query the index area of the second node for index information that matches the data identifier of the data to be queried.

The index information in the index area comprises the data identifier of stored data, the partition in which it resides, and its data storage address.

In the index area of the second node on which the data to be queried resides, the key of the data to be queried is matched against the data identifiers of the stored data in that index area to obtain the index information that matches the key, and thus the data storage address (that is, the value address) of the data to be queried. If multiple pieces of data to be queried are stored on multiple different second nodes, the data identifiers of the data to be queried are matched in the index areas of those second nodes respectively, the matching index information on each second node is obtained, and the value addresses of the multiple pieces of data to be queried are obtained.

Step S204, read the data to be queried from the second node according to the data storage address in the matched index information.

If there are multiple pieces of data to be queried, the method further comprises, after step S203, sorting the data storage addresses of the multiple pieces of data to be queried; in step S204 the data to be queried is then read sequentially, according to the sorting result, from the storage medium of the second node corresponding to the data to be queried.
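
Steps S203 and S204 for several keys on one node can be illustrated with the sketch below. It assumes the index is a dictionary from key to `(partition, value_address)` and that a dictionary from address to value stands in for the storage medium; both representations and the function name are hypothetical.

```python
def read_in_address_order(index, store, keys):
    """Look up each key in the index, then read values in ascending address order.

    index: {key: (partition, value_address)}
    store: {value_address: value}, a stand-in for the storage medium
    Keys that have no index information are skipped.
    """
    hits = [(index[key][1], key) for key in keys if key in index]
    hits.sort()  # ascending value address, so reads proceed sequentially
    return [(key, store[address]) for address, key in hits]
```

Sorting by address before reading is what turns scattered lookups into a sequential pass over the medium, which is the purpose of the sorting described above.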

When memory node increase or deletion appear in distributed memory system, need to carry out the migration operation of rebalance, partition can redistribute at memory node, thereby, the data processing method of the embodiment of the invention also comprises the step of Data Migration, partition to be migrated is carried out migration operation, the step of executing data migration on embodiment basis shown in Figure 1.Fig. 4 is the method flow diagram of the Data Migration step that provides of present embodiment, and as shown in Figure 4, described Data Migration step comprises:

Step S301: when a preset partition migration condition is satisfied, obtain the partition to be migrated.

The preset partition migration condition may include the addition or deletion of a node.

When a node is added or deleted, the system determines which partitions need to be migrated, and the partitions to be migrated determined by the system are then obtained. For example, if the partitions to be migrated are partition 19 and partition 28, those partitions are obtained.

Alternatively, if a node is added or deleted and the system instead determines the individual pieces of data that need to be migrated, a hash calculation is performed on the key of each piece of data to be migrated to obtain the partition in which that data resides.

Step S302: obtain the third node to which the partition to be migrated belongs.

The third node to which the partition to be migrated belongs is obtained using the same method as in step S102.

Step S303: match, from the index area of the third node, all index information whose partition identifier is the same as that of the partition to be migrated.

In the index area of the third node to which the partition to be migrated belongs, the identifier of the partition to be migrated is matched against the partition identifiers recorded for the stored data, and the index information matching that identifier is obtained. If multiple partitions to be migrated are stored on several different third nodes, the identifiers are matched in the index area of each of those third nodes, and the matching index information is obtained from each node.

From the index area built in step S104, the index information whose partition value equals that of the partition to be migrated is matched; each matched entry comprises the key, the partition and the value address. For example, all index information in the index area whose partition is partition 19 or partition 28 is matched.

When the index area comprises multiple bucket IDs, the index area is scanned in batches: according to the number of bucket IDs permitted by the memory configuration, the index information of the corresponding bucket IDs is read into memory batch by batch.

In this case, matching the index information whose partition value equals that of the partition to be migrated from the index area comprises:

Reading, in batches, the index information of at least one sub-index area of the index area, specifically reading the index information of different bucket IDs in batches.

At each batch read, recording the bucket IDs read in that batch, so as to obtain the starting bucket ID for the next batch read.

Matching, from the index information read in the current batch, the index information whose partition identifier is the same as that of the partition to be migrated.
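The three sub-steps above can be sketched as a generator that reads a fixed number of buckets per batch and reports where the next batch should start. The list-of-lists layout, the tuple shape `(key, partition, value_lba)`, and the names `scan_in_batches`, `batch_size` and `start_bucket` are assumptions made for illustration.

```python
def scan_in_batches(buckets, target_partition, batch_size, start_bucket=0):
    """Yield (next_start_bucket, matches) for each batch of buckets.

    buckets[i] holds the index entries of bucket ID i; each entry is a
    (key, partition, value_lba) tuple.
    """
    bucket_id = start_bucket
    while bucket_id < len(buckets):
        end = min(bucket_id + batch_size, len(buckets))
        # Keep only entries whose partition matches the one being migrated.
        matches = [
            entry
            for bucket in buckets[bucket_id:end]
            for entry in bucket
            if entry[1] == target_partition
        ]
        bucket_id = end  # recorded so the next batch, or a resumed scan, starts here
        yield bucket_id, matches


buckets = [
    [("key11", 19, 10)],
    [("key05", 28, 50), ("key12", 19, 40)],
    [("key22", 19, 60)],
]
batches = list(scan_in_batches(buckets, target_partition=19, batch_size=2))
resumed = list(scan_in_batches(buckets, 19, 2, start_bucket=2))
```

Because each batch returns the next starting bucket ID, an interrupted scan can be resumed from that ID rather than from the beginning of the index area.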

Step S304: sort the data storage addresses of the matched index information and send the sorted result to the third node, so that the third node reads the data to be migrated sequentially and migrates it to the destination node.

The index entries matched in step S303 are sorted by the value address in each entry. For example, suppose the matched index information comprises <key11, partition_K11=19, value_LBA_K11=10>, <key12, partition_K12=19, value_LBA_K12=40>, <key22, partition_K22=19, value_LBA_K22=60>, <key34, partition_K34=19, value_LBA_K34=30> and <key41, partition_K41=19, value_LBA_K41=20>. Sorting by value address gives the order K11, K41, K34, K12, K22. The sorted result is issued to the hard disk, which can then read in order. This avoids the full-disk scan and the unordered random reads required by the existing method, and improves overall performance.
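The worked example above can be reproduced in a few lines. The tuple layout `(key, partition, value_lba)` is an assumption chosen for illustration; the values are those given in the example.

```python
# Index entries matched for partition 19: (key, partition, value_lba).
entries = [
    ("key11", 19, 10),
    ("key12", 19, 40),
    ("key22", 19, 60),
    ("key34", 19, 30),
    ("key41", 19, 20),
]

# Sort by value address so the disk can be read in ascending order.
read_order = [key for key, _, _ in sorted(entries, key=lambda e: e[2])]
```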

The sorted result may be, but is not limited to being, organized as a tree structure, for example using a B+ tree index or a bitmap index. Fig. 5 is a schematic diagram of the data file layout of the index area provided by this embodiment. As shown in Fig. 5, the layout comprises multiple levels of non-leaf nodes; the non-leaf nodes store the values of value_LBA, and the leaf nodes store the key and value of the specific stored data.

When new index information is added, the value address in the newly generated index information is compared with the value addresses of the index information already in the index area, to determine the position of the new index information in the B+ tree. The new value address is compared with the value at the current node of the B+ tree: if the new address is larger, it is placed on a node above the current node; if it is smaller, it is placed on a node below the current node; and so on, until the new index entry has been assigned its place in the B+ tree.
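The effect of keeping new entries ordered by value address can be illustrated with a much simpler structure than a B+ tree. The sketch below deliberately swaps in a sorted Python list maintained with the standard `bisect` module; it is a stand-in for the idea of ordered insertion, not the B+ tree layout of Fig. 5, and the names `sorted_index` and `add_entry` are hypothetical.

```python
import bisect

sorted_index = []  # (value_lba, key) pairs kept in ascending address order


def add_entry(value_lba, key):
    """Insert a new entry at the position that keeps addresses ordered."""
    bisect.insort(sorted_index, (value_lba, key))


for lba, key in [(10, "key11"), (40, "key12"), (60, "key22")]:
    add_entry(lba, key)

# A newly generated entry lands between the existing addresses 10 and 40.
add_entry(30, "key34")
```

A real B+ tree gives the same ordering guarantee with logarithmic insertion cost and block-friendly node sizes, which a flat list does not.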

The data processing method provided by the embodiment of the invention is applicable to key-value distributed storage systems. When data is stored, the key and the value are stored separately, and the address offset of the data on the node's hard disk is recorded in the node's index area. When data is migrated, only the index area needs to be scanned to find the keys and value addresses of the data to be migrated, which reduces the number of hard disk I/O operations. The data to be migrated obtained in each batch is sorted to optimize the disk read order, improving overall performance.

The above is a detailed description of the data processing method provided by the embodiment of the invention. The data processing device provided by the embodiment of the invention is described in detail below.

The data processing device provided by the embodiment of the invention is applied in a key-value storage system. Fig. 6 is a schematic diagram of the data processing device provided by this embodiment. As shown in Fig. 6, the data processing device comprises: an acquiring unit 701, a computing unit 702, a storage unit 703, an indexing unit 704, a matching unit 705, a reading unit 706 and a sorting unit 707.

The data processing device has three main working states: data storage, data query and data migration. Each is described below.

During data storage, the units mainly involved are the acquiring unit 701, the computing unit 702, the storage unit 703 and the indexing unit 704.

The acquiring unit 701 is configured to obtain the data to be stored and the data identifier of the data to be stored.

Taking the storage of a microblog post as an example, the key of the data to be stored obtained by the acquiring unit 701 is represented as a string and may comprise a time, user information and a serial number, indicating when a given user published a given post. The value is the actual post content, for example "having a meal at *** today".

For data to be stored that is represented in key-value form, the acquiring unit 701 obtains the data itself (that is, the value) and its data identifier (the key).

The computing unit 702 is configured to calculate, from the data identifier obtained by the acquiring unit 701, the first partition in which the data to be stored is to reside, and to obtain the first node to which the first partition belongs.

The computing unit 702 obtains a feature value of the key, where the feature value uniquely represents the key. The feature value may be obtained by performing a hash calculation on the key and using the resulting hash value as the feature value of the key. The partition in which the data to be stored resides is then obtained from the feature value of the key.

The computing unit 702 may then perform a consistent hash calculation on the identifier of the partition in which the data to be stored resides, so as to determine the corresponding node from the partition. For example, if the computed partition identifier is 70 and the distributed cluster has 10 nodes, a modulo operation (mod 10) determines that the data to be stored should be placed on the first node.
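The key-to-partition-to-node calculation can be sketched as below. The patent does not name a hash function or a partition count, so SHA-256, `NUM_PARTITIONS = 128`, and the function names `partition_of` and `node_of` are assumptions for illustration only; the mod-10 node mapping follows the example in the text.

```python
import hashlib

NUM_PARTITIONS = 128  # assumed; the patent does not fix a partition count
NUM_NODES = 10        # matches the 10-node example in the text


def partition_of(key: str) -> int:
    """Hash the key to a feature value, then map it to a partition ID."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def node_of(partition_id: int) -> int:
    """Map a partition ID to a node index by taking it modulo the node count."""
    return partition_id % NUM_NODES


example_node = node_of(70)  # 70 mod 10 = 0, i.e. the first node
```

A plain modulo is the simplest mapping that reproduces the example; a production consistent-hashing ring would reassign fewer partitions when the node count changes.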

The data processing system of the embodiment of the invention may further comprise a deduplication unit (not shown). After the computing unit 702 has obtained the node to which the partition belongs, the deduplication unit judges whether the index area of the first node contains index information with the same data identifier as the data to be stored. If the index area contains no index information with that data identifier, the deduplication unit triggers the storage unit 703. If the index area does contain index information with the same key, the data to be stored is already stored on this storage node, and the deduplication unit does not trigger the storage unit 703.

The storage unit 703 is configured to store the data identifier of the data to be stored and the data to be stored separately on the first node, and to record the data storage address.

That is, the storage unit 703 stores the key and the value of the data to be stored separately on the first node and records the value address.

The data storage address is the address offset of the data to be stored on the hard disk of the node, that is, the value address. It denotes the offset of the value on the node's hard disk and may be expressed, for example, as an LBA address.

To facilitate subsequent operations such as query and migration, the data storage system of the embodiment of the invention further comprises an indexing unit 704. The indexing unit 704 is configured to generate index information from the data identifier of the data to be stored, the identifier of the first partition in which it resides, and the data storage address, and to add this index information to the index area of the first node.

The index area of a storage node holds index information for every piece of data stored on that node. Each index entry comprises the data identifier of the stored data, the identifier of the first partition in which the stored data resides, and the data storage address of the stored data; that is, the key of the stored data, its partition, and its value address.
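The store path described above (check for an existing key, store the value, record its address, add an index entry) can be sketched as follows. Everything here is a simplified stand-in: the class names `IndexEntry` and `StorageNode`, the `bytearray` used in place of a hard disk, and byte offsets used in place of LBA addresses are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    key: str
    partition: int
    value_lba: int  # offset of the value in the node's data area


class StorageNode:
    def __init__(self):
        self.data = bytearray()  # stand-in for the node's hard disk
        self.index = {}          # key -> IndexEntry (the index area)

    def put(self, key: str, partition: int, value: bytes) -> bool:
        """Store a value once; return False if the key is already indexed."""
        if key in self.index:    # deduplication check against the index area
            return False
        offset = len(self.data)  # address at which the value will be written
        self.data += value
        self.index[key] = IndexEntry(key, partition, offset)
        return True


node = StorageNode()
first = node.put("key11", 19, b"hello")
second = node.put("key12", 19, b"world")
duplicate = node.put("key11", 19, b"hello")
```

Keeping the partition in each entry is what later lets migration select data by scanning only the index, without touching the stored values.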

The indexing unit 704 determines the identifier of the sub-index area in which the generated index information is to be stored, either from the result of a hash calculation on the data identifier in the generated index information or from the size order of the data identifiers, and stores the generated index information in the sub-index area corresponding to that identifier.

When new index information is generated, the indexing unit 704 compares the key in the generated index information with the keys of the existing index information in the index area of the first node, determines, according to a predefined ordering, the storage position of the generated key among the existing index information, and adds the generated index information at that storage position.

Alternatively, when new index information is generated, the indexing unit 704 may perform a hash calculation on the key in the generated index information to determine the bucket ID in which the generated index information is to be stored. For example, a modulo operation (mod m) may be performed on the key to obtain the bucket ID, and the generated index information is then stored in that bucket.
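The hash-based choice of bucket can be sketched as below. The patent only says that the key is hashed and reduced mod m; the use of CRC32 from the standard `zlib` module, the value `M = 4`, and the names `bucket_id` and `buckets` are assumptions for illustration.

```python
import zlib

M = 4  # assumed number of sub-index areas (buckets)


def bucket_id(key: str, m: int = M) -> int:
    """Hash the key and reduce it modulo m to pick a bucket ID."""
    return zlib.crc32(key.encode("utf-8")) % m


buckets = {i: [] for i in range(M)}

# Place a newly generated index entry (key, partition, value_lba) in its bucket.
entry = ("key11", 19, 10)
buckets[bucket_id(entry[0])].append(entry)
```

Because the bucket depends only on the key, a later query for the same key needs to search a single bucket rather than the whole index area.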

During data query, the units mainly involved are the acquiring unit 701, the computing unit 702, the matching unit 705 and the reading unit 706.

The acquiring unit 701 is configured to obtain the data identifier of the input data to be queried. The computing unit 702 is configured to calculate, from the data identifier obtained by the acquiring unit 701, the second partition in which the data to be queried resides, and to obtain the second node to which the second partition belongs. The matching unit 705 is configured to query, in the index area of the second node, the index information matching the data identifier of the data to be queried; the index information in the index area of the second node comprises the data identifier of the stored data, the identifier of the partition in which it resides, and its data storage address. The reading unit 706 is configured to read the data to be queried from the second node according to the data storage address in the index information obtained by the matching unit.

In the index area of the second node where the data to be queried resides, the matching unit 705 matches the data identifier (key) of the data to be queried against the data identifiers of the stored data, obtains the index information matching the key, and thereby obtains the data storage address (that is, the value address) of the data to be queried. If multiple pieces of data to be queried are stored on several different second nodes, the data identifiers are matched in the index area of each of those second nodes, the matching index information is obtained from each node, and the value addresses of all the queried data are obtained.

When multiple pieces of data are queried, a sorting unit 707 may also be involved. The sorting unit 707 is configured to sort the data storage addresses that the matching unit 705 has matched for the multiple pieces of data to be queried. The reading unit 706 then reads the data to be queried sequentially from the hard disk of the corresponding node, in the order given by the sorting unit 707.

During data migration, the units mainly involved are the acquiring unit 701, the computing unit 702, the matching unit 705 and the sorting unit 707.

The acquiring unit 701 is configured to obtain the partition to be migrated when a preset partition migration condition is satisfied. The computing unit 702 is configured to obtain the third node to which the partition to be migrated belongs.

When a node is added or deleted, the system determines which partitions need to be migrated, and the acquiring unit 701 obtains the partitions to be migrated determined by the system. For example, if the partitions to be migrated are partition 19 and partition 28, the acquiring unit 701 obtains those partitions.

If a node is added or deleted and what the acquiring unit 701 obtains is instead the individual pieces of data to be migrated, the computing unit 702 performs a hash calculation on the key of each piece of data to be migrated, obtains the partition in which that data resides, and obtains the third node to which that partition belongs.

The matching unit 705 is configured to match, from the index area of the third node, the index information whose partition value equals that of the partition to be migrated.

The index information in the index area comprises the data identifier of the stored data, the partition in which it resides, and its data storage address; specifically, the key, the partition identifier and the value address. The matching unit 705 matches the index information whose partition value equals the identifier of the partition to be migrated.

In the index area of the third node to which the partition to be migrated belongs, the matching unit 705 matches the identifier of the partition to be migrated against the partition identifiers recorded for the stored data, and obtains the index information matching that identifier. If multiple partitions to be migrated are stored on several different third nodes, the identifiers are matched in the index area of each of those third nodes, and the matching index information is obtained from each node.

When the index area comprises multiple bucket IDs, the matching unit 705 scans the index area in batches: according to the number of bucket IDs permitted by the memory configuration, the index information of the corresponding bucket IDs is read into memory.

In this case, the matching unit 705 specifically comprises a batching subunit and a matching subunit (not shown). The batching subunit is configured to read, in batches, the index information of at least one sub-index area of the index area, specifically the index information of different bucket IDs. At each batch read, the batching subunit records the bucket IDs read in that batch, so as to obtain the starting bucket ID for the next batch read. The matching subunit is configured to match, from the index information read by the batching subunit in the current batch, the index information whose partition identifier is the same as that of the partition to be migrated.

The sorting unit 707 is configured to sort the data storage addresses of the index information matched by the matching unit 705 and to send the sorted result to the third node, so that the third node reads the data to be migrated sequentially and migrates it to the destination node.

For example, suppose the acquiring unit 701 obtains partition 19 as the partition to be migrated. The matching unit 705 matches all index information in the index area whose partition is partition 19, for example <key11, partition_K11=19, value_LBA_K11=10>, <key12, partition_K12=19, value_LBA_K12=40>, <key22, partition_K22=19, value_LBA_K22=60>, <key34, partition_K34=19, value_LBA_K34=30> and <key41, partition_K41=19, value_LBA_K41=20>. The sorting unit 707 sorts by value address and obtains the order K11, K41, K34, K12, K22. The sorted result is issued to the hard disk, which can then read in order. This avoids the full-disk scan and the unordered random reads required by the existing method, and improves overall performance.

Fig. 7 is a schematic diagram of a data storage system provided by the embodiment of the invention. The storage system is a distributed storage system using the key-value format. As shown in Fig. 7, the data storage system comprises a storage management node 10 and a plurality of data storage nodes 20, which communicate with one another over a bus. The storage management node 10 is a data storage node on which distributed coordination system software is installed, and is used to coordinate and manage the entire distributed storage system.

Fig. 8 is a schematic diagram of a storage management node 10 provided by the embodiment of the invention. The storage management node 10 may be a host server with computing capability, a personal computer (PC), a portable computer or terminal, or the like; the specific embodiments of the invention place no restriction on the specific implementation of the storage management node. As shown in Fig. 8, the storage management node 10 comprises a processor 101, a communication interface 102, a memory 103 and a bus 104.

The processor 101, the communication interface 102 and the memory 103 of the storage management node 10 communicate with one another over the bus 104. The communication interface 102 is used to communicate with network elements, such as the data storage nodes 20, and to receive or send data storage, data query or data migration task instructions. The processor 101 is configured to execute a program 1031. The processor 101 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiment of the invention. The memory 103 is used to store the program 1031. The memory 103 may comprise high-speed RAM and may also comprise non-volatile memory, for example at least one disk memory. The program 1031 may comprise program code, and the program code comprises computer operation instructions. As shown in Fig. 9, the program 1031 may comprise a computing unit 301.

During data storage, the communication interface 102 is used by the storage management node 10 to receive the data to be stored and its data identifier. The computing unit 301 is configured to calculate, from the data identifier obtained through the communication interface 102, the first partition in which the data to be stored is to be stored, and to obtain the first node to which the first partition belongs. The first node is one of the data storage nodes 20. According to the calculation result of the computing unit 301, the data to be stored and its data identifier obtained through the communication interface 102, together with the determined identifier of the first partition in which the data is to be stored, are sent to the corresponding data storage node 20.

During data query, the communication interface 102 of the storage management node 10 receives the data identifier of the input data to be queried. The computing unit 301 is configured to calculate, from the data identifier obtained through the communication interface 102, the second partition in which the data to be queried resides, and to obtain the second node to which the second partition belongs. The second node is one of the data storage nodes 20. According to the calculation result of the computing unit 301, the data identifier of the data to be queried obtained through the communication interface 102, together with the determined identifier of the second partition in which the data to be queried resides, is sent to the corresponding data storage node 20.

During data migration, the communication interface 102 of the storage management node 10 receives the partition to be migrated. The computing unit 301 is configured to calculate, from the identifier of the partition to be migrated obtained through the communication interface 102, the third node to which the partition to be migrated belongs. If what the communication interface 102 receives is instead the data identifier of data to be migrated, the computing unit 301 calculates from that data identifier the third partition in which the data to be migrated resides, and obtains the third node to which the third partition belongs. The third node is one of the data storage nodes 20. According to the calculation result of the computing unit 301, either the identifier of the partition to be migrated obtained through the communication interface 102, or the data identifier of the data to be migrated together with the determined identifier of the third partition in which it resides, is sent to the corresponding data storage node 20.

Fig. 10 is a schematic diagram of a data storage node provided by the embodiment of the invention. The data storage node 20 may be a host server with computing capability, a personal computer (PC), a portable computer or terminal, or the like; the specific embodiments of the invention place no restriction on the specific implementation of the data storage node. As shown in Fig. 10, the data storage node 20 comprises a processor 201, a communication interface 202, a memory 203 and a bus 204.

The processor 201, the communication interface 202 and the memory 203 of the data storage node 20 communicate with one another over the bus 204. The communication interface 202 is used to communicate with network elements, such as the storage management node 10, and to receive the information sent by the communication interface 102 of the storage management node 10. The processor 201 is configured to execute a program 2031. The processor 201 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiment of the invention. The memory 203 is used to store the program 2031. The memory 203 may comprise high-speed RAM and may also comprise non-volatile memory, for example at least one disk memory. As shown in Fig. 11, the program 2031 may comprise a storage unit 401, an indexing unit 402, a deduplication unit 403, a matching unit 404, a reading unit 405 and a sorting unit 406.

During data storage, the communication interface 202 is used to receive, from the communication interface 102 of the storage management node 10, the data to be stored, the data identifier of the data to be stored, and the determined identifier of the first partition in which the data is to be stored. The storage unit 401 is configured to store the data identifier of the data to be stored and the data to be stored separately in the memory 203, and to record the data storage address. The indexing unit 402 is configured to generate index information from the data identifier of the data to be stored obtained through the communication interface 202, the first partition identifier determined by the computing unit 301, and the data storage address recorded by the storage unit 401, and to add this index information to the index area of this storage node 20, recorded in the memory 203.

A deduplication unit 403 operates before the storage unit 401. The deduplication unit 403 is configured to judge whether the index area of this storage node 20 contains index information with the same data identifier as the data to be stored. If the index area contains no index information with that data identifier, the deduplication unit 403 triggers the storage unit 401 to store the data; if the index area does contain index information with that data identifier, the storage unit 401 is not triggered.

During data query, the communication interface 202 is used to receive, from the communication interface 102 of the storage management node 10, the data identifier of the data to be queried and the determined identifier of the second partition in which the data to be queried resides. The matching unit 404 is configured to query, in the index area of this data storage node, the index information matching the data identifier of the data to be queried; the index information in the index area comprises the data identifier of the stored data, the partition in which it resides, and its data storage address. The reading unit 405 is configured to read the data to be queried from the corresponding data storage address in the memory 203, according to the data storage address in the index information obtained by the matching unit 404.

When multiple pieces of data to be queried are processed, a sorting unit 406 is also involved. The sorting unit 406 is configured to sort the data storage addresses that the matching unit 404 has matched for the multiple pieces of data to be queried, and to send the sorted result to the reading unit 405. The reading unit 405 then reads the data to be queried sequentially from the corresponding data storage addresses, in the order given by the sorting unit 406.

During data migration, the communication interface 202 is used to receive, from the communication interface 102 of the storage management node 10, either the identifier of the partition to be migrated, or the data identifier of the data to be migrated together with the determined identifier of the third partition in which it resides. The matching unit 404 is configured to match, in the index area of this data storage node, the identifier of the partition to be migrated against the partition identifiers recorded for the stored data, and to obtain the index information matching that identifier. The sorting unit 406 is configured to sort the data storage addresses of the index information matched by the matching unit 404 and to send the sorted result to the memory 203, so that the data to be migrated is read sequentially and migrated to the destination node. If multiple partitions to be migrated are stored on several different third nodes, the identifiers are matched in the index area of each of those third nodes, and the matching index information is obtained from each node.

In the data processing method and device provided by the embodiment of the invention, at least the information <key, partition, value_LBA> is added to the index area of the data, and the key and the value are stored separately. Sorting by value address (value_LBA) turns what was originally random access to the hard disk into sequential access, with no need for a full-disk scan, which improves overall performance. In addition, by scanning the index area in batches, all <key, value_LBA> entries belonging to a partition to be migrated can be found quickly. This facilitates concurrent and batch operation, supports resumable transfer, improves the utilization of inter-node bandwidth and improves data read performance.

Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their function. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the invention.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the invention. It should be understood that the above are merely specific embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims

1. a data processing method is characterized in that, is applied to key-value key-value storage system, and described method comprises:

2. data processing method according to claim 1, it is characterized in that, described index area comprises at least one subindex district, according to the described Data Identification in the index information of described generation being carried out the result of Hash calculation or determining the subindex district that the index information of described generation will be deposited according to the size order of Data Identification, the index information of described generation is deposited in described definite described subindex district.

3. data processing method according to claim 1 and 2, it is characterized in that, with the Data Identification of described data to be stored, the first subregion sign and the address data memory generating indexes information at place, and this index information is added in the index area of described first node, comprising:

4. data processing method according to claim 1 and 2 is characterized in that, after the first node under obtaining described the first subregion, also comprises:

5. data processing method according to claim 1 and 2 is characterized in that, also comprises:

obtaining the data identifier of input data to be queried;

6. The data processing method according to claim 5, characterized in that, before the data to be queried is read from the second node, the method further comprises:

7. data processing method according to claim 1 and 2 is characterized in that, described method also comprises:

obtaining a third node to which the partition to be migrated belongs;

8. A data processing device, characterized in that it is applied to a key-value storage system, the device comprising:

9. The data processing device according to claim 8, characterized in that the index area comprises at least one sub-index area; the sub-index area in which the generated index information is to be stored is determined either according to the result of a hash calculation on the data identifier in the generated index information or according to the sort order of the data identifiers; and the generated index information is stored in the determined sub-index area.

10. The data processing device according to claim 8 or 9, characterized in that the indexing unit compares the data identifier in the generated index information with the data identifiers of the existing index information in the index area of the first node, determines, according to a preset sort order, the storage position of the data identifier in the generated index information among the existing index information, and adds the generated index information at that storage position.

11. The data processing device according to claim 8 or 9, characterized in that the device further comprises:

12. The data processing device according to claim 8 or 9, characterized in that the acquiring unit is further configured to obtain the data identifier of input data to be queried;

the device further comprises:

13. The data processing device according to claim 12, characterized in that the device further comprises:

14. The data processing device according to claim 8 or 9, characterized in that the acquiring unit is further configured to obtain a partition to be migrated when a preset partition migration condition is met;

the device further comprises: