WO2014180178A1

WO2014180178A1 - Method, apparatus and system for establishing secondary index in distributed storage system

Info

Publication number: WO2014180178A1
Application number: PCT/CN2014/072044
Authority: WO
Inventors: Anoop Sam JOHN; Ramkrishna S VASUDEVAN; Jieshan BI; Priyank Ashok RASTOGI; Rajeshbabu CHINTAGUNTLA
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2013-05-07
Filing date: 2014-02-13
Publication date: 2014-11-13
Also published as: JP6127206B2; JP2016522928A

Abstract

The present invention relates to a method for establishing a secondary index in a distributed storage system, including: receiving user data sent by a user client, the user data containing a key part and a value part; determining a user table according to the user data; writing the user data into the user table; generating index data according to the user data; and writing the index data into an index table corresponding to the user table, wherein the user table and the index table are established in one node of the distributed storage system and are bonded in one-to-one manner. The advantage of the present invention is that, by binding user child tables to index child tables, the two remote write operations across nodes are reduced to one, the number of network requests is greatly reduced, and the performance is improved accordingly.

Description

METHOD, APPARATUS AND SYSTEM FOR ESTABLISHING SECONDARY INDEX IN DISTRIBUTED STORAGE SYSTEM

TECHNICAL FIELD

This application relates to distributed storage systems and, in particular, to a method, apparatus and system for establishing a secondary index in a distributed storage system.

BACKGROUND ART

Most distributed storage systems adopt a key-value storage model: the actual user data to be stored is placed in a Value part, and a Key is constructed for looking up the corresponding Value.

For example, assume that the distributed storage system is used to store data in an online trading system, the Key and the Value can be designed as below.

Key: user code + trading time;

Value: detailed information for trading.

When the user data is stored into the distributed storage system, it is naturally ordered in dictionary (lexicographic) order of the Key, so that data with the same user code are stored adjacently. To obtain all the records with the user code "xxxx" and a trading time within a certain period, the index scheme of the distributed storage system can be used to find the desired data quickly.
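The effect of dictionary ordering can be sketched as follows. This is a minimal illustration only: the record contents, the `|` separator, the timestamp format and the `scan` helper are assumptions made for the example and do not come from the source.

```python
import bisect

# Hypothetical records: key = user code + trading time, value = trade details.
records = {
    "user0001|20120701-1000": "bought item A",
    "user0001|20120815-0930": "bought item B",
    "user0002|20120702-1200": "bought item C",
    "user0001|20120920-1745": "bought item D",
}

# The store keeps keys in dictionary (lexicographic) order,
# so all keys of one user code are adjacent.
sorted_keys = sorted(records)


def scan(start_key, stop_key):
    """Return (key, value) pairs with start_key <= key < stop_key."""
    lo = bisect.bisect_left(sorted_keys, start_key)
    hi = bisect.bisect_left(sorted_keys, stop_key)
    return [(k, records[k]) for k in sorted_keys[lo:hi]]


# All trades of user0001 from 2012-07-01 up to (not including) 2012-09-01.
result = scan("user0001|20120701", "user0001|20120901")
```

Because the keys are sorted, the query touches only the contiguous slice between the two bounds instead of every record.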

In the distributed storage system, the concept of table is provided. If a user wants to write data into the distributed storage system, he/she should build a data table first according to his/her own requirement. But vast amounts of data may be contained in one data table. To implement the distributed storage in the distributed storage system, the mainstream method is to generate a plurality of child tables by transversely cutting a data table, and to manage and maintain the plurality of child tables. The child table can be defined as follows:

1, a child table is a cluster of a number of consecutive rows, and each child table has a key value range;

2, a database table usually consists of one or more child table(s);

3, the child table is the smallest unit of distributed storage and load balancing;

4, when its data increases to a certain extent, the child table will automatically split into two child tables.

Figure 1 is a schematic diagram of a data table. As shown in figure 1, the data table consists of M child tables, and each child table is responsible for one key value range. The key of each key-value pair generated by a user can fall within the key value range of only one child table, because the key value ranges of different child tables do not intersect.
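Since the key value ranges are sorted and disjoint, locating the child table for a key reduces to a binary search over the start keys. The following is a sketch under assumptions: the half-open ranges, the single-letter boundaries and the function name `locate_child_table` are illustrative and not taken from the source.

```python
import bisect

# Hypothetical child tables, each owning the half-open range [start, end).
# Ranges are sorted and do not overlap, so start keys alone identify them.
child_tables = [("a", "b"), ("b", "c"), ("c", "d")]
start_keys = [start for start, _ in child_tables]


def locate_child_table(key):
    """Return the index of the single child table whose range contains key."""
    i = bisect.bisect_right(start_keys, key) - 1
    if i < 0 or key >= child_tables[i][1]:
        raise KeyError(f"no child table covers key {key!r}")
    return i
```

For instance, `locate_child_table("b")` returns the index of the second child table, because a start key is inclusive and an end key is exclusive in this sketch.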

Each child table needs to be allocated to, and can only be allocated to, exactly one region server, which manages it. The region server is a node of the cluster of the distributed storage system; it is usually a physical server.

Figure 2 is a schematic diagram of a cluster of a distributed storage system. As shown in figure 2, the cluster includes three region servers: three child tables (child table 1, child table 4, and child table 9) are stored in a first region server, three child tables (child table 2, child table 6, and child table 7) are stored in a second region server, and three child tables (child table 3, child table 5, and child table 8) are stored in a third region server.

In the distributed storage system, all user data are stored in the underlying file system in key-value form. To read user data, one method is to specify an exact key and perform a precise lookup; another is to specify a leading part (prefix) of the key and perform a fuzzy data scan. However, both methods are based on the key. When a user wants to search user data by conditions on the value, the whole data table has to be searched to find the records, and the performance is very poor.

To solve this problem, a secondary index scheme is provided. Figure 3 is a schematic diagram of a traditional secondary index scheme. As shown in figure 3, the scheme is built as a package around the client API (Application Programming Interface) of the distributed storage system. A separate index table is used to store the index values. When a user writes data, the packaged client API generates two types of data: user data, which is written into the user table, and index data, which is written into an index table.

In implementing the present invention, the inventors found that the traditional secondary index scheme performs poorly because it is packaged in the client API. Although some improved schemes exist, writing one piece of data still requires two remote write operations between the client and the remote distributed storage system, one to the user table and one to the index table, so these schemes cannot fundamentally improve write performance.

SUMMARY OF THE INVENTION

In view of the problems pointed out in the Background Art, the present invention is proposed.

The main object of the embodiments of the present invention is to provide a method, apparatus and system for establishing a secondary index in a distributed storage system, so as to reduce the number of network requests.

According to an aspect of the embodiments of the present invention, there is provided a method for establishing secondary index in a distributed storage system, wherein, the method comprises,

receiving user data sent by a user client, the user data containing key part and value part;

determining a user table according to the user data;

writing the user data into the user table;

generating index data according to the user data;

writing the index data into an index table corresponding to the user table, wherein the user table and the index table are established in one node of the distributed storage system and are bonded in one-to-one manner.

According to another aspect of the embodiments of the present invention, there is provided an apparatus for establishing secondary index in a distributed storage system, wherein, the apparatus comprises,

a receiving unit configured to receive user data sent by a user client, the user data containing key part and value part;

a determining unit configured to determine a user table according to the user data;

a first writing unit configured to write the user data into the user table;

a generating unit configured to generate index data according to the user data;

a second writing unit configured to write the index data into an index table corresponding to the user table, wherein the user table and the index table are established in one node of a distributed storage system and are bonded in one-to-one manner.

According to still another aspect of the embodiments of the present invention, there is provided a distributed storage system, wherein, the system comprises,

at least one region server, each region server storing at least one user table and at least one index table corresponding to the at least one user table, wherein the key range of each user table is identical to that of the index table corresponding to the user table; and

the secondary index establishing apparatus as described above.

The advantage of the embodiments of the present invention is that, by binding user child tables to index child tables, the two remote write operations across nodes are reduced to one, the number of network requests is greatly reduced, and the performance is improved accordingly.

Particular embodiments of the present invention will be described in detail below with reference to the following description and attached drawings and the manners of using the principle of the present invention are pointed out. It should be understood that the implementation of the present invention is not limited thereto in scope. Rather, the invention includes all changes, modifications and equivalents coming within the spirit and terms of the appended claims.

Features which are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.

It should be emphasized that the term "comprises/comprising" when used in this specification is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS

The drawings are included to provide further understanding of the present invention, which constitute a part of the specification and illustrate the preferred embodiments of the present invention, and are used for setting forth the principles of the present invention together with the description. The same element is represented with the same reference number throughout the drawings.

Figure 1 is a schematic diagram of a data table;

Figure 2 is a schematic diagram of a cluster of a distributed storage system;

Figure 3 is a schematic diagram of a traditional secondary index scheme;

Figure 4 is a schematic diagram of a distributed storage system according to an embodiment of the present invention;

Figure 5 is a flowchart of the method for establishing secondary index according to an embodiment of the present invention;

Figure 6a is a schematic diagram of an embodiment of a user child table;

Figure 6b is a schematic diagram of an embodiment of the index data table;

Figure 6c is a schematic diagram of another embodiment of the index data table;

Figure 7 is a schematic diagram of a user child table and its corresponding index child table stored in one region server;

Figure 8 is a schematic diagram of a secondary index establishing apparatus;

Figure 9 is a schematic diagram of a distributed storage system.

DESCRIPTION OF THE INVENTION

The foregoing and other features of the embodiments of the present invention will be apparent from the following description with reference to the drawings. These embodiments are merely illustrative and are not intended to limit the present invention. To help those skilled in the art understand the principle and the embodiments of the present invention, the embodiments are described taking an online trading system as an example; however, it should be understood that the embodiments of the present invention are not limited to such a system.

Figure 4 is a schematic diagram of a distributed storage system according to an embodiment of the present invention. As shown in figure 4, the distributed storage system includes two region servers, i.e. region server 41 and region server 42. Each region server includes a user table and an index table. The user table and the index table established in one region server are bonded in one-to-one manner, i.e. the key range of the user table and that of the index table are the same. Each user table may be divided into multiple user child tables, and each index table may be divided into multiple index child tables. Each index child table corresponds to one user child table, and the number of user child tables in one region server is identical to the number of index child tables in the same region server. The user child tables and the index child tables established in one region server are bonded in one-to-one manner, i.e. the key range of each user child table is identical to that of its corresponding index child table. In the embodiment of the present invention, a user table 411 and an index table 412 are established in the region server 41, and a user table 421 and an index table 422 are established in the region server 42.

The user table 411 includes a user child table 4111 and a user child table 4112, and the index table 412 includes an index child table 4121 and an index child table 4122. In the embodiment of the present invention, the user child table 4111 and the index child table 4121 are bonded in one-to-one manner, i.e. the key range of the user child table 4111 and that of the index child table 4121 are the same. Similarly, the user child table 4112 and the index child table 4122 are bonded in one-to-one manner.

On the other hand, the user table 421 includes a user child table 4211 and a user child table 4212, and the index table 422 includes an index child table 4221 and an index child table 4222. Similarly, in the embodiment of the present invention, the user child table 4211 and the index child table 4221 are bonded in one-to-one manner, and the user child table 4212 and the index child table 4222 are bonded in one-to-one manner.

In the embodiment of the present invention, the user table and the index table are established in the same node (region server) and bonded in one-to-one manner. When a user writes user data, the user side only needs to send the data related to the user table and does not need to send the data related to the index table. That is, only one remote operation, writing the user table, is needed, and the separate remote operation of writing the index table is avoided: the index table data can be written by an in-process call. The performance of the data writing stage is thereby enhanced, which makes the scheme especially suitable for scenarios with high data writing performance requirements.
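The difference in network requests between the client-packaged scheme and the co-located scheme can be illustrated with a toy model. Everything here is hypothetical: the `Store` class, its method names, the request counter and the `make_index_key` layout are assumptions chosen for the sketch, not an API described in the source.

```python
class Store:
    """Toy stand-in for the remote storage system; counts network requests."""

    def __init__(self):
        self.user_table = {}
        self.index_table = {}
        self.remote_requests = 0

    # Operations reachable from the client; each costs one network request.
    def put_user(self, key, value):
        self.remote_requests += 1
        self.user_table[key] = value

    def put_index(self, index_key):
        self.remote_requests += 1
        self.index_table[index_key] = None

    def put_with_index(self, key, value):
        self.remote_requests += 1  # the only network request
        self.user_table[key] = value
        self._write_index_locally(make_index_key(key, value))

    def _write_index_locally(self, index_key):
        self.index_table[index_key] = None  # plain in-process call


def make_index_key(key, value):
    # Assumed layout for this sketch: indexed value followed by the user key.
    return f"{value} {key}"


def client_side_write(store, key, value):
    """Traditional scheme: the packaged client API issues two remote writes."""
    store.put_user(key, value)
    store.put_index(make_index_key(key, value))


def server_side_write(store, key, value):
    """Co-located scheme: one remote write; the index is written in-process."""
    store.put_with_index(key, value)
```

Both paths leave the same rows in the two tables; only the number of requests crossing the network differs (two versus one per written record in this model).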

The preferred embodiments of the present invention are described as follows in reference to the drawings.

Embodiment 1

An embodiment of the present invention provides a method for establishing secondary index in a distributed storage system. Figure 5 is a flowchart of the method; referring to Fig. 5, the method comprises:

step 501: receiving user data sent by a user client, the user data containing key part and value part;

step 502: determining a user table according to the user data;

step 503: writing the user data into the user table;

step 504: generating index data according to the user data;

step 505: writing the index data into an index table corresponding to the user table, wherein the user table and the index table are established in one node of the distributed storage system and are bonded in one-to-one manner.
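Steps 501 to 505 can be sketched as a single server-side handler. This is a minimal sketch under stated assumptions: the `BoundTables` dataclass, the `handle_put` function, the use of the table's start key as the range key of the index data, and the tuple form of the index key are all illustrative choices, not the implementation described by the source.

```python
from dataclasses import dataclass, field


@dataclass
class BoundTables:
    """A user table and its index table sharing one key range on one node."""
    start_key: str
    end_key: str
    user_table: dict = field(default_factory=dict)
    index_table: dict = field(default_factory=dict)


def handle_put(tables, key, value):
    # Step 501: receive the user data (key part and value part).
    # Step 502: determine the user table whose key range contains the key.
    # (Raises StopIteration if no table covers the key; fine for a sketch.)
    target = next(t for t in tables if t.start_key <= key < t.end_key)
    # Step 503: write the user data into the user table.
    target.user_table[key] = value
    # Step 504: generate index data: range key + value part + key part.
    index_key = (target.start_key, value, key)
    # Step 505: write the index data into the bound index table (local call).
    target.index_table[index_key] = None
    return target
```

The client supplies only `key` and `value`; the index row is derived and stored on the node that already holds the user row.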

In step 501, each table has a key range from a start key to an end key, and the range key of the user data usually lies within the scope from the start key to the end key of the table. In general, in the embodiment of the present invention, the user data contains the key part and the value part, and the key part of the user data may be divided into two parts: a range key and a specific key. The range key corresponds to any key from the start key to the end key and indicates to which user table the user data belongs; the specific key identifies the user data uniquely. Thus, according to the range key of the key part of the user data, the distributed storage system can determine which user table is to store the user data.

In step 503, after determining the user table, the distributed storage system writes the user data; the method of writing the user data is the same as in the prior art and is not described further.

In step 504, the distributed storage system generates the index data from the user data by in-process calling. A further request from the user client to the distributed storage system for the index data is thus avoided, so generating the index data inside the distributed storage system saves network resources and is more efficient than the conventional user client API packaging scheme.

The generated index data comprises a key part, which in turn comprises two parts: a range key and a specific key. The range key of the index data indicates to which index table the index data belongs, and the specific key of the index data identifies the index data uniquely. In the embodiment of the present invention, since the key range of the user table and that of the index table are the same, the range key of the index data may correspond to the range key of the user data, or may adopt the start key of the user table, or may adopt the end key of the user table.

In a concrete embodiment of step 504, the distributed storage system extracts the range key of the key part of the user data as the range key of the key part of the index data. The details will be described below.

In step 505, since the user table and the index table have a one-to-one relationship, the distributed storage system can write the index data into the index table corresponding to the user table.

Each user table may be divided into multiple user child tables, and each index table may be divided into multiple index child tables. Accordingly, in step 502 the distributed storage system further determines a user child table according to the user data, in step 503 it writes the user data into the user child table, and in step 505 it writes the index data into the index child table corresponding to the user child table.

Figure 6a is a schematic diagram of an embodiment of a user child table. In the embodiment, one piece of user data is stored in the user child table. Like the user data, the user child table includes a key part and a value part: the key part of the user child table stores the key part of the user data, and the value part of the user child table stores the value part of the user data. As shown in figure 6a, the key part of the user child table has one column named "key", and the value part of the user child table has one column named "seller code". It should be understood that the embodiment is not limited thereto; in other embodiments, the value part of the user child table can include more than one column.

As shown in figure 6a, the key part of the user data is "buyera120120701 1000" and the value part of the user data is "sellerc1". The key part of the user data comprises two parts, a range key and a specific key; in the embodiment shown in figure 6a, the range key is "buyera" and the specific key is "20120701 1000".

Figure 6b is a schematic diagram of an embodiment of an index child table. In the embodiment, the index child table corresponds to the user child table shown in figure 6a, which means the key range of the index child table is identical to that of the user child table. As shown in figure 6b, the index data is generated according to the user data stored in the user child table of figure 6a. The index data also includes a key part and a value part; the value part is null, and the key part comprises a range key and a specific key. The range key of the index data is "buyera", which is the start key of the user table in figure 6a, or the range key of the user data in figure 6a. The specific key of the index data is "sellerc1 buyera120120701 1000", where "sellerc1" is the value of an index column in the value part of the user data and "buyera120120701 1000" is the key part of the user data. As shown in figure 6b, the key part of the index data in this embodiment is "buyera sellerc1 buyera120120701 1000".

In another embodiment, the column name of an index column of the user child table can also be included in the index data. As shown in figure 6c, the difference from the embodiment of figure 6b is that the column name is included in the index data, located between "buyera" and "sellerc1"; the key part of the index data in this embodiment is "buyera sellercode sellerc1 buyera120120701 1000".
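The two index key layouts, without and with the column name, can be sketched as one helper. The function name `build_index_key`, the space separator and the dash-separated user key are assumptions for illustration; the source does not specify an encoding or a separator.

```python
SEP = " "  # assumed separator; the source does not specify one


def build_index_key(range_key, user_key, index_value, column_name=None):
    """Compose the key part of an index record.

    Layout without column name: range key, index column value, user key.
    Layout with column name:    range key, column name, value, user key.
    """
    parts = [range_key]
    if column_name is not None:
        parts.append(column_name)
    parts += [index_value, user_key]
    return SEP.join(parts)


# Hypothetical user record; the key format is illustrative only.
user_key = "buyera1-20120701-1000"
plain = build_index_key("buyera", user_key, "sellerc1")
named = build_index_key("buyera", user_key, "sellerc1", column_name="sellercode")
```

A real encoding would need a separator, or length prefixes, that cannot occur inside the components; a plain space is used here only to keep the sketch readable.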

In one embodiment of the present invention, the range key of the user data (or the start key of the user child table or the end key of the user child table), the value part of the user data, and the key part of the user data are sequentially arranged in the index data. In another embodiment of the present invention, the value part of the user data includes the value of the index column.

In the embodiments shown in figures 6b and 6c, the value part of the index child table is empty, i.e. the index data includes only a key part. It should be understood, however, that the embodiment is not limited thereto; in other embodiments, the value part of the index data can be filled with other useful information according to actual requirements.

In one embodiment of the present invention, as shown in figure 4, the user child table and the index child table corresponding to the user child table are stored on a same region server.

For the easy understanding of the embodiment of the present invention, the detailed description with reference to the drawings will be given as follows.

Assume that there are three user child tables in a user table, and the key range of each user child table is,

user child table 1: [a, b);

user child table 2: [b, c);

user child table 3: [c, d).

Accordingly, there are three index child tables in an index table, and the key range of each index child table is the same as that of the user child table to which the index child table corresponds.

index child table 1: [a, b);

index child table 2: [b, c);

index child table 3: [c, d).

That is, user child table 1 corresponds to index child table 1, user child table 2 corresponds to index child table 2, user child table 3 corresponds to index child table 3.

As described above, since all data are stored in the distributed storage system in dictionary order, if a user writes user data with the key "a0001", the user data should be written into user child table 1, because its key falls within the key range [a, b) of user child table 1. According to the embodiment of the present invention, as described in steps 501 to 503, the user data is written into user child table 1.

In the embodiment, as described in step 504 - 505, a corresponding index data shall be generated and be stored into the index child table 1. That is, the corresponding index data belongs to the key range of [a, b) of the index child table 1 too.

According to the embodiment of the present invention, the range key "a" of the user data is put at the start of the key part of the index data, to ensure that the generated index data belongs to the corresponding index child table.

In the embodiment, the structure of the index data is defined as: range key of corresponding user data + value part of user data (value of index column, or name and value of index column) + key part of user data. This is only an example, and the present invention is not limited thereto. For example, apart from the range key of the corresponding user data, which comes first, the name of the index column, the value of the index column, and the key part of the user data can be arranged in any order.
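Why the range key must lead the index key can be checked with a small example. This is an illustrative sketch: the index column value "zeta", the plain string concatenation and the helper `in_range` are assumptions made for the example.

```python
def in_range(key, start, end):
    """True if key lies in the half-open key range [start, end)."""
    return start <= key < end


user_key = "a0001"
index_value = "zeta"   # hypothetical value of the index column
start, end = "a", "b"  # key range shared by the bound child tables

# Without the range key in front, the index key sorts by the indexed value
# and may fall outside the range of the bound index child table.
unprefixed = index_value + user_key

# With the range key in front, the index key always begins with a key
# that lies inside [start, end), so it stays in the bound child table.
prefixed = start + index_value + user_key
```

Here `unprefixed` is "zetaa0001", which sorts after "b" and would belong to a different child table, while `prefixed` is "azetaa0001", which stays within [a, b).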

Figure 7 is a schematic diagram of a user child table and its corresponding index child table stored in one region server. As shown in figure 7, the key range of user child table 1 is [buyera, buyerb), and accordingly the key range of the corresponding index child table 1 is also [buyera, buyerb). According to the embodiment of the present invention, three records of user data are written into user child table 1, and accordingly three records of index data are generated and written into index child table 1. For the third record, the key part of the user data is "buyera120120810 1100" and the value part of the user data is "sellerc3"; accordingly, the key part of the index data is "buyera sellercode sellerc3 buyera120120810 1100". The range key of the user data is thus written into the key part of the index data. When a user wants to get all the trading records of seller code "sellerc3", the matching index records are stored adjacently in index child table 1, so they are easy to read. By decoding the index data, the user obtains the key part of the user data, "buyera120120810 1100", and can then get the desired data by searching the user child table with that key.
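The read path described above, scanning the index child table by seller code, decoding the user key, and then fetching from the user child table, can be sketched as follows. The three records, the `|` and space separators, and the function `find_by_seller` are hypothetical; only the overall flow follows the text.

```python
import bisect

# Hypothetical user child table with key range [buyera, buyerb).
user_child_table = {
    "buyera1|20120701|1000": "sellerc1",
    "buyera1|20120705|0900": "sellerc2",
    "buyera1|20120810|1100": "sellerc3",
}

RANGE_KEY = "buyera"   # start key of the child table, used as range key
COLUMN = "sellercode"  # name of the indexed column

# Index child table: sorted keys of the form
# "<range key> <column name> <column value> <user key>", with empty values.
index_keys = sorted(
    " ".join([RANGE_KEY, COLUMN, value, key])
    for key, value in user_child_table.items()
)


def find_by_seller(seller_code):
    """Prefix-scan the index, decode each user key, fetch the user row."""
    prefix = f"{RANGE_KEY} {COLUMN} {seller_code} "
    lo = bisect.bisect_left(index_keys, prefix)
    hits = []
    for index_key in index_keys[lo:]:
        if not index_key.startswith(prefix):
            break  # matching index records are adjacent, so stop here
        user_key = index_key[len(prefix):]  # decode the user key
        hits.append((user_key, user_child_table[user_key]))
    return hits
```

Because the index child table and the user child table sit on the same region server, both the prefix scan and the follow-up lookups are local in this model.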

According to the embodiment of the present invention, by binding user child tables to index child tables, the two remote write operations across nodes are reduced to one, and data reading no longer requires cross-node operations. The number of network requests is greatly reduced, and the performance is improved accordingly.

The embodiment of the present invention further provides a secondary index establishing apparatus, as described in the following embodiment 2. Since the principle by which the apparatus solves the problem is the same as that of the method for establishing a secondary index in embodiment 1, its specific implementation can refer to the implementation of the method of embodiment 1, and the similarities will not be described any further.

Embodiment 2

An embodiment of the present invention further provides a secondary index establishing apparatus. Figure 8 is a schematic diagram of the apparatus. Referring to figure 8, the apparatus comprises:

a receiving unit 81 configured to receive user data sent by a user client, the user data containing key part and value part;

a determining unit 82 configured to determine a user table according to the user data;

a first writing unit 83 configured to write the user data into the user table;

a generating unit 84 configured to generate index data according to the user data;

a second writing unit 85 configured to write the index data into an index table corresponding to the user table, wherein the user table and the index table are established in one node of a distributed storage system and are bonded in one-to-one manner.

In the embodiment, a key range of the user table and that of the index table are the same.

In the embodiment, the user table comprises multiple user child tables, and the index table comprises multiple index child tables; the number of the multiple user child tables is identical with the number of the multiple index child tables, a key range of each user child table is identical to that of its corresponding index child table.

In the embodiment, the first writing unit 83 includes a first determining module 831 and a first writing module 832, where the first determining module 831 is configured to determine a user child table according to the user data, and the first writing module 832 is configured to write the user data into the user child table.

In the embodiment, the second writing unit 85 includes a second writing module 851, which is configured to write the index data into an index child table corresponding to the user child table.

In the embodiment, the generating unit 84 includes a second determining module 841 and a generating module 842. The second determining module 841 is configured to determine a range key of the index data and a specific key of the index data, wherein the range key of the index data comprises a range key of the user data or a start key of the user child table or an end key of the user child table, and the specific key of the index data comprises the value part of the user data and the key part of the user data. The generating module 842 is configured to generate index data according to the range key of the index data and the specific key of the index data. The second determining module 841 can determine the range key of the index data by extracting the range key of the user data or the start key of the user child table or the end key of the user child table. The value part of the user data adopted in the index data is related to a value of the index column, and the index data further comprises a name of the index column in the user child table.

In the embodiment, the name of the index column, the value of the index column and the key part of the user data are sequentially arranged in the index data.

Embodiment 3

An embodiment of the present invention further provides a distributed storage system. Figure 9 is a schematic diagram of the system, referring to the figure 9, the system includes at least one region server 91 and a secondary index establishing apparatus 92, where,

each of the region servers stores at least one user table and at least one index table corresponding to the at least one user table, wherein the key range of each user table is identical to that of the index table corresponding to the user table;

the secondary index establishing apparatus 92 can be realized by the apparatus as described in embodiment 2, and the contents in embodiment 2 are incorporated herein, which shall not be described any further.

Where, the storage form of the user table and its corresponding index table stored in the same region server 91 is the same as described in figure 4, and the contents are incorporated herein, which shall not be described any further.

In the distributed storage system, child tables are commonly migrated, for load balancing during operation or because a region server has failed. When a child table is migrated to another region server, the user child table and its index child table are migrated together.
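Migrating a bound pair as one unit can be sketched with a toy cluster model. The `Cluster` class, its method names and the representation of a key range as a tuple are assumptions for illustration; the source does not describe how migration is implemented.

```python
class Cluster:
    """Toy model: each server maps a key range to a bound pair of child tables."""

    def __init__(self, server_names):
        # server name -> {key range -> (user child table, index child table)}
        self.servers = {name: {} for name in server_names}

    def add_pair(self, server, key_range, user_child, index_child):
        self.servers[server][key_range] = (user_child, index_child)

    def migrate(self, key_range, src, dst):
        """Move a user child table and its index child table together."""
        self.servers[dst][key_range] = self.servers[src].pop(key_range)
```

Storing the pair under a single entry means there is no intermediate state in this model where the user child table and its index child table live on different servers.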

The preferred embodiments of the present invention are described above with reference to the figures. The many features and advantages of the embodiments are apparent from the detailed specification and, thus, it is intended by the appended claims to cover all such features and advantages of the embodiments that fall within the true spirit and scope thereof. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the inventive embodiments to the exact construction and operation illustrated and described, and accordingly all suitable modifications and equivalents may be resorted to, falling within the scope thereof.

It should be understood that each of the parts of the present invention may be implemented by hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be realized by software or firmware that is stored in the memory and executed by an appropriate instruction executing system. For example, if it is realized by hardware, it may be realized by any one of the following technologies known in the art or a combination thereof as in another embodiment: a discrete logic circuit having a logic gate circuit for realizing logic functions of data signals, application-specific integrated circuit having an appropriate combined logic gate circuit, a programmable gate array (PGA), and a field programmable gate array (FPGA), etc.

The description or blocks in the flowcharts, or any process or method described in other manners, may be understood as representing one or more modules, segments or parts of code comprising executable instructions for realizing the steps of specific logic functions or processes. The scope of the preferred embodiments of the present invention comprises other implementations, wherein the functions may be executed in manners different from those shown or discussed, including executing the functions in a substantially simultaneous manner or in a reverse order according to the functions involved, which should be understood by those skilled in the art to which the present invention pertains.

The logic and/or steps shown in the flowcharts or described in other manners here may be, for example, understood as a sequencing list of executable instructions for realizing logic functions, which may be implemented in any computer readable medium, for use by an instruction executing system, device or apparatus (such as a system including a computer, a system including a processor, or other systems capable of extracting instructions from an instruction executing system, device or apparatus and executing the instructions), or for use in combination with the instruction executing system, device or apparatus. As used herein, "a computer readable medium" can be any device that can contain, store, communicate with, propagate or transmit programs for use by an instruction executing system, device or apparatus, or can be used with the instruction executing system, device or apparatus. A computer readable medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, device, apparatus, or a propagation medium. More particular examples (inexhaustive lists) of a computer readable medium may comprise the following: an electrical connecting portion (electronic device) having one or more wirings, a portable computer hardware box (magnetic device), a random access memory (RAM) (electronic device), a read-only memory (ROM) (electronic device), an erasable programmable read-only memory (EPROM or flash memory) (electronic device), an optical fiber (optical device), and a portable compact disk read-only memory (CDROM) (optical device).

Furthermore, a computer readable medium may be paper or other appropriate media on which the programs may be printed, as the programs may be obtained electronically by optically scanning the paper or other appropriate media and then compiling, interpreting, or otherwise processing them in an appropriate manner, as necessary, after which the programs are stored in the computer memory.

The above written description and drawings show various features of the present invention. It should be understood that a person of ordinary skill in the art may prepare suitable computer code to carry out each of the steps and processes described above and illustrated in the drawings. It should also be understood that the above-described terminals, computers, servers, networks, etc. may be of any type, and that the computer code may be prepared according to the disclosure contained herein to carry out the present invention by using the devices.

Particular embodiments of the present invention have been disclosed herein. Those skilled in the art will readily recognize that the present invention is applicable in other environments. In practice, there exist many embodiments and implementations. The appended claims are by no means intended to limit the scope of the present invention to the above particular embodiments. Furthermore, any reference to "a device to... " is an explanation of device plus function for describing elements and claims, and it is not desired that any element using no reference to "a device to... " is understood as an element of device plus function, even if the wording of "device" is included in that claim.

Although a particular preferred embodiment or embodiments have been shown and the present invention has been described, it is obvious that equivalent modifications and variants are conceivable to those skilled in the art upon reading and understanding the description and drawings. Especially for various functions executed by the above elements (portions, assemblies, apparatus, and compositions, etc.), unless otherwise specified, it is desirable that the terms (including the reference to "device") describing these elements correspond to any element executing particular functions of these elements (i.e. functional equivalents), even though the element is different in structure from that executing the function of an exemplary embodiment or embodiments illustrated in the present invention. Furthermore, although a particular feature of the present invention is described with respect to only one or more of the illustrated embodiments, such a feature may be combined with one or more other features of other embodiments as desired and in consideration of advantageous aspects of any given or particular application.

Claims

1. A method for establishing secondary index in a distributed storage system, comprising,

determining a user table according to the user data;

writing the user data into the user table;

generating index data according to the user data;

2. The method as claimed in claim 1, wherein, a key range of the user table and that of the index table are the same.

3. The method as claimed in claim 2, wherein, the user table comprises multiple user child tables, and the index table comprises multiple index child tables; the number of the multiple user child tables is identical to the number of the multiple index child tables, a key range of each user child table is identical to that of its corresponding index child table.

4. The method as claimed in claim 3, wherein,

the step of writing the user data into the user table comprises,

determining a user child table according to the user data; and

writing the user data into the user child table;

the step of writing the index data into an index table comprises,

writing the index data into an index child table corresponding to the user child table.

5. The method as claimed in claim 4, wherein, the step of generating index data according to the user data comprises: determining a range key of the index data and a specific key of the index data, wherein the range key of the index data comprises a range key of the user data or a start key of the user child table or an end key of the user child table, the specific key of the index data comprises the value part of the user data and the key part of the user data;

generating the index data according to the range key of the index data and the specific key of the index data.

6. The method as claimed in claim 5, wherein, the step of determining a range key of the index data comprises:

extracting the range key of the user data or the start key of the user child table or the end key of the user child table.

7. The method as claimed in claim 5, wherein, the value part of the user data adopted in the index data is related to a value of an index column.

8. The method as claimed in claim 7, wherein, the index data further comprises a name of an index column in the user child table.

9. The method as claimed in claim 8, wherein, the name of the index column, the value of the index column and the key part of the user data are sequentially arranged in the index data.
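By way of illustration only, and without limiting the claims, the index data layout recited in claims 5 to 9 may be sketched in Python as follows. The separator byte `SEP`, the function names `build_index_key` and `put`, and the use of dictionaries as child tables are assumptions made for this sketch; the source does not specify an encoding. The sketch uses the start key of the user child table as the range key, which is one of the three alternatives recited in claim 5.

```python
SEP = b"\x00"  # hypothetical field separator; the encoding is an assumption


def build_index_key(range_key: bytes, column_name: bytes,
                    column_value: bytes, user_key: bytes) -> bytes:
    """Arrange the range key, then the index column name, the index column
    value and the key part of the user data, in that order."""
    return SEP.join([range_key, column_name, column_value, user_key])


def put(user_child: dict, index_child: dict, start_key: bytes,
        user_key: bytes, columns: dict, index_column: bytes) -> bytes:
    """Write user data into the user child table, then write the generated
    index data into the corresponding index child table."""
    user_child[user_key] = columns
    index_key = build_index_key(start_key, index_column,
                                columns[index_column], user_key)
    index_child[index_key] = b""  # the index row carries its data in the key
    return index_key
```

In this sketch the index key begins with the start key of the user child table, so it sorts at or after that start key, which is what allows the index row to be placed in the index child table bound to the same user child table.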

10. A secondary index establishing apparatus, comprising,

a determining unit configured to determine a user table according to the user data;

a first writing unit configured to write the user data into the user table;

a generating unit configured to generate index data according to the user data;

a second writing unit configured to write the index data into an index table

11. The apparatus as claimed in claim 10, wherein,

a key range of the user table and that of the index table are the same.

12. The apparatus as claimed in claim 11, wherein,

the user table comprises multiple user child tables, and the index table comprises multiple index child tables;

the number of the multiple user child tables is identical to the number of the multiple index child tables, a key range of each user child table is identical to that of its corresponding index child table.

13. The apparatus as claimed in claim 12, wherein,

the first writing unit comprises,

a first determining module configured to determine a user child table according to the user data; and

a first writing module configured to write the user data in the user child table;

the second writing unit comprises,

a second writing module configured to write the index data into an index child table corresponding to the user child table.

14. The apparatus as claimed in claim 13, wherein,

the generating unit comprises,

a second determining module configured to determine a range key of the index data and a specific key of the index data, wherein the range key of the index data comprises a range key of the user data or a start key of the user child table or an end key of the user child table, the specific key of the index data comprises the value part of the user data and the key part of the user data;

a generating module configured to generate index data according to the range key of the index data and the specific key of the index data.

15. The apparatus as claimed in claim 14, wherein,

the second determining module is configured to extract the range key of the user data or the start key of the user child table or the end key of the user child table.

16. The apparatus as claimed in claim 14, wherein,

the value part of the user data adopted in the index data is related to a value of the index column.

17. The apparatus as claimed in claim 16, wherein,

the index data further comprises a name of an index column in the user child table.

18. The apparatus as claimed in claim 17, wherein,

the name of the index column, the value of the index column and the key part of the user data are sequentially arranged in the index data.

19. A distributed storage system, comprising,

at least one region server, each region server storing at least one user table and at least one index table corresponding to the at least one user table, wherein, the range key of each user table is identical to that of the index table corresponding to the user table; and a secondary index establishing apparatus as claimed in any one of claims 10-18.