CN111339088A

CN111339088A - Database division and table division method, device, medium and computer equipment

Info

Publication number: CN111339088A
Application number: CN202010107050.5A
Authority: CN
Inventors: 徐雄飞; 储存; 张超; 万全伟
Original assignee: Suning Cloud Computing Co Ltd
Current assignee: Suning Cloud Computing Co Ltd
Priority date: 2020-02-21
Filing date: 2020-02-21
Publication date: 2020-06-26

Abstract

The application relates to a database and table dividing method. The method comprises the following steps: acquiring target data to be stored, and determining a first hash value of the target data; matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node; acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-library sub-table to which a virtual node corresponding to the node character string belongs; and extracting identification information in the node character string, and storing the target data into a physical sub-library sub-table corresponding to the identification information. According to the method and the device, the data can be distributed evenly after the database is divided into tables, and the database is not limited by the data content in the preset database when the database is divided into tables.

Description

Database division and table division method, device, medium and computer equipment

Technical Field

The present application relates to the field of data processing technologies, and in particular to a method, an apparatus, a medium, and a computer device for database division and table division.

Background

Database division and table division (sharding) splits a preset database into several parts and places those parts on different databases. This relieves the performance bottleneck of a single database: existing data in the preset database can be split up, and data not yet stored can be distributed across the resulting databases.

Given a fixed number of databases, conventional database division and table division methods generally fall into the following categories:

First, hashing the database-and-table division factor and taking the result modulo the number of databases and tables.

This method computes the hash value of the division factor as an integer of 32 bits or more, then takes that value modulo the number of databases and the number of tables to obtain the database and table numbers into which the data is written. The scheme is computationally simple and fast. However, if the hash values of the division factors do not increase evenly, the resulting data distribution is very unbalanced; with a limited set of division factors the imbalance becomes extreme, and some database tables may receive no data at all.
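The conventional hash-modulo scheme described above can be sketched as follows. This is only an illustration: the class name `ModuloSharding`, the use of `String.hashCode()` as the hash function, and the way one slot number is split into a database index and a table index are assumptions, not details given in the application.

```java
final class ModuloSharding {
    /** Maps a division factor to {databaseIndex, tableIndex} by hash modulo (illustrative only). */
    static int[] route(String divisionFactor, int dbCount, int tablesPerDb) {
        // Math.floorMod keeps the slot non-negative even when hashCode() is negative.
        int slot = Math.floorMod(divisionFactor.hashCode(), dbCount * tablesPerDb);
        return new int[] { slot / tablesPerDb, slot % tablesPerDb };
    }

    public static void main(String[] args) {
        // Three different factors that happen to collide on the same table.
        for (String factor : new String[] { "a", "e", "i" }) {
            int[] target = route(factor, 2, 2);
            System.out.println(factor + " -> db" + target[0] + " tb" + target[1]);
        }
    }
}
```

With two databases of two tables each, the single-character factors "a", "e" and "i" have hash codes 97, 101 and 105, all congruent to 1 modulo 4, so all three land in the same table. This is the kind of imbalance the paragraph above describes for a limited set of division factors.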

Second, dividing databases and tables by a time interval computed from the division factor.

This method extracts the time value contained in the division factor and matches it against the time rules defined for the preset databases and tables, so that data is written to a database and table according to those rules. The scheme is computationally simple and fast, but it cannot be used if the service data carries no time-related information. In addition, with a fixed set of databases, the database for each time interval must be planned in advance; if the service data volume keeps growing over time, the databases and tables for later intervals carry more and more data, which again makes the distribution unbalanced.

Third, a combination of the two: the hash value of the division factor is computed and taken modulo the number of databases to select the database, and the time value contained in the division factor is extracted to select the table. With a fixed set of databases, this scheme adds tables dynamically over time. However, it inherits the defects of both schemes above: data distribution can be unbalanced, and applicability depends on the service data in the preset database containing time-related information.

Disclosure of Invention

Based on this, it is necessary to provide a database division and table division method, apparatus, computer device and storage medium that distribute data evenly after division and are not limited by the data content of the preset database.

A database division and table division method of a database comprises the following steps:

acquiring target data to be stored, and determining a first hash value of the target data;

matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node;

acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-base sub-table to which a virtual node corresponding to the node character string belongs;

and extracting identification information in the node character string, and storing the target data into a physical sub-library sub-table corresponding to the identification information.

In one embodiment, before inputting the target data to be distributed, the method further comprises:

obtaining base table information of at least one physical sub-base sub-table, wherein the base table information comprises the number of sub-bases in the at least one physical sub-base sub-table and the number of sub-tables in each sub-base;

determining the number of physical sub-databases and sub-tables according to the database table information;

determining the number of virtual nodes according to the number of physical sub-base sub-tables;

setting identification information for each physical sub-database sub-table according to the base table information and a first preset rule;

respectively generating node character strings containing the identification information according to the identification information of each physical sub-library and sub-table and a second preset rule;

determining a reference hash value of each node character string according to a preset hash algorithm;

creating a mapping relation between each reference hash value and a corresponding node character string;

sorting the reference hash values according to a preset sorting rule to obtain a sorted mapping relation queue;

and storing the mapping relation queue into a preset storage unit.

In one embodiment, matching the first hash value with a reference hash value of a virtual node in a preset storage unit, and obtaining a second hash value matched with the first hash value includes:

constructing an annular distribution data group according to a mapping relation queue in a preset storage unit;

and, moving along the annular distribution data group in a preset clock direction, selecting the reference hash value that is larger than the first hash value and has the smallest difference from it as the second hash value matched with the first hash value.

In one embodiment, extracting the identification information in the node character string and storing the target data into the physical sub-database sub-table corresponding to the identification information includes:

extracting first identification information and second identification information from the identification information;

determining the sub-database in the physical sub-database sub-table according to the first identification information;

determining the sub-table in the physical sub-database sub-table according to the second identification information;

and storing the target data into that sub-table in that sub-database.

In one embodiment, the method further comprises:

receiving a request for newly adding a physical sub-library and a sub-table;

extracting base table information of the newly added physical sub-base sub-table in the request;

determining the node character strings of each virtual node corresponding to the newly added physical sub-base sub-table according to the base table information;

calculating a newly added reference hash value of each node character string;

creating a mapping relation between the newly added reference hash value and the corresponding node character string;

sorting the newly added reference hash value and the original reference hash value in the preset storage unit according to a preset sorting rule to generate a sorted new mapping relation queue;

and storing the new mapping relation queue into a preset storage unit.

In one embodiment, the method further comprises:

extracting newly-added base table information in a request of newly-added physical sub-base sub-tables;

determining the number of the newly added physical sub-base sub-tables according to the information of the newly added base table;

when the number of the newly added physical sub-databases is larger than a preset threshold value, determining the number of the physical sub-databases according to the information of the newly added base tables and the base table information of the original physical sub-databases;

and re-executing the step of determining the number of the virtual nodes according to the number of the physical sub-base sub-tables.

In one embodiment, determining the number of virtual nodes according to the number of physical sub-base sub-tables includes:

setting a plurality of candidate values (data to be tested) according to the number of physical sub-database sub-tables, and selecting a preset number of sample data items;

constructing one test case per candidate value, using the candidate value as the number of virtual nodes and the preset number of sample data items as the target data to be stored;

running the database division and table division on each test case to obtain, for each test case, the distribution of the sample data across the physical sub-database sub-tables;

obtaining a sample distribution difference value for each test case from the distribution data, the sample distribution difference value being the difference between the highest value and the lowest value of the distribution data in that test case;

and selecting, according to the sample distribution difference values, a target candidate value from the candidate values as the number of virtual nodes.
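The selection step above can be sketched as follows. The application only says the choice is made "according to the sample distribution difference value"; picking the candidate with the smallest difference is an assumption, as are the class name `VirtualNodeCountSelector` and the per-table counts in `main`, which are made-up illustrative numbers rather than measured results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VirtualNodeCountSelector {
    /** Sample distribution difference: highest minus lowest per-table count (array must be non-empty). */
    static int spread(int[] perTableCounts) {
        int max = perTableCounts[0];
        int min = perTableCounts[0];
        for (int count : perTableCounts) {
            max = Math.max(max, count);
            min = Math.min(min, count);
        }
        return max - min;
    }

    /** Returns the candidate whose distribution has the smallest spread; the first one wins on ties. */
    static int select(Map<Integer, int[]> distributionByCandidate) {
        int best = -1; // returned unchanged if the map is empty
        int bestSpread = Integer.MAX_VALUE;
        for (Map.Entry<Integer, int[]> entry : distributionByCandidate.entrySet()) {
            int s = spread(entry.getValue());
            if (s < bestSpread) {
                bestSpread = s;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical distributions of 1000 samples over 4 tables for three candidate counts.
        Map<Integer, int[]> results = new LinkedHashMap<>();
        results.put(10, new int[] { 400, 150, 250, 200 });
        results.put(100, new int[] { 270, 240, 250, 240 });
        results.put(200, new int[] { 255, 245, 252, 248 });
        System.out.println("selected virtual node count: " + select(results));
    }
}
```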

A database division and table division device comprises:

the input module is used for acquiring target data to be stored and determining a first hash value of the target data;

the matching module is used for matching the first hash value with the reference hash value of each virtual node in the preset storage unit to obtain a second hash value matched with the first hash value, and the preset storage unit stores the mapping relation between the reference hash value and the node character string of the virtual node;

the acquisition module is used for acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-library sub-table to which a virtual node corresponding to the node character string belongs;

and the storage module is used for extracting the identification information in the character string of the node and storing the target data into the physical sub-library sub-table corresponding to the identification information.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of any of the above-described embodiments of the method are performed by the processor when the computer program is executed by the processor.

A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program realizes the steps of the method of any of the above embodiments when executed by a processor.

According to the database partitioning method, the database partitioning device and the computer equipment, the first hash value of target data is determined by acquiring the target data to be stored; matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node; acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-base sub-table to which a virtual node corresponding to the node character string belongs; and extracting identification information in the node character string, and storing the target data into a physical sub-library sub-table corresponding to the identification information. According to the method and the device, the data can be distributed evenly after the database is divided into tables, and the database is not limited by the data content in the preset database when the database is divided into tables.

Drawings

FIG. 1 is a diagram of an application environment of a database division and table division method according to an exemplary embodiment of the present application;

FIG. 2 is a schematic flowchart of a database division and table division method according to an exemplary embodiment of the present application;

FIG. 3 is a schematic flowchart of the steps of a database division and table division method performed before the physical sub-database sub-table elements to be allocated are input, according to an exemplary embodiment of the present application;

FIG. 4 is a schematic structural diagram of a hash ring provided in an exemplary embodiment of the present application;

FIG. 5 is a flowchart of determining the number of virtual nodes according to the base table information in an exemplary embodiment of the present application;

FIG. 6 is a structural block diagram of a database division and table division apparatus provided in an exemplary embodiment of the present application;

FIG. 7 is a structural block diagram of a database division and table division apparatus according to an exemplary embodiment of the present application;

FIG. 8 is an internal structural diagram of a computer device provided in an exemplary embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

Referring to FIG. 1, FIG. 1 is a schematic diagram of an application environment of a database division and table division method according to an exemplary embodiment of the present application. As shown in FIG. 1, the database division and table division system includes a server 100 and a terminal 101, which communicate through a network 102 to implement the database division and table division method of the present application.

The server 100 is configured to, when receiving a data storage request submitted by the terminal 101, obtain target data to be stored, and determine a first hash value of the target data; matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node; acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-base sub-table to which a virtual node corresponding to the node character string belongs; and extracting the identification information in the character string of the node, storing the target data into a physical sub-base sub-table corresponding to the identification information, and sending feedback information of data storage completion to the terminal 101. The server 100 may be implemented as a stand-alone server or a server cluster composed of a plurality of servers.

The terminal 101 is configured to send a data storage request to the server 100, and receive feedback information of data storage completion fed back after the server stores data. The terminal 101 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.

The network 102 is used to realize network connection between the data processing server 100 and the terminal 101. In particular, the network 102 may include various types of wired or wireless networks.

In one embodiment, as shown in FIG. 2, a database division and table division method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:

S11, acquiring target data to be stored, and determining a first hash value of the target data.

In one possible design, a terminal submits a data storage request containing target data to be stored to a server, the server obtains the target data to be stored in the data storage request, and a first hash value of the target data is calculated according to a preset hash algorithm.

Specifically, the first hash value of the target data is calculated using the KETAMA_HASH algorithm (a consistent hashing algorithm).

S12, matching the first hash value with the reference hash value of each virtual node in the preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores the mapping relation between the reference hash value and the node character string of the virtual node.

The method obtains at least one physical sub-database sub-table, allocates a preset number of virtual nodes to each physical sub-database sub-table according to the number obtained, and establishes in advance the association between each virtual node and its physical sub-database sub-table. That is, each physical sub-database sub-table has a preset number of virtual nodes. When target data needs to be stored, the virtual node matching the target data is determined, and the target data is stored into the corresponding physical sub-database sub-table according to that association. Mapping target data to a virtual node therefore means that the data is actually stored in the physical sub-database sub-table to which the virtual node belongs. In the present application, setting virtual nodes allows the target data to be distributed evenly over the physical sub-database sub-tables.

Further, after the number of the virtual nodes and the node character strings of each virtual node are set, the reference hash values of the node character strings are calculated, further, the mapping relation between each reference hash value and the node character string is established, and the mapping relation is stored in the preset storage unit.

Specifically, the preset storage unit includes a plurality of data records of mapping relationships, and each data record includes a reference hash value and a node character string corresponding to the reference hash value.

Further, the first hash value is matched with each reference hash value, when the matching is successful, the reference hash value which is successfully matched is obtained, and the reference hash value is used as the second hash value.

S13, acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of the physical sub-database sub-table to which the virtual node corresponding to the node character string belongs.

In the present application, a physical sub-database sub-table is the physical node associated with a virtual node. Target data allocated to a virtual node is actually stored on the physical node associated with that virtual node, that is, in the physical sub-database sub-table.

In the present application, identification information is preset for each physical sub-database sub-table and uniquely identifies the physical node corresponding to that physical sub-database sub-table.

When the node character string of each virtual node is set, the identification information of each physical sub-base sub-table is contained in each node character string, so that each virtual node and each physical sub-base sub-table are associated.

S14, extracting the identification information in the node character string, and storing the target data into the physical sub-database sub-table corresponding to the identification information.

The identification information of the physical sub-database sub-table comprises first identification information of the sub-database and second identification information of the sub-table. A physical sub-database table can be uniquely determined through the first identification information and the second identification information, and target data can be conveniently stored in the physical sub-database table.
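Step S14 can be sketched as below. The sketch assumes node character strings of the form `db01tb00-vn2`, in which the identification information is the prefix before the hyphen, the `dbNN` part is the first identification information and the `tbNN` part is the second; the class and method names are hypothetical.

```java
final class ShardIdParser {
    /** Extracts the identification information, e.g. "db01tb00" from "db01tb00-vn2". */
    static String extractId(String nodeString) {
        int dash = nodeString.indexOf('-');
        return dash < 0 ? nodeString : nodeString.substring(0, dash);
    }

    /** Splits "db01tb00" into {"db01", "tb00"}: sub-database id first, sub-table id second. */
    static String[] split(String id) {
        int tableStart = id.indexOf("tb");
        if (!id.startsWith("db") || tableStart < 0) {
            throw new IllegalArgumentException("unexpected identification information: " + id);
        }
        return new String[] { id.substring(0, tableStart), id.substring(tableStart) };
    }

    public static void main(String[] args) {
        String[] parts = split(extractId("db01tb00-vn2"));
        System.out.println("sub-database " + parts[0] + ", sub-table " + parts[1]);
    }
}
```

The two parts would then be used to pick the connection for the sub-database and the table name for the write; how that is done depends on the storage layer and is not covered here.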

Referring to FIG. 3, in an embodiment, before the physical sub-database sub-table elements to be allocated are input, the method may further include:

S101, obtaining base table information of at least one physical sub-database sub-table, wherein the base table information comprises the number of sub-databases in the at least one physical sub-database sub-table and the number of sub-tables in each sub-database.

S102, determining the number of physical sub-database sub-tables according to the base table information.

S103, determining the number of virtual nodes according to the number of physical sub-database sub-tables.

S104, setting identification information for each physical sub-database sub-table according to the base table information and a first preset rule.

S105, generating node character strings containing the identification information according to the identification information of each physical sub-database sub-table and a second preset rule.

S106, determining the reference hash value of each node character string according to a preset hash algorithm.

S107, creating a mapping relation between each reference hash value and the corresponding node character string.

S108, sorting the reference hash values according to a preset sorting rule to obtain a sorted mapping relation queue.

S109, storing the mapping relation queue into a preset storage unit.

In one embodiment, the method obtains base table information of a single physical sub-database sub-table, that is, a single physical node. The base table information includes the number of sub-databases and the number of sub-tables in each sub-database; for a single physical sub-database sub-table, both numbers are 1.

In another embodiment, the method obtains base table information of a plurality of physical sub-database sub-tables. The total number of physical sub-database sub-tables is then calculated from the base table information. Specifically, the total is the product of the number of sub-databases and the number of sub-tables in each sub-database.

For example, in one possible application scenario, the obtained physical sub-database sub-tables comprise two sub-databases, each containing two sub-tables, so the total number of physical sub-database sub-tables is 2 × 2 = 4.

In the present application, identification information is preset for each physical sub-database sub-table. In one possible design, the identification information is set according to a first preset rule and the base table information, and the node character string of each virtual node is then set according to the identification information of its physical sub-database sub-table.

In one possible design, the first preset rule may be:

"db" + String.format("%02d", i) + "tb" + String.format("%02d", j)

where i is the number of the actual sub-database, j is the number of the actual sub-table, and %02d left-pads values shorter than two digits with 0.
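As an illustration, the rule can be applied to every sub-database and sub-table in a loop. Zero-based numbering, the class name `ShardIds`, and the use of `Locale.ROOT` (to keep ASCII digits regardless of the default locale) are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ShardIds {
    /** Builds identifiers such as "db00tb01" for every sub-database i and sub-table j. */
    static List<String> build(int dbCount, int tablesPerDb) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < dbCount; i++) {
            for (int j = 0; j < tablesPerDb; j++) {
                ids.add("db" + String.format(Locale.ROOT, "%02d", i)
                        + "tb" + String.format(Locale.ROOT, "%02d", j));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two sub-databases with two sub-tables each.
        System.out.println(build(2, 2)); // prints [db00tb00, db00tb01, db01tb00, db01tb01]
    }
}
```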

Continuing with the application scenario as an example, the four physical sub-database sub-tables are A, B, C and D, respectively, and the identification information may be set for the 4 physical sub-database sub-tables according to the first preset rule, which is as follows:

The identification information of physical node A is: db00tb00

The identification information of physical node B is: db00tb01

The identification information of physical node C is: db01tb00

The identification information of physical node D is: db01tb01

Further, the node character string of each virtual node is set according to the identification information of its physical sub-database sub-table. The identification information may be used as a prefix or a suffix of the node character string; the specific rule can be designed as needed.

Assuming that 3 virtual nodes are allocated to each physical sub-database sub-table, the 4 physical sub-database sub-tables yield 12 virtual nodes in total, M1 to M12, whose node character strings can be set as follows:

Node string for virtual node M1: db00tb00-vn0

Node string for virtual node M2: db00tb00-vn1

Node string for virtual node M3: db00tb00-vn2

Node string for virtual node M4: db00tb01-vn0

Node string for virtual node M5: db00tb01-vn1

Node string for virtual node M6: db00tb01-vn2

Node string for virtual node M7: db01tb00-vn0

Node string for virtual node M8: db01tb00-vn1

Node string for virtual node M9: db01tb00-vn2

Node string for virtual node M10: db01tb01-vn0

Node string for virtual node M11: db01tb01-vn1

Node string for virtual node M12: db01tb01-vn2

Here, virtual nodes M1-M3 are associated with physical sub-database sub-table A, M4-M6 with B, M7-M9 with C, and M10-M12 with D. Thus, for example, target data distributed to virtual nodes M1-M3 is actually stored on physical node A.
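The node character strings listed above follow a simple pattern: the identification information as a prefix, then `-vn` and a running index. A minimal sketch of one such "second preset rule" is given below; the class name `VirtualNodeStrings` is hypothetical, and other prefix or suffix layouts are equally possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class VirtualNodeStrings {
    /** Appends "-vn0", "-vn1", ... to each identifier, one entry per virtual node. */
    static List<String> build(List<String> shardIds, int virtualNodesPerShard) {
        List<String> nodes = new ArrayList<>();
        for (String id : shardIds) {
            for (int k = 0; k < virtualNodesPerShard; k++) {
                nodes.add(id + "-vn" + k);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("db00tb00", "db00tb01", "db01tb00", "db01tb01");
        // 4 physical sub-database sub-tables x 3 virtual nodes each = 12 node strings.
        build(ids, 3).forEach(System.out::println);
    }
}
```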

Further, reference hash values of the virtual nodes are calculated, mapping relations between the reference hash values and the node character strings are established, sorting is carried out according to the reference hash values, a sorted mapping relation queue is generated, and the mapping relation queue is stored in a preset storage unit. In one possible design, the mapping relation queue in the preset storage unit is shown in table 1 below.

Table 1. Mapping relation queue in the preset storage unit in one embodiment

As shown in Table 1 above, the preset storage unit contains 12 data records, each holding the mapping relation between the reference hash value of a virtual node and the node character string of that virtual node. The records are sorted by reference hash value according to a preset sorting rule (ascending or descending); connecting the first and last records of the sorted queue forms an annular distribution data group, that is, a hash ring.
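One way to hold such a sorted mapping relation queue in memory is a sorted map keyed by reference hash value. The sketch below assumes ascending order and takes the hash function as a parameter, since the exact preset hash algorithm is not fixed here; `MappingQueueBuilder` and the stand-in hash used in `main` are hypothetical and do not reproduce the reference hash values of the embodiment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.function.ToLongFunction;

final class MappingQueueBuilder {
    /** Returns reference hash value -> node character string, sorted ascending by hash value. */
    static TreeMap<Long, String> build(List<String> nodeStrings, ToLongFunction<String> hash) {
        TreeMap<Long, String> queue = new TreeMap<>();
        for (String node : nodeStrings) {
            // If two node strings ever hash to the same value, the later one replaces the earlier.
            queue.put(hash.applyAsLong(node), node);
        }
        return queue;
    }

    public static void main(String[] args) {
        // Stand-in hash for demonstration only: String.hashCode() widened to an unsigned 32-bit value.
        ToLongFunction<String> demoHash = s -> s.hashCode() & 0xFFFFFFFFL;
        TreeMap<Long, String> queue = build(
                Arrays.asList("db00tb00-vn0", "db00tb00-vn1", "db00tb00-vn2"), demoHash);
        queue.forEach((h, node) -> System.out.println(h + " -> " + node));
    }
}
```

Treating the largest key as adjacent to the smallest key is what turns this sorted queue into a ring.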

In an embodiment, matching the first hash value with the reference hash value of the virtual node in the preset storage unit to obtain the second hash value matched with the first hash value may include:

constructing an annular distribution data group according to the mapping relation queue in the preset storage unit;

and, moving along the annular distribution data group in a preset clock direction, selecting the reference hash value that is larger than the first hash value and has the smallest difference from it as the second hash value matched with the first hash value.

In the present application, constructing the annular distribution data group according to the mapping relation queue in the preset storage unit may include the following steps:

constructing in advance a hash ring of length 2³²;

and marking each reference hash value in the preset storage unit, together with its corresponding node character string, on the hash ring to obtain the annular distribution data group.

Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a hash ring according to an embodiment. In FIG. 4, the 12 virtual nodes, their reference hash values, and the corresponding node character strings are marked on the hash ring.

As shown in fig. 4, taking the input target data as "dutdwindqlogcompensationjob" as an example, the step of calculating the hash value of the target data is as follows:

acquiring a string value of target data: DUTSWindQLogCompsationJob;

hashing the obtained string value with MD5 to obtain the following 16-byte MD5 array:

10001010, 10001001, 10111001, 11010011, 01011111, 01000001, 01011001, 00010101, 01010000, 01100111, 10001010, 10011000, 01010101, 11010000, 10101100, 11010101

Taking the last 4 bytes gives: 01010101, 11010000, 10101100, 11010101

Applying the bitwise AND operation to each byte gives: 01010101, 11010000, 10101100, 11010101

In the following steps the four bytes are numbered from the end, so byte 0 is the last byte of the array.

Byte 0 shifted left by 0 bits gives: 11010101

Byte 1 shifted left by 8 bits gives: 1010110000000000

Byte 2 shifted left by 16 bits gives: 110100000000000000000000

Byte 3 shifted left by 24 bits gives: 01010101000000000000000000000000

Converting the shifted byte 0 to a long integer gives: 00000000000000000000000011010101

Converting the shifted byte 1 to a long integer gives: 00000000000000001010110000000000

Converting the shifted byte 2 to a long integer gives: 00000000110100000000000000000000

Converting the shifted byte 3 to a long integer gives: 01010101000000000000000000000000

Combining the four values with a bitwise OR gives: 01010101110100001010110011010101

In binary, the result is: 01010101110100001010110011010101

Converting to a long integer value gives: 1439739093
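The byte manipulation in this walk-through can be sketched in a few lines. The class and method names are hypothetical, and the `& 0xFF` mask is an assumption about what the bitwise AND step uses (it is the usual way to read a signed Java byte as an unsigned value). The `main` method feeds in the 16 digest bytes listed above rather than recomputing the MD5 digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class KetamaStyleHash {
    /** Combines the last four bytes of a 16-byte MD5 digest into an unsigned 32-bit value. */
    static long fromDigest(byte[] digest) {
        long byte3 = (long) (digest[12] & 0xFF) << 24; // first of the last four bytes
        long byte2 = (long) (digest[13] & 0xFF) << 16;
        long byte1 = (long) (digest[14] & 0xFF) << 8;
        long byte0 = digest[15] & 0xFF;                // last byte, no shift
        return byte3 | byte2 | byte1 | byte0;
    }

    /** Hashes a string key with MD5 (UTF-8 encoding assumed), then applies fromDigest. */
    static long hash(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return fromDigest(md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] digest = {
            (byte) 0b10001010, (byte) 0b10001001, (byte) 0b10111001, (byte) 0b11010011,
            (byte) 0b01011111, (byte) 0b01000001, (byte) 0b01011001, (byte) 0b00010101,
            (byte) 0b01010000, (byte) 0b01100111, (byte) 0b10001010, (byte) 0b10011000,
            (byte) 0b01010101, (byte) 0b11010000, (byte) 0b10101100, (byte) 0b11010101
        };
        System.out.println(fromDigest(digest)); // prints 1439739093
    }
}
```

Widening each byte to `long` before shifting keeps the result non-negative even when the top bit of the 32-bit value is set, which matters when the value is compared against positions on a ring of length 2³².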

As shown in FIG. 4, the hash value 1439739093 of the target data lies between db00tb00-vn2 (1124442075) and db01tb00-vn2 (1903905256). Following the forward-routing rule of the consistent hashing algorithm on the hash ring, the target data is routed to the virtual node db01tb00-vn2. Extracting the identification information from the node character string db01tb00-vn2 gives "db01tb00", so the physical database table is table tb00 in database db01.
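The routing step can be sketched with a sorted map, using only the two reference hash values quoted above. The class name `HashRingLookup` is hypothetical; treating a reference hash value equal to the key's hash as a match, and wrapping around to the smallest reference hash value when no larger one exists, are assumptions of this sketch.

```java
import java.util.Map;
import java.util.TreeMap;

final class HashRingLookup {
    /** Returns the node at the first reference hash >= keyHash, wrapping to the smallest (ring must be non-empty). */
    static String route(TreeMap<Long, String> ring, long keyHash) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(keyHash);
        if (entry == null) {
            entry = ring.firstEntry(); // past the largest reference hash: wrap around the ring
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(1124442075L, "db00tb00-vn2");
        ring.put(1903905256L, "db01tb00-vn2");
        System.out.println(route(ring, 1439739093L)); // prints db01tb00-vn2
    }
}
```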

In one embodiment, extracting the identification information from the node character string and storing the target data in the physical sub-library sub-table corresponding to the identification information includes:

and storing the target data in the sub-table in the sub-library.

In one embodiment, the method may further include:

receiving a request to add new physical sub-library sub-tables;

calculating a newly added reference hash value of each node character string;

and storing the new mapping relation queue into a preset storage unit.
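A minimal sketch of this expansion is given below. Several details are assumptions rather than statements of the application: node character strings are assumed to follow the `<identification>-vn<index>` pattern seen in the FIG. 4 example, they are assumed to be hashed with the same MD5-based scheme as target data, the sorting rule is assumed to be ascending order of reference hash value, and the names `reference_hash` and `add_nodes` are hypothetical.

```python
import hashlib
from typing import List, Tuple

# One entry of the mapping relation queue: (reference hash, node string).
Entry = Tuple[int, str]


def reference_hash(node: str) -> int:
    """Last 4 bytes of the MD5 digest as an integer (assumed hash scheme)."""
    digest = hashlib.md5(node.encode("utf-8")).digest()
    return int.from_bytes(digest[-4:], "big")


def add_nodes(queue: List[Entry], new_ids: List[str],
              vnodes_per_table: int) -> List[Entry]:
    """Hash the virtual nodes of newly added tables and merge them in."""
    new_entries = [
        (reference_hash(f"{ident}-vn{i}"), f"{ident}-vn{i}")
        for ident in new_ids
        for i in range(vnodes_per_table)
    ]
    # Existing entries keep their reference hash values; only the order of
    # the combined queue changes (ascending order is an assumption).
    return sorted(queue + new_entries)


original = add_nodes([], ["db00tb00", "db00tb01"], vnodes_per_table=3)
expanded = add_nodes(original, ["db01tb00"], vnodes_per_table=3)
print(len(original), len(expanded))  # 6 9
```

Because the original reference hash values are left untouched, only target data whose hash falls just before a newly inserted virtual node is routed to a different sub-library sub-table.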

In one embodiment, the method may further include:

In the present application, when a request to add physical sub-library sub-tables includes a plurality of new physical sub-library sub-tables, whether the number of virtual nodes needs to be reset is determined according to the number of new physical sub-library sub-tables. The more physical sub-library sub-tables there are, the fewer virtual nodes are required, so the number of physical sub-library sub-tables affects the number of virtual nodes to be set. Therefore, when the number of new physical sub-library sub-tables is greater than the preset threshold, the number of virtual nodes needs to be determined again.

Referring to FIG. 5, in an embodiment, determining the number of virtual nodes according to the base table information may further include:

S151, setting a plurality of data to be tested (candidate numbers of virtual nodes) according to the number of physical sub-library sub-tables, and selecting a preset number of test samples.

S152, constructing a corresponding test case for each data to be tested, taking that data to be tested as the number of virtual nodes and the preset number of test samples as the target data to be stored.

S153, performing a sub-library and sub-table test on each test case to obtain the distribution data of the test samples across the physical sub-library sub-tables under each test case.

S154, obtaining a sample distribution difference value for each test case according to the distribution data, wherein the sample distribution difference value is the difference between the highest value and the lowest value of the distribution data in that test case.

S155, selecting target data to be tested from the plurality of data to be tested according to the sample distribution difference values, and using it as the number of virtual nodes.

In the present application, the number of virtual nodes needs to be determined according to a plurality of factors. Specifically, the number of the virtual nodes can be determined according to the number of the physical sub-base sub-tables, the time consumption of data query and the balance condition of sample distribution.

In one possible design, the following scheme is adopted to determine the number of virtual nodes:

setting identification information for each physical sub-library sub-table according to the pre-acquired base table information of the physical sub-library sub-tables;

constructing test cases and test information according to the base table information of the physical sub-library sub-tables, wherein the test information comprises a preset number of test samples and a plurality of data to be tested, the test samples serve as the target data to be stored, each data to be tested serves as the number of virtual nodes in one test case, and sub-library and sub-table partitioning is performed in each test case so as to distribute the test samples to the physical sub-library sub-tables;

obtaining a current test case, calculating the hash value of each test sample in the current test case, determining the node character string of the virtual node matched with each hash value, determining the identification information of the corresponding physical sub-library sub-table from each node character string, distributing each test sample to the physical sub-library sub-table corresponding to its identification information, counting the number of test samples distributed to each physical sub-library sub-table in the current test case, and then calculating the sample distribution difference value of the current test case.

Each test case is tested in turn to obtain, for each test case, the number of test samples distributed to each physical sub-library sub-table and the sample distribution difference value. The test case with the smallest sample distribution difference value is selected, and its data to be tested is taken as the number of virtual nodes.
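The selection procedure can be sketched as a small simulation. The code below is a hedged illustration, not the implementation of the application: the sample strings are synthetic placeholders, the virtual nodes are assumed to be split evenly among the physical sub-library sub-tables and named `<identification>-vn<index>`, the MD5-based hash is assumed to be the same for nodes and samples, and all function names are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def md5_hash(text: str) -> int:
    """Last 4 bytes of the MD5 digest as an integer (assumed hash scheme)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[-4:], "big")


def build_ring(table_ids: List[str], total_vnodes: int) -> Tuple[List[int], List[str]]:
    """Build sorted reference hashes and the table owning each virtual node."""
    per_table = max(1, total_vnodes // len(table_ids))  # even split (assumption)
    entries = sorted(
        (md5_hash(f"{t}-vn{i}"), t) for t in table_ids for i in range(per_table)
    )
    return [h for h, _ in entries], [t for _, t in entries]


def distribution_difference(table_ids: List[str], total_vnodes: int,
                            samples: List[str]) -> Tuple[int, Dict[str, int]]:
    """Distribute samples over the ring; return (max - min, counts per table)."""
    hashes, owners = build_ring(table_ids, total_vnodes)
    counts = {t: 0 for t in table_ids}
    for s in samples:
        # Forward routing with wrap-around to the first virtual node.
        idx = bisect.bisect_left(hashes, md5_hash(s)) % len(hashes)
        counts[owners[idx]] += 1
    return max(counts.values()) - min(counts.values()), counts


def choose_vnode_count(table_ids: List[str], candidates: List[int],
                       samples: List[str]) -> int:
    """Pick the candidate with the smallest sample distribution difference."""
    return min(candidates,
               key=lambda n: distribution_difference(table_ids, n, samples)[0])


tables = ["db00tb00", "db00tb01", "db01tb00", "db01tb01"]
samples = [f"sample-{i}" for i in range(738)]  # synthetic test samples
candidates = [4, 12, 20, 40, 400, 4000]
print(choose_vnode_count(tables, candidates, samples))
```

When two candidates give the same difference value, `min` keeps the first one in the list, which here favours the smaller number of virtual nodes; the application also mentions query time as a factor, which this sketch does not model.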

Continuing with the above example of 2 sub-libraries, each containing 2 sub-tables, and assuming that the preset number of test samples is 738, the data to be tested determined according to the base table information are: 4, 12, 20, 40, 400, 4000, 40000 and 400000. Each data to be tested is used as the number of virtual nodes in one test case, and sub-library and sub-table partitioning is performed in each test case. In each test case the samples are distributed to the 4 corresponding physical sub-library sub-tables, and the resulting sample distribution is shown in Table 2 below:

Table 2. Sample distribution results of the preset test samples across the physical sub-library sub-tables in one embodiment

As shown in Table 2 above, Table 2 includes 8 test cases. Each test case lists its preset number of virtual nodes (that is, the data to be tested), the number of samples distributed to each of the 4 physical sub-library sub-tables, and the sample distribution difference value. Each test case uses the same preset 738 samples, which are allocated to the corresponding physical sub-library sub-tables. As can be seen from Table 2, the test case with the smallest sample distribution difference value is the 7th test case, in which the number of virtual nodes is 40000. The test result therefore indicates that, when the input base table information is 2 sub-libraries each containing 2 sub-tables, the total number of virtual nodes may be set to 40000.

In one embodiment, as shown in FIG. 6, a database and table partitioning apparatus is provided, including:

the input module 11 is configured to acquire target data to be stored, and determine a first hash value of the target data;

the matching module 12 is configured to match the first hash value with a reference hash value of each virtual node in a preset storage unit, and obtain a second hash value matched with the first hash value, where the preset storage unit stores a mapping relationship between the reference hash value and a node character string of the virtual node;

the obtaining module 13 is configured to obtain a node character string corresponding to the second hash value according to the mapping relationship, where the node character string includes identification information of a physical sub-library sub-table to which a virtual node corresponding to the node character string belongs;

and the storage module 14 is configured to extract the identification information from the node character string and store the target data in the physical sub-library sub-table corresponding to the identification information.

Referring to fig. 7, in one embodiment, the apparatus further includes:

the presetting module 10 is configured to acquire base table information of at least one physical sub-library sub-table, wherein the base table information comprises the number of sub-libraries in the at least one physical sub-library sub-table and the number of sub-tables in each sub-library;

and storing the mapping relation queue into a preset storage unit.

In one embodiment, the matching module 12 includes:

the matching unit is used for constructing an annular distribution data group according to the mapping relation queue in the preset storage unit;

In one embodiment, the identification information includes first identification information of the sub-library in the physical sub-library sub-table and second identification information of the sub-table in the physical sub-library sub-table, and the storage module 14 includes:

and storing the target data in the sub-table in the sub-library.

Referring to fig. 7, in one embodiment, the apparatus further includes:

the adding module 15 is configured to receive a request to add new physical sub-library sub-tables;

calculating a newly added reference hash value of each node character string;

and storing the new mapping relation queue into a preset storage unit.

In one embodiment, the adding module 15 further includes:

the adding unit is configured to extract the new base table information from the request to add new physical sub-library sub-tables;

In one embodiment, the presetting module 10 includes:

the presetting unit is configured to set a plurality of data to be tested according to the number of physical sub-library sub-tables and to select a preset number of test samples;

In one embodiment, a computer device is provided, which may be a service processing server, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a database and table partitioning method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a key, trackball or touchpad arranged on the housing of the computer device, or an external keyboard, touchpad or mouse.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring target data to be stored, and determining a first hash value of the target data; matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node; acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-base sub-table to which a virtual node corresponding to the node character string belongs; and extracting the identification information in the character string of the node, and storing the target data into a physical sub-library sub-table corresponding to the identification information.

In one embodiment, the processor when executing the computer program further specifically implements the following steps:

and storing the mapping relation queue into a preset storage unit.

In an embodiment, when the processor executes the computer program to implement the step of matching the first hash value with the reference hash value of the virtual node in the preset storage unit and acquiring the second hash value matched with the first hash value, the following steps are specifically implemented:

In an embodiment, the identification information includes first identification information of the sub-library in the physical sub-library sub-table and second identification information of the sub-table in the physical sub-library sub-table, and when the processor executes the computer program to implement the step of extracting the identification information from the node character string and storing the target data in the physical sub-library sub-table corresponding to the identification information, the following steps are specifically implemented:

and storing the target data in the sub-table in the sub-library.

In one embodiment, when the processor executes the computer program, the following steps are specifically implemented:

receiving a request to add new physical sub-library sub-tables;

calculating a newly added reference hash value of each node character string;

and storing the new mapping relation queue into a preset storage unit.

In an embodiment, when the processor executes the computer program to implement the step of determining the number of virtual nodes according to the number of physical sub-library sub-tables, the following steps are specifically implemented:

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring target data to be stored, and determining a first hash value of the target data; matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, wherein the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node; acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-base sub-table to which a virtual node corresponding to the node character string belongs; and extracting the identification information in the character string of the node, and storing the target data into a physical sub-library sub-table corresponding to the identification information.

In one embodiment, the computer program when executed by the processor further embodies the steps of:

and storing the mapping relation queue into a preset storage unit.

In an embodiment, when the computer program is executed by the processor to implement the step of matching the first hash value with the reference hash value of the virtual node in the preset storage unit and acquiring the second hash value matched with the first hash value, the following steps are specifically implemented:

In an embodiment, the identification information includes first identification information of the sub-library in the physical sub-library sub-table and second identification information of the sub-table in the physical sub-library sub-table, and when the computer program is executed by the processor to implement the step of extracting the identification information from the node character string and storing the target data in the physical sub-library sub-table corresponding to the identification information, the following steps are specifically implemented:

and storing the target data in the sub-table in the sub-library.

In one embodiment, the computer program, when executed by the processor, further embodies the steps of:

receiving a request to add new physical sub-library sub-tables;

calculating a newly added reference hash value of each node character string;

and storing the new mapping relation queue into a preset storage unit.

In an embodiment, when the computer program is executed by the processor to implement the step of determining the number of virtual nodes according to the number of physical sub-base sub-tables, the following steps are specifically implemented:

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of the present specification.

The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A database and table dividing method for a database comprises the following steps:

acquiring a node character string corresponding to the second hash value according to the mapping relation, wherein the node character string comprises identification information of a physical sub-library sub-table to which a virtual node corresponding to the node character string belongs;

2. The method of claim 1, wherein prior to the inputting the target data to be distributed, the method further comprises:

determining the number of the physical sub-databases and sub-tables according to the base table information;

determining the number of the virtual nodes according to the number of the physical sub-base sub-tables;

setting identification information for each physical sub-library sub-table according to the base table information and a first preset rule;

and storing the mapping relation queue into the preset storage unit.

3. The method according to claim 1, wherein the matching the first hash value with a reference hash value of a virtual node in a preset storage unit to obtain a second hash value matching the first hash value comprises:

constructing an annular distribution data group according to the mapping relation queue in the preset storage unit;

4. The method according to claim 1, wherein the identification information includes first identification information of the sub-library in the physical sub-library sub-table and second identification information of the sub-table in the physical sub-library sub-table, and the extracting the identification information in the node character string and storing the target data in the physical sub-library sub-table corresponding to the identification information includes:

determining the sub-library in the physical sub-library sub-table according to the first identification information;

and storing the target data into the sub-table in the sub-library.

5. The method of claim 2, further comprising:

receiving a request to add new physical sub-library sub-tables;

calculating a newly added reference hash value of each node character string;

sorting the newly added reference hash value and the original reference hash value in the preset storage unit according to the preset sorting rule to generate a sorted new mapping relation queue;

and storing the new mapping relation queue into the preset storage unit.

6. The method of claim 5, further comprising:

extracting newly-added base table information in the request of the newly-added physical sub-base sub-table;

determining the number of the newly added physical sub-database tables according to the information of the newly added base table;

when the number of the newly added physical sub-database sub-tables is larger than a preset threshold value, determining the number of the physical sub-database sub-tables according to the information of the newly added base tables and the base table information of the original physical sub-database sub-tables;

7. The method of claim 2, wherein determining the number of virtual nodes according to the number of physical sub-base sub-tables comprises:

respectively constructing corresponding test samples by taking each data to be tested as the number of the virtual nodes and taking a preset number of test samples as target data to be stored;

8. A database and table partitioning apparatus for a database, the apparatus comprising:

the matching module is used for matching the first hash value with a reference hash value of each virtual node in a preset storage unit to obtain a second hash value matched with the first hash value, and the preset storage unit stores a mapping relation between the reference hash value and a node character string of the virtual node;

an obtaining module, configured to obtain a node character string corresponding to the second hash value according to the mapping relationship, where the node character string includes identification information of a physical sub-library sub-table to which a virtual node corresponding to the node character string belongs;

and the storage module is used for extracting the identification information in the character string of the node and storing the target data into a physical sub-library sub-table corresponding to the identification information.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the database sub-division method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for partitioning a database according to any one of claims 1 to 7.