CN115563123A - Database and table dividing method for gene injection partition keys - Google Patents
Database and table dividing method for gene injection partition keys
- Publication number
- CN115563123A
- Authority
- CN
- China
- Prior art keywords
- partition
- sub
- scheduling
- partition key
- keys
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Abstract
The invention relates to a database and table dividing method for gene injection partition keys, which comprises the following steps: when the registration of new gene information is detected, judging whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy; when automatic table dividing strategy scanning detects that the data volume of any service table in the database instance has reached the limit defined in the preset table dividing strategy, dividing the service table; and dividing a sub-table holding a plurality of pieces of gene information into a plurality of partitions, wherein each partition has a partition key of a corresponding category, and the partition keys generated from the gene information are assigned to different partitions according to the gene information. Database and table division can thus be carried out along multiple dimensions, which enables precise operation and maintenance of the application system, greatly reduces the operation and maintenance workload and labor intensity, and improves the safety of using the sub-tables.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a database and table dividing method for gene injection partition keys.
Background
Once the data volume of gene information has accumulated to a certain degree, the application system runs into a data processing bottleneck. The main problems are a large single-table data volume, a wide range of services carried by one table, and complex computation; these can be relieved by splitting the database. Most existing database and table dividing methods distribute data evenly, which has drawbacks: the data distribution follows little regularity, associated data is scattered, and checking or modifying part of the data requires a global search.
Many applications generate keys for new data, use the generated key as the partition key, and partition the data by range. Range partitioning selects a partition by determining whether the partition key falls within a range of values. A common method of key generation is to use monotonically increasing keys, obtained by incrementing successive key values. This keeps the data clustered on the generated key in order, and it allocates space for new data at the end of a partition, which is usually much more efficient than inserting in the middle of existing data. One drawback of this approach is that new data is not immediately spread across partitions. Another is that, under concurrent access in a multi-threaded application environment, a "hot spot" forms in the database, both during the initial insertion of new data and during subsequent accesses to it. Conflicts on hot spots often cause contention between threads, due to locking in a local area of the database and serialization of space allocation. In addition, if multiple data records are inserted for each generated key, a multi-threaded application may not get the full benefit of keeping inserted data in order, because two competing threads holding consecutively generated keys may interleave their data. When the underlying database management system uses page-level locking, interleaved data on the same page can cause deadlocks between application threads that are concurrently updating otherwise unrelated records.
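The hot-spot effect described above can be illustrated with a minimal Python sketch. The partition bounds and the key range are hypothetical values chosen only for illustration; they do not come from the application.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical exclusive upper bounds of four range partitions:
# [0, 1000), [1000, 2000), [2000, 3000), [3000, 4000).
PARTITION_BOUNDS = [1000, 2000, 3000, 4000]


def partition_for(key: int) -> int:
    """Return the index of the range partition whose interval contains key."""
    idx = bisect_right(PARTITION_BOUNDS, key)
    if key < 0 or idx >= len(PARTITION_BOUNDS):
        raise ValueError(f"key {key} is outside every partition range")
    return idx


# 100 consecutive inserts with monotonically increasing keys.
hits = Counter(partition_for(k) for k in range(3000, 3100))
print(hits)  # every insert lands in the same (last) partition
```

All 100 inserts map to one partition, which is the concentration of activity that the background section calls a hot spot.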
One method commonly used to overcome the above disadvantages is to invert the bytes of the monotonically increasing key, which can have the effect of spreading new data evenly and continuously throughout the key range, placing it in various partitions in a round-robin fashion. Hot spots are avoided because each subsequent insertion is at another point in the database. However, this approach does not maintain a good organization of the data, and may require more frequent data reorganization; moreover, the insertion is not done at the end of the pre-existing data, thus sacrificing efficiency during insertion; finally, simply inverting bytes cannot guarantee that two consecutive keys are in separate partitions, and if consecutive keys are defined in the same partition, the multi-threaded workload will compete for resources within that partition. Another method for overcoming the disadvantages of monotonically increasing keys is to generate random keys. This has substantially the same advantages and disadvantages as the method of inverting bytes of the monotonically increasing key described above. There is a need for a key generation process that achieves all of these goals to help maximize the performance benefits of database partitioning.
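A short Python sketch can make the byte-inversion trade-off concrete. It assumes 4-byte unsigned keys and four equal-width range partitions over the whole key space; both choices, and the function names, are illustrative assumptions rather than details from the application.

```python
from collections import Counter


def reverse_bytes(key: int, width: int = 4) -> int:
    """Reverse the byte order of an unsigned key of the given width."""
    return int.from_bytes(key.to_bytes(width, "big"), "little")


def partition_of(key: int, width: int = 4, partitions: int = 4) -> int:
    """Map a key to one of `partitions` equal-width range partitions."""
    span = (1 << (8 * width)) // partitions
    return key // span


# Reversed keys spread evenly over the partitions in the long run ...
spread = Counter(partition_of(reverse_bytes(k)) for k in range(256))

# ... but two consecutive keys can still fall into the same partition.
same_partition = partition_of(reverse_bytes(1)) == partition_of(reverse_bytes(2))
```

Under these assumptions the 256 reversed keys are spread evenly, yet keys 1 and 2 still share a partition, which matches the limitation noted above.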
The present application is directed to solving at least one of the above-mentioned background art or other related technical problems.
Disclosure of Invention
In order to solve at least one of the technical problems mentioned in the background art or other related technical problems, the application provides a database and table dividing method for gene injection partition keys, which addresses the large data volume of a single gene table, the wide range of services it carries, complex computation, and similar problems.
A database and table dividing method for gene injection partition keys specifically comprises the following steps:
when the registration of new gene information is detected, judging whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy;
if the database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance;
if the database instance does not need to be created for the new gene information, binding the new gene information to the original database instance;
when detecting that the data volume in any one of the service tables in the new database instance or the original database instance reaches the data volume limited in the preset table dividing strategy during the automatic table dividing strategy scanning, dividing the service tables according to the preset table dividing configuration strategy;
in an information processing system, dividing a table holding a plurality of pieces of gene information into a plurality of partitions, each partition having a partition key of a corresponding category, and assigning the partition keys generated from the gene information to different partitions according to the gene information.
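The instance-creation and binding steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `ShardRegistry`, `needs_new_instance`, `fetch_db_config`, and the `db_N` naming are hypothetical stand-ins for the preset gene information configuration strategy, the remote configuration center, and the database instances, none of which the application specifies in detail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ShardRegistry:
    # Hypothetical stand-in for the preset gene information configuration strategy.
    needs_new_instance: Callable[[str], bool]
    # Hypothetical stand-in for fetching the policy from a remote configuration center.
    fetch_db_config: Callable[[], dict]
    default_instance: str = "db_0"
    bindings: Dict[str, str] = field(default_factory=dict)
    instances: Dict[str, dict] = field(default_factory=lambda: {"db_0": {}})

    def register(self, gene_id: str) -> str:
        """Bind newly registered gene information to a database instance."""
        if self.needs_new_instance(gene_id):
            # Create a new instance from the fetched partitioning policy.
            name = f"db_{len(self.instances)}"
            self.instances[name] = self.fetch_db_config()
        else:
            # Otherwise bind to the original instance.
            name = self.default_instance
        self.bindings[gene_id] = name
        return name


registry = ShardRegistry(
    needs_new_instance=lambda gene_id: gene_id.startswith("large_"),
    fetch_db_config=lambda: {"max_rows_per_table": 5_000_000},
)
registry.register("large_tenant")
registry.register("small_tenant")
```

In this sketch the binding is kept in an in-memory dictionary; a real system would presumably persist it.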
Further, the partition key generating step includes:
1. generating a sub-range identifier located within the designated partition;
2. generating a sequence number for a specific key value within the designated sub-range;
3. generating the partition key by combining the sub-range identifier and the sequence number.
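One way to read the three generation steps is as bit-packing: the sub-range identifier occupies the high bits of the key and the sequence number the low bits. The sketch below assumes that layout; the 20-bit sequence width and the names `PartitionKeyGenerator`, `next_key`, and `split_key` are illustrative assumptions, since the application does not state how the two parts are combined.

```python
import itertools
from collections import defaultdict
from typing import Tuple

SEQ_BITS = 20  # assumed width of the sequence-number field


class PartitionKeyGenerator:
    def __init__(self) -> None:
        # One independent counter per sub-range, each starting at 0.
        self._counters = defaultdict(itertools.count)

    def next_key(self, sub_range: int) -> int:
        """Combine a sub-range identifier with the next sequence number."""
        seq = next(self._counters[sub_range])
        if seq >= 1 << SEQ_BITS:
            raise OverflowError(f"sub-range {sub_range} is exhausted")
        return (sub_range << SEQ_BITS) | seq


def split_key(key: int) -> Tuple[int, int]:
    """Recover (sub_range, sequence_number) from a generated key."""
    return key >> SEQ_BITS, key & ((1 << SEQ_BITS) - 1)


gen = PartitionKeyGenerator()
first = gen.next_key(3)
second = gen.next_key(3)
other = gen.next_key(5)
```

With this layout, keys within one sub-range stay monotonically increasing, while keys from different sub-ranges fall into disjoint value ranges.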
Further, step S500 further includes the following steps:
(1) maintaining, for each of the partitions, an activity indicator that indicates the most recent activity in the partition, and responding to requests for partition keys;
(2) selecting the partition having the least recent activity as indicated by the activity indicator, and generating a partition key from the range of keys corresponding to the selected partition.
Further, selecting the partition having the least recent activity indicated by the activity indicator includes:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) if only one partition has the lowest count, selecting that partition; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition having the largest number of keys that have been reserved but not yet allocated.
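The selection rule in (a) to (c) can be sketched in Python, assuming that the thing being selected is a partition and that the activity indicator is a count of requesters currently accessing it. `PartitionState` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionState:
    name: str
    active_requesters: int      # activity indicator (assumed to be a count)
    reserved_unallocated: int   # keys reserved for the partition but not yet handed out


def select_partition(partitions: List[PartitionState]) -> PartitionState:
    """Pick the least active partition; break ties by most reserved keys."""
    # (a) find the partitions with the lowest requester count
    lowest = min(p.active_requesters for p in partitions)
    candidates = [p for p in partitions if p.active_requesters == lowest]
    # (b) a single candidate is selected directly
    if len(candidates) == 1:
        return candidates[0]
    # (c) otherwise prefer the one with the most reserved-but-unallocated keys
    return max(candidates, key=lambda p: p.reserved_unallocated)


states = [
    PartitionState("p0", active_requesters=4, reserved_unallocated=10),
    PartitionState("p1", active_requesters=1, reserved_unallocated=3),
    PartitionState("p2", active_requesters=1, reserved_unallocated=8),
]
chosen = select_partition(states)
```

Here `p1` and `p2` tie on activity, so the tie-break in (c) selects `p2`.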
Further, the method further comprises: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by current concurrency number and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of its sub-tables;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
Further, before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further includes:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
judging whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
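The scheduling rule above can be sketched in Python under a few assumptions: the waiting queue holds `(sub_database, sub_table)` pairs, the scheduling number equals the number of idle threads, and the function and variable names are hypothetical. A stable sort by concurrency gives the lowest-concurrency sub-database the highest priority and falls through to busier sub-databases only when the first one cannot fill the scheduling number.

```python
from typing import Dict, List, Tuple

QueueItem = Tuple[str, str]  # (sub_database, sub_table)


def schedule(
    wait_queue: List[QueueItem],
    concurrency: Dict[str, int],
    idle_threads: int,
) -> List[QueueItem]:
    """Remove and return up to `idle_threads` sub-tables from the queue."""
    if idle_threads <= 0:
        return []
    # Lower current concurrency means higher priority; the sort is stable,
    # so queue order is kept among sub-tables of the same sub-database.
    ordered = sorted(wait_queue, key=lambda item: concurrency[item[0]])
    picked = ordered[:idle_threads]
    for item in picked:
        wait_queue.remove(item)
    return picked


queue = [("db_b", "t1"), ("db_a", "t2"), ("db_a", "t3"), ("db_c", "t4")]
load = {"db_a": 1, "db_b": 5, "db_c": 3}
batch = schedule(queue, load, idle_threads=3)
```

In this example `db_a` contributes only two waiting sub-tables, so the third slot is filled from `db_c`, the sub-database with the next-lowest concurrency.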
A database and table dividing apparatus, comprising:
a memory storing a computer program which, when executed by the apparatus, performs the method;
a processor for executing the computer program.
A storage medium, characterized in that the storage medium stores computer instructions which, when executed, implement the method.
The above-described preferred conditions may be combined with each other to obtain a specific embodiment, in accordance with common knowledge in the art.
The effects provided in the summary of the invention are only the effects of the embodiments, not all of the effects of the invention, and one of the above technical solutions has the following advantages or beneficial effects:
the database and table dividing method for gene injection partition keys can divide databases and tables along multiple dimensions; dividing the database enables precise operation and maintenance of the application system and greatly reduces the operation and maintenance workload and labor intensity, and the safety of using the sub-tables is improved.
By adopting the above technical scheme, the invention remedies the defects of the prior art, is reasonably designed, and is convenient to operate.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of the database and table dividing method provided by an embodiment of the present invention;
FIG. 2 shows the partition key generation steps according to an embodiment of the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the present application will be explained in detail by the following embodiments in combination with the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different results of the invention. To simplify the disclosure of the present invention, specific example components and arrangements are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples, which have been repeated for purposes of simplicity and clarity and do not in themselves dictate a relationship between the various embodiments and/or arrangements discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.
The present application is specifically described below with specific examples.
Example 1:
As shown in FIG. 1, a database and table dividing method for gene injection partition keys is provided, which comprises the following steps.
S100, when the registration of new gene information is detected, judging whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy.
S200, if a database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance.
S300, if no database instance needs to be created for the new gene information, binding the new gene information to the original database instance.
S400, when automatic table dividing strategy scanning detects that the data volume of any service table in the new database instance or the original database instance has reached the limit defined in the preset table dividing strategy, dividing the service table according to the preset table dividing configuration strategy.
S500, in the information processing system, dividing a table holding a plurality of pieces of gene information into a plurality of partitions, each partition having a partition key of a corresponding category, and assigning the partition keys generated from the gene information to different partitions according to the gene information. As shown in FIG. 2, the partition key generating step includes: first, generating a sub-range identifier located within the designated partition; second, generating a sequence number for a specific key value within the designated sub-range; third, generating the partition key by combining the sub-range identifier and the sequence number.
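The threshold scan of step S400 can be illustrated with a minimal Python sketch. The row counts, the limit, the shard count, and the `table_N` naming scheme are all hypothetical; the application leaves the content of the preset table dividing strategy open.

```python
from typing import Dict, List


def tables_needing_split(row_counts: Dict[str, int], max_rows: int) -> List[str]:
    """Return the service tables whose row count has reached the configured limit."""
    return sorted(name for name, rows in row_counts.items() if rows >= max_rows)


def plan_sub_tables(table: str, shard_count: int) -> List[str]:
    """Name the sub-tables that splitting `table` would create."""
    return [f"{table}_{i}" for i in range(shard_count)]


# Hypothetical scan result for one database instance.
row_counts = {"orders": 5_200_000, "users": 800_000, "events": 5_000_000}
to_split = tables_needing_split(row_counts, max_rows=5_000_000)
plan = {table: plan_sub_tables(table, shard_count=4) for table in to_split}
```

The sketch only plans the split; moving rows into the sub-tables would depend on the partition keys generated in step S500.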
Example 2:
On the basis of the foregoing embodiment, step S500 further includes the steps of:
(1) maintaining, for each of the partitions, an activity indicator that indicates the most recent activity in the partition, and responding to requests for partition keys;
(2) selecting the partition having the least recent activity as indicated by the activity indicator, and generating a partition key from the range of keys corresponding to the selected partition.
Selecting the partition having the least recent activity indicated by the activity indicator comprises:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) if only one partition has the lowest count, selecting that partition; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition having the largest number of keys that have been reserved but not yet allocated.
Example 3:
On the basis of the foregoing embodiment, the database and table dividing method includes: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by current concurrency number and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of its sub-tables;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
Before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further includes:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
judging whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
Example 4:
On the basis of the foregoing embodiments, a database and table dividing apparatus is provided, including:
a memory storing a computer program which, when executed by the apparatus, performs the method;
a processor for executing the computer program.
Example 5:
On the basis of the foregoing embodiments, a storage medium is provided, which stores computer instructions that, when executed, implement the foregoing method.
It should be understood that, while the invention has been described in conjunction with the accompanying drawings and detailed description, the foregoing description is not intended to limit the scope of the invention. On the basis of the above description and the technical solution of the present invention, a person skilled in the art may make modifications or variations in different forms without creative effort, and these still fall within the scope of protection of the present application. It is neither necessary nor possible to enumerate all embodiments here.
Furthermore, matters not described in detail in the present application are well known in the art and are not repeated here.
Claims (8)
1. A database and table dividing method for gene injection partition keys, characterized by comprising the following steps:
S100, when the registration of new gene information is detected, judging whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy;
S200, if a database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance;
S300, if no database instance needs to be created for the new gene information, binding the new gene information to the original database instance;
S400, when automatic table dividing strategy scanning detects that the data volume of any service table in the new database instance or the original database instance has reached the limit defined in the preset table dividing strategy, dividing the service table according to the preset table dividing configuration strategy;
S500, in the information processing system, dividing a sub-table holding a plurality of pieces of gene information into a plurality of partitions, each partition having a partition key of a corresponding category, and assigning the partition keys generated from the gene information to different partitions according to the gene information.
2. The method according to claim 1, wherein the partition key generating step comprises:
1. generating a sub-range identifier located within the designated partition;
2. generating a sequence number for a specific key value within the designated sub-range;
3. generating the partition key by combining the sub-range identifier and the sequence number.
3. The method of claim 1, wherein step S500 further comprises the steps of:
(1) maintaining, for each of the partitions, an activity indicator that indicates the most recent activity in the partition, and responding to requests for partition keys;
(2) selecting the partition having the least recent activity as indicated by the activity indicator, and generating a partition key from the range of keys corresponding to the selected partition.
4. The method of claim 3, wherein selecting the partition having the least recent activity indicated by the activity indicator comprises:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) if only one partition has the lowest count, selecting that partition; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition having the largest number of keys that have been reserved but not yet allocated.
5. The method according to claim 4, characterized in that the method further comprises: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by current concurrency number and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of its sub-tables;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
6. The method of claim 5, wherein before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further comprises:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
judging whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
7. A database and table dividing apparatus, characterized by comprising:
a memory storing a computer program that, when executed by the apparatus, performs the method of any of claims 1-6;
a processor for executing the computer program.
8. A storage medium, characterized by: the storage medium stores computer instructions that, when executed, implement the method of any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211354004.0A CN115563123A (en) | 2022-11-01 | 2022-11-01 | Database and table dividing method for gene injection partition keys |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211354004.0A CN115563123A (en) | 2022-11-01 | 2022-11-01 | Database and table dividing method for gene injection partition keys |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115563123A | 2023-01-03 |
Family
ID=84769435
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211354004.0A Pending CN115563123A (en) | 2022-11-01 | 2022-11-01 | Database and table dividing method for gene injection partition keys |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115563123A (en) |
- 2022-11-01: CN application CN202211354004.0A, publication CN115563123A, status Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||