CN115563123A - Database and table dividing method for gene injection partition keys - Google Patents


Publication number
CN115563123A
Authority
CN
China
Prior art keywords
partition
sub
scheduling
partition key
keys
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211354004.0A
Other languages
Chinese (zh)
Inventor
彭恒
游际宇
孙卫东
李克非
韩志芳
魏亚贞
蒋倩
陈琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aerospace Great Wall Satellite Navigation Technology Co ltd
Original Assignee
Beijing Aerospace Great Wall Satellite Navigation Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Aerospace Great Wall Satellite Navigation Technology Co ltd
Priority to CN202211354004.0A
Publication of CN115563123A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention relates to a database and table dividing method for gene injection partition keys, which comprises the following steps: when registration of new gene information is detected, determining, according to a preset gene information configuration strategy, whether a database instance needs to be created for the new gene information; when an automatic table dividing strategy scan detects that the data volume of any service table in the database instance has reached the limit set in the preset table dividing strategy, dividing that service table; and dividing a sub-table holding multiple pieces of gene information into a plurality of partitions, each partition having partition keys of a corresponding category, with the partition keys generated from the gene information assigned to different partitions according to the gene information. The method allows databases and tables to be divided along multiple dimensions, enables precise operation and maintenance of the application system, greatly reduces the operation and maintenance workload and labor intensity, and improves the safety of using the divided tables.

Description

Database and table dividing method for gene injection partition keys
Technical Field
The invention relates to the technical field of data processing, in particular to a database and table dividing method for gene injection partition keys.
Background
Once gene information accumulates beyond a certain volume, the application system runs into a data-processing bottleneck. The bottleneck shows up mainly as large single-table data volume, a wide range of services carried by one table, and complex computation, all of which can be relieved by splitting the database. Most existing database and table dividing methods distribute data evenly, which leaves the data distribution with little regularity, scatters associated data, and forces a global search whenever part of the data is queried or modified.
Many applications generate keys for new data, use the generated key as the partition key, and partition the data by range. In range partitioning, a partition is selected by determining whether the partition key falls within a range of values. A common key generation method is to use monotonically increasing keys, each obtained by incrementing the previous key value. This keeps the data clustered in key order, and it allocates space for new data at the end of a partition, which is usually much more efficient than inserting into the middle of existing data. One drawback of this approach is that new data is not immediately spread across partitions. Another drawback is that, under concurrent access in a multi-threaded application, a "hot spot" forms in the database, both during the initial insertion of new data and during subsequent accesses to it. Conflicts at a hot spot often cause contention between threads because of locking in a local area of the database and serialization of space allocation. In addition, if multiple data records are inserted for each generated key, a multi-threaded application may not get the full benefit of keeping inserted data in order, because two competing threads holding sequentially generated keys may interleave their data. When the underlying database management system uses page-level locking, interleaved data on the same page can cause deadlocks between application threads that are concurrently updating otherwise unrelated records.
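As a rough illustration of the range-partitioning behavior described above, the following Python sketch selects a partition by comparing a key against range boundaries and shows how monotonically increasing keys pile into a single partition. The boundary values and the names `UPPER_BOUNDS` and `select_range_partition` are illustrative assumptions, not part of the application.

```python
from bisect import bisect_right

# Assumed exclusive upper bounds of four range partitions:
# partition 0 holds keys 0..999, partition 1 holds 1000..1999, and so on.
UPPER_BOUNDS = [1000, 2000, 3000, 4000]


def select_range_partition(key: int) -> int:
    """Return the index of the partition whose value range contains key."""
    if key < 0 or key >= UPPER_BOUNDS[-1]:
        raise ValueError(f"key {key} is outside every partition range")
    return bisect_right(UPPER_BOUNDS, key)


# Ten consecutive (monotonically increasing) keys all map to one partition,
# which is the hot-spot behavior discussed in the text.
recent_partitions = {select_range_partition(k) for k in range(1500, 1510)}
print(recent_partitions)
```

Every new key lands in the partition that currently holds the top of the key range, so concurrent writers contend for the same region.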
One method commonly used to overcome these disadvantages is to reverse the bytes of the monotonically increasing key, which spreads new data evenly across the whole key range and places it in the partitions in a round-robin fashion. Hot spots are avoided because each subsequent insertion lands at a different point in the database. However, this approach does not keep the data well organized and may require more frequent data reorganization; moreover, insertions no longer happen at the end of existing data, which sacrifices insertion efficiency; finally, simply reversing bytes cannot guarantee that two consecutive keys fall in separate partitions, and if consecutive keys fall in the same partition, the multi-threaded workload will compete for resources within that partition. Another method for overcoming the disadvantages of monotonically increasing keys is to generate random keys, which has substantially the same advantages and disadvantages as byte reversal. There is therefore a need for a key generation process that achieves all of these goals, so as to maximize the performance benefits of database partitioning.
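The last caveat of byte reversal (inverting the byte order of the key) can be seen in a small Python sketch. It assumes 32-bit keys and four equal-width range partitions; `KEY_BYTES`, `PARTITION_COUNT`, `reverse_bytes`, and `partition_of` are hypothetical names chosen for this illustration only.

```python
KEY_BYTES = 4          # assumed key width: 32 bits
PARTITION_COUNT = 4    # assumed number of equal-width range partitions


def reverse_bytes(key: int) -> int:
    """Reverse the byte order of a fixed-width unsigned key."""
    return int.from_bytes(key.to_bytes(KEY_BYTES, "big"), "little")


def partition_of(reversed_key: int) -> int:
    """Map a 32-bit key to one of four equal-width range partitions."""
    span = (1 << (8 * KEY_BYTES)) // PARTITION_COUNT
    return reversed_key // span


# Partitions hit by keys 1, 2, 3 and by keys 63, 64, 65 after byte reversal.
low_run = [partition_of(reverse_bytes(k)) for k in (1, 2, 3)]
boundary_run = [partition_of(reverse_bytes(k)) for k in (63, 64, 65)]
print(low_run, boundary_run)
```

Under these assumptions the reversed keys are far apart in value, yet keys 1, 2, and 3 still fall in the same partition, so threads holding consecutive keys can still compete within one partition.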
The present application is directed to solving at least one of the technical problems mentioned in the background above or other related technical problems.
Disclosure of Invention
To solve at least one of the technical problems mentioned in the background or other related technical problems, the present application provides a database and table dividing method for gene injection partition keys, which addresses problems such as the large data volume of a single gene table, the wide range of services it carries, and complex computation.
A database and table dividing method for gene injection partition keys specifically comprises the following steps:
when the registration of new gene information is detected, judging whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy;
if the database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance;
if the database instance does not need to be created for the new gene information, binding the new gene information to the original database instance;
when detecting that the data volume in any one of the service tables in the new database instance or the original database instance reaches the data volume limited in the preset table dividing strategy during the automatic table dividing strategy scanning, dividing the service tables according to the preset table dividing configuration strategy;
in an information processing system, a sub-table holding multiple pieces of gene information is divided into a plurality of partitions, each partition having partition keys of a corresponding category, and the partition keys generated from the gene information are assigned to different partitions according to the gene information.
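The registration and binding steps above can be sketched in Python as follows. This is only a minimal sketch: the class `InstanceRegistry`, the rule that certain gene categories get a dedicated instance, and the instance naming scheme are assumptions standing in for the preset configuration strategies, which the application does not spell out.

```python
class InstanceRegistry:
    """Tracks which database instance each piece of gene information is bound to."""

    def __init__(self, dedicated_categories, original_instance="db_original"):
        # Assumed configuration strategy: categories listed here get their own instance.
        self.dedicated_categories = set(dedicated_categories)
        self.original_instance = original_instance
        self.instances = {original_instance}
        self.bindings = {}

    def needs_new_instance(self, category: str) -> bool:
        """Decide whether new gene information requires a new database instance."""
        return category in self.dedicated_categories

    def register(self, gene_id: str, category: str) -> str:
        """Bind newly registered gene information to a database instance."""
        if self.needs_new_instance(category):
            # Placeholder for fetching the partitioning strategy from the
            # remote configuration center and creating a real instance.
            instance = f"db_{gene_id}"
            self.instances.add(instance)
        else:
            instance = self.original_instance
        self.bindings[gene_id] = instance
        return instance


registry = InstanceRegistry(dedicated_categories={"high_volume"})
print(registry.register("gene_a", "high_volume"))
print(registry.register("gene_b", "standard"))
```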
Further, the partition key generating step includes:
1. generating a sub-range identifier located within the designated partition;
2. generating a sequence number for a specific key value within the designated sub-range;
3. generating the partition key by combining the sub-range identifier and the sequence number.
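One possible way to combine a sub-range number with a sequence number is to pack them into a single integer. The sketch below assumes a bit-packed layout with a 20-bit sequence field; `SEQ_BITS`, `make_partition_key`, and `split_partition_key` are hypothetical, and the application does not state how the two parts are actually combined.

```python
SEQ_BITS = 20  # assumed width of the sequence-number field


def make_partition_key(sub_range_id: int, sequence: int) -> int:
    """Combine a sub-range number and a sequence number into one key."""
    if sub_range_id < 0:
        raise ValueError("sub_range_id must be non-negative")
    if not 0 <= sequence < (1 << SEQ_BITS):
        raise ValueError("sequence does not fit in the sequence field")
    return (sub_range_id << SEQ_BITS) | sequence


def split_partition_key(key: int) -> tuple[int, int]:
    """Recover the sub-range number and sequence number from a key."""
    return key >> SEQ_BITS, key & ((1 << SEQ_BITS) - 1)


key = make_partition_key(sub_range_id=3, sequence=5)
print(key, split_partition_key(key))
```

With this layout, keys that share a sub-range number stay contiguous and ordered by sequence number, while different sub-range numbers occupy disjoint key ranges.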
Further, step S500 further includes the following steps:
(1) maintaining, for each partition, an activity indicator that indicates the most recent activity in that partition;
(2) in response to a request for a partition key, selecting the partition with the least recent activity as indicated by the activity indicator, and generating the partition key from the range of partition keys corresponding to the selected partition.
Further, selecting the partition with the least recent activity as indicated by the activity indicator includes:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) selecting that partition if only one partition has the lowest count; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition with the largest number of keys that have been reserved but not yet allocated.
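Rules (a) to (c) above amount to a minimum search with a tie-break. A minimal Python sketch follows, assuming the activity indicator is a count of current requesters; the `PartitionState` record and the function `pick_least_active` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    name: str
    active_requesters: int     # assumed form of the activity indicator
    reserved_unallocated: int  # keys reserved for the partition but not yet handed out


def pick_least_active(partitions: list[PartitionState]) -> PartitionState:
    """Pick the partition with the fewest requesters, breaking ties by spare keys."""
    lowest = min(p.active_requesters for p in partitions)
    candidates = [p for p in partitions if p.active_requesters == lowest]
    if len(candidates) == 1:
        return candidates[0]
    # Tie: prefer the partition holding the most reserved but unallocated keys.
    return max(candidates, key=lambda p: p.reserved_unallocated)


states = [
    PartitionState("p0", active_requesters=2, reserved_unallocated=10),
    PartitionState("p1", active_requesters=0, reserved_unallocated=4),
    PartitionState("p2", active_requesters=0, reserved_unallocated=9),
]
print(pick_least_active(states).name)
```

Preferring the partition with the most reserved but unallocated keys plausibly avoids having to reserve a fresh block of keys before answering the request.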
Further, the method further comprises: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by their current concurrency numbers and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of the sub-tables in that sub-database;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling the sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
Further, before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further includes:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
determining whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
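The scheduling rule above can be sketched as a priority fill: take sub-tables from the sub-database with the lowest concurrency first, and top up from busier sub-databases only when the first one cannot fill the quota. The function `schedule_sub_tables` and the representation of the waiting queue as `(sub_database, sub_table)` pairs are assumptions made for this sketch.

```python
def schedule_sub_tables(wait_queue, concurrency, idle_threads):
    """Pick up to idle_threads queued sub-tables, least-busy sub-database first.

    wait_queue   -- list of (sub_database, sub_table) pairs in arrival order
    concurrency  -- mapping of sub_database -> current concurrency number
    idle_threads -- number of idle threads, used as the scheduling number
    """
    # Lower concurrency means higher priority; the name breaks ties deterministically.
    ordered_dbs = sorted(
        {db for db, _ in wait_queue},
        key=lambda db: (concurrency.get(db, 0), db),
    )
    picked = []
    for db in ordered_dbs:
        for item in wait_queue:
            if len(picked) >= idle_threads:
                return picked
            if item[0] == db:
                picked.append(item)
    return picked


queue = [("db_a", "t1"), ("db_b", "t2"), ("db_a", "t3"), ("db_b", "t4")]
load = {"db_a": 5, "db_b": 1}
print(schedule_sub_tables(queue, load, idle_threads=3))
```

Here `db_b` has the lower concurrency, so both of its queued sub-tables are taken first and the remaining slot is filled from `db_a`.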
A database and table dividing apparatus, comprising:
a memory storing a computer program which, when executed by the apparatus, performs the method;
a processor for executing the computer program.
A storage medium, characterized by: the storage medium stores computer instructions, and the computer instructions realize the method when executed.
The above-described preferred conditions may be combined with each other to obtain a specific embodiment, in accordance with common knowledge in the art.
The effects provided in the summary of the invention are only the effects of the embodiments, not all of the effects of the invention, and one of the above technical solutions has the following advantages or beneficial effects:
the database and table dividing method for gene injection partition keys can divide databases and tables along multiple dimensions, enables precise operation and maintenance of the application system, greatly reduces the operation and maintenance workload and labor intensity, and improves the safety of using the divided tables.
By adopting the above technical solution, the invention remedies the shortcomings of the prior art, is reasonably designed, and is convenient to operate.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of the database and table dividing method provided by an embodiment of the present invention;
FIG. 2 shows the partition key generation steps provided by an embodiment of the present invention.
Detailed Description
To explain the technical features of the present invention clearly, the present application is described in detail below through specific embodiments in combination with the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. To simplify the disclosure, specific example components and arrangements are described below. In addition, the present invention may repeat reference numerals and/or letters in the various examples; this repetition is for simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or arrangements discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components, processing techniques, and procedures are omitted so as not to unnecessarily limit the invention.
The present application is specifically described below with specific examples.
Example 1:
As shown in FIG. 1, a database and table dividing method for gene injection partition keys is provided, which comprises the following steps.
S100, when registration of new gene information is detected, determining whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy.
S200, if a database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance.
S300, if no database instance needs to be created for the new gene information, binding the new gene information to the original database instance.
S400, when an automatic table dividing strategy scan detects that the data volume of any service table in the new database instance or the original database instance has reached the limit set in the preset table dividing strategy, dividing that service table according to the preset table dividing configuration strategy.
S500, in the information processing system, dividing a sub-table holding multiple pieces of gene information into a plurality of partitions, each partition having partition keys of a corresponding category, the partition keys generated from the gene information being assigned to different partitions according to the gene information. As shown in FIG. 2, the partition key generating step includes:
1. generating a sub-range identifier located within the designated partition;
2. generating a sequence number for a specific key value within the designated sub-range;
3. generating the partition key by combining the sub-range identifier and the sequence number.
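The threshold scan of step S400 can be sketched as a comparison of each service table's row count against a configured limit. In the Python sketch below, the use of row counts as the measure of data volume, the suffix-based naming of the resulting sub-tables, and the function names are all assumptions; the application leaves the concrete table dividing configuration strategy open.

```python
def tables_due_for_split(row_counts: dict[str, int], max_rows: int) -> list[str]:
    """Return the service tables whose data volume has reached the limit."""
    return sorted(name for name, rows in row_counts.items() if rows >= max_rows)


def sub_table_names(table: str, count: int) -> list[str]:
    """Name the sub-tables a service table would be divided into (assumed scheme)."""
    return [f"{table}_{i}" for i in range(count)]


def scan_and_plan(row_counts: dict[str, int], max_rows: int, sub_table_count: int = 2):
    """One scan pass: map each oversized table to its planned sub-tables."""
    return {
        table: sub_table_names(table, sub_table_count)
        for table in tables_due_for_split(row_counts, max_rows)
    }


counts = {"gene_sample": 120, "gene_variant": 50}
print(scan_and_plan(counts, max_rows=100))
```

The sketch only plans the division; moving rows into the sub-tables would depend on the database in use.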
Example 2:
on the basis of the foregoing embodiment, step S500 further includes the steps of:
(1) maintaining, for each partition, an activity indicator that indicates the most recent activity in that partition;
(2) in response to a request for a partition key, selecting the partition with the least recent activity as indicated by the activity indicator, and generating the partition key from the range of partition keys corresponding to the selected partition.
Selecting the partition with the least recent activity as indicated by the activity indicator comprises:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) selecting that partition if only one partition has the lowest count; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition with the largest number of keys that have been reserved but not yet allocated.
Example 3:
on the basis of the foregoing embodiment, the database and table dividing method further includes: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by their current concurrency numbers and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of the sub-tables in that sub-database;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling the sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
Before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further includes:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
determining whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
Example 4:
on the basis of the foregoing embodiments, there is provided a database and table dividing apparatus, including:
a memory storing a computer program which, when executed by the apparatus, performs the method;
a processor for executing the computer program.
Example 5:
on the basis of the foregoing embodiments, a storage medium is provided, which stores computer instructions that, when executed, implement the foregoing method.
It should be understood that while the invention has been described in conjunction with the accompanying drawings and detailed description, the foregoing description is not intended to limit the scope of the invention. A person skilled in the art may, on the basis of the above description, make modifications or alterations of different forms, the results of which still fall within the scope of protection of the present application. It is neither necessary nor possible to enumerate all embodiments here. On the basis of the technical solution of the present invention, those skilled in the art can make various modifications or variations without creative effort, and these remain within the scope of the present invention.
Furthermore, matters not described in detail in the present application are well known in the art and are not repeated here.

Claims (8)

1. A database and table dividing method for gene injection partition keys, characterized by comprising the following steps:
S100, when registration of new gene information is detected, determining whether a database instance needs to be created for the new gene information according to a preset gene information configuration strategy;
S200, if a database instance needs to be created for the new gene information, acquiring a preset database partitioning configuration strategy from a remote configuration center, automatically creating a corresponding new database instance for the new gene information according to the preset database partitioning configuration strategy, and creating a binding relationship between the new gene information and the new database instance;
S300, if no database instance needs to be created for the new gene information, binding the new gene information to the original database instance;
S400, when an automatic table dividing strategy scan detects that the data volume of any service table in the new database instance or the original database instance has reached the limit set in the preset table dividing strategy, dividing that service table according to the preset table dividing configuration strategy;
S500, in the information processing system, dividing a sub-table holding multiple pieces of gene information into a plurality of partitions, each partition having partition keys of a corresponding category, the partition keys generated from the gene information being assigned to different partitions according to the gene information.
2. The method according to claim 1, wherein the partition key generating step comprises:
1. generating a sub-range identifier located within the designated partition;
2. generating a sequence number for a specific key value within the designated sub-range;
3. generating the partition key by combining the sub-range identifier and the sequence number.
3. The method of claim 1, wherein step S500 further comprises the steps of:
(1) maintaining, for each partition, an activity indicator that indicates the most recent activity in that partition;
(2) in response to a request for a partition key, selecting the partition with the least recent activity as indicated by the activity indicator, and generating the partition key from the range of partition keys corresponding to the selected partition.
4. The method of claim 3, wherein selecting the partition with the least recent activity as indicated by the activity indicator comprises:
(a) determining the one or more partitions that have the lowest count of requesters accessing the partition;
(b) selecting that partition if only one partition has the lowest count; and
(c) if a plurality of partitions have the lowest count, selecting from among them the partition with the largest number of keys that have been reserved but not yet allocated.
5. The method according to claim 4, characterized in that the method further comprises: scheduling a corresponding number of sub-tables from the waiting queue of any scheduling unit according to the current concurrency number of each sub-database, which specifically comprises the following steps:
sorting the sub-databases by their current concurrency numbers and determining the scheduling priority of the sub-tables belonging to different sub-databases, wherein the lower the current concurrency number of a sub-database, the higher the scheduling priority of the sub-tables in that sub-database;
determining the scheduling number according to the number of idle threads in the scheduling unit;
after the sub-tables in a first sub-database are determined to have the highest scheduling priority, scheduling the sub-tables belonging to the first sub-database from the waiting queue according to the scheduling number.
6. The method of claim 5, wherein before scheduling the corresponding number of sub-tables from the waiting queue of any scheduling unit, the method further comprises:
reading the number of sub-tables in the waiting queue that belong to the first sub-database;
determining whether the number of sub-tables belonging to the first sub-database is greater than or equal to the scheduling number;
if the number is greater than or equal to the scheduling number, proceeding to the step of scheduling the corresponding number of sub-tables from the waiting queue according to the scheduling number;
if the number is less than the scheduling number, scheduling sub-tables belonging to the first sub-database and to other sub-databases from the waiting queue according to the scheduling number, wherein the other sub-databases are sub-databases whose current concurrency number is greater than that of the first sub-database.
7. A database and table dividing apparatus, characterized by comprising:
a memory storing a computer program that, when executed by the apparatus, performs the method of any of claims 1-6;
a processor for executing the computer program.
8. A storage medium, characterized by: the storage medium stores computer instructions that, when executed, implement the method of any of claims 1-6.
CN202211354004.0A 2022-11-01 2022-11-01 Database and table dividing method for gene injection partition keys Pending CN115563123A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211354004.0A CN115563123A (en) 2022-11-01 2022-11-01 Database and table dividing method for gene injection partition keys

Publications (1)

Publication Number Publication Date
CN115563123A true CN115563123A (en) 2023-01-03

Family

ID=84769435

Country Status (1)

Country Link
CN (1) CN115563123A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination