CN117725066A - Partition data merging method, medium and computer equipment of database - Google Patents

Publication number
CN117725066A
Authority
CN
China
Prior art keywords
data
partition data
partition
merging
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311774895.XA
Other languages
Chinese (zh)
Inventor
王建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingbase Information Technologies Co Ltd
Original Assignee
Beijing Kingbase Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingbase Information Technologies Co Ltd filed Critical Beijing Kingbase Information Technologies Co Ltd
Priority to CN202311774895.XA priority Critical patent/CN117725066A/en
Publication of CN117725066A publication Critical patent/CN117725066A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a partition data merging method of a database, a medium, and computer equipment. The method comprises the following steps: obtaining partition data to be merged in a database; determining, among the partition data, the partition data to be inserted and the receiving partition data into which it is inserted; acquiring the data files of both the partition data to be inserted and the receiving partition data; merging the data file corresponding to the partition data to be inserted into the data file corresponding to the receiving partition data; and synchronously merging the index trees corresponding to the data files to obtain merged partition data. The method omits row-by-row insertion and row-by-row index merging when merging partition data, thereby greatly improving the merging speed.

Description

Partition data merging method, medium and computer equipment of database
Technical Field
The present invention relates to the field of databases, and in particular, to a method, medium, and computer device for merging partition data of a database.
Background
With the advent of the big data era, the rapid growth of data volumes has presented a significant challenge to database management systems. To improve the efficiency of data querying and processing, data partitioning techniques are widely used in a variety of database systems. Data partitioning divides a large data set into smaller, more manageable subsets, thereby improving query performance and data management efficiency.
However, because the data amounts of the partitions differ, some partitions may hold little data while others hold a great deal. As a result, merging partition data can consume significant time and resources, which affects overall system performance.
Disclosure of Invention
In view of the foregoing, the present invention provides a method, medium, and computer device for merging partition data of a database that overcomes or at least partially solves the foregoing problems.
It is an object of the present invention to increase the speed of partition data merging.
The partition data merging method of a database comprises:
obtaining partition data to be merged in a database;
determining, among the partition data, the partition data to be inserted and the receiving partition data into which it is inserted;
acquiring the data files of both the partition data to be inserted and the receiving partition data;
merging the data file corresponding to the partition data to be inserted into the data file corresponding to the receiving partition data;
and synchronously merging the index trees corresponding to the data files to obtain merged partition data.
Optionally, the step of synchronously merging the index trees corresponding to the data files includes:
acquiring the index tree to be inserted, which corresponds to the partition data to be inserted, and the receiving index tree, which corresponds to the receiving partition data;
merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree.
Optionally, the step of merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree includes:
increasing the pointer in each leaf node of the index tree to be inserted by the overall offset of the receiving partition data, wherein each pointer records the offset of a row of data in the partition data to be inserted;
merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree.
Optionally, the step of determining the partition data to be inserted and the receiving partition data among the partition data includes:
screening target data corresponding to a target version from the partition data;
counting the data amount of the target data in each partition data;
determining the partition data to be inserted and the receiving partition data according to the data amount.
Optionally, the step of determining the partition data to be inserted and the receiving partition data according to the data amount includes:
sorting the partition data according to the data amount to obtain a data amount list of the partition data;
and, according to the data amount list, determining the partition data with the smaller data amount as the partition data to be inserted and the partition data with the larger data amount as the receiving partition data.
Optionally, the step of acquiring the data files of both the partition data to be inserted and the receiving partition data includes:
obtaining the target data corresponding to the target version in the partition data to be inserted and in the receiving partition data;
adding a deletion mark to the data other than the target data in the partition data to be inserted and in the receiving partition data;
freezing the target data;
and acquiring the data file corresponding to the target data.
Optionally, the step of freezing the target data includes:
changing the insertion transaction ID in the target data to a frozen transaction ID, wherein the frozen transaction ID indicates that the target data is in a frozen state.
Optionally, the step of merging the data file corresponding to the partition data to be inserted into the data file corresponding to the receiving partition data includes:
appending the data file corresponding to the partition data to be inserted to the rear of the data file corresponding to the receiving partition data.
According to yet another aspect of the present invention, there is also provided a machine-readable storage medium having stored thereon a machine-executable program which, when executed by a processor, implements the partition data merging method of a database of any of the above.
According to still another aspect of the present invention, there is also provided a computer device including a memory, a processor, and a machine-executable program stored on the memory and running on the processor, wherein the processor implements the partition data merging method of a database of any of the above when executing the machine-executable program.
The partition data merging method of a database first obtains the partition data to be merged in the database, then determines, among the partition data, the partition data to be inserted and the receiving partition data into which it is inserted, then acquires the data files of both, merges the data file corresponding to the partition data to be inserted into the data file corresponding to the receiving partition data, and synchronously merges the index trees corresponding to the data files, thereby obtaining merged partition data. The method omits row-by-row insertion and row-by-row index merging when merging partition data, thereby greatly improving the merging speed.
The above, as well as additional objectives, advantages, and features of the present invention will become apparent to those skilled in the art from the following detailed description of a specific embodiment of the present invention when read in conjunction with the accompanying drawings.
Drawings
Some specific embodiments of the invention will be described in detail hereinafter by way of example and not by way of limitation with reference to the accompanying drawings. The same reference numbers will be used throughout the drawings to refer to the same or like parts or portions. It will be appreciated by those skilled in the art that the drawings are not necessarily drawn to scale. In the accompanying drawings:
FIG. 1 is a flow diagram of a method for merging partitioned data of a database according to one embodiment of the invention;
FIG. 2 is a flow chart of a method of merging partitioned data of a database according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of a machine-readable storage medium in a method of merging partition data of a database according to one embodiment of the invention; and
FIG. 4 is a schematic diagram of a computer device in a method for merging partitioned data of a database according to one embodiment of the invention.
Detailed Description
It should be understood by those skilled in the art that the embodiments described below are only some, not all, of the embodiments of the present invention; they are intended to explain the technical principles of the present invention and are not intended to limit its scope. All other embodiments that a person skilled in the art can obtain without inventive effort, based on the embodiments provided by the present invention, shall still fall within the scope of protection of the present invention.
FIG. 1 is a flow diagram of a method for merging partitioned data of a database according to one embodiment of the invention. In this embodiment, the process may generally include:
Step S101, partition data to be merged in the database is obtained. In this embodiment, database partitioning is a physical database design technique whose main purpose is to reduce the total amount of data read and written in a specific SQL operation, and thereby reduce response time. Partitioning takes two main forms: horizontal partitioning and vertical partitioning.
Horizontal partitioning partitions the rows of a table, so that different groups of rows are stored in different partitions, either individually (a single partition) or collectively (one or more partitions). All columns defined in the table can be found in each data set, so the properties of the table are still maintained.
Vertical partitioning partitions the columns of a table, with some columns placed in one partition and other columns in another. It typically reduces the width of the target table: certain columns are assigned to certain partitions, and each partition contains the rows corresponding to its columns. This partitioning mode can reduce the amount of data that a query needs to scan and improve query efficiency.
However, after data is partitioned as above, several pieces of partition data may need to be merged. In that case the partition data is generally merged row by row once it has been determined, but when the data amount is large this merging manner executes very slowly and affects normal use of the database. A method capable of improving the merging speed of partition data is therefore needed.
Step S102, the partition data to be inserted and the receiving partition data are determined among the partition data. In this embodiment, the partition data to be inserted is partition data that needs to be merged into other partition data, and the receiving partition data is partition data into which other partition data is merged.
Step S103, the data files of both the partition data to be inserted and the receiving partition data are acquired.
Step S104, the data file corresponding to the partition data to be inserted is merged into the data file corresponding to the receiving partition data. In this step, to increase the merging speed, the data files are spliced directly, so that the database avoids merging row by row and the merging speed of the partition data is greatly increased.
Step S105, the index trees corresponding to the data files are synchronously merged to obtain merged partition data. In this embodiment, after the data files are merged as a whole in step S104, the index trees corresponding to the data files need to be updated to ensure that the merged partition data can be used normally.
The method omits row-by-row insertion and row-by-row index merging when merging partition data, thereby greatly improving the merging speed.
In some alternative embodiments, the step of synchronously merging the index trees corresponding to the data files may generally include: acquiring the index tree to be inserted, which corresponds to the partition data to be inserted, and the receiving index tree, which corresponds to the receiving partition data; and merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree. In this step, after the data files are merged, the corresponding index trees need to be synchronously updated; to increase the merging speed, the leaf nodes of the index tree to be inserted may be merged into the leaf nodes of the receiving index tree. The index tree corresponding to partition data generally stores each piece of data and its specific position in the leaf nodes, while the parent node of a leaf node generally records the position of that leaf node, so the leaf nodes of several data partitions can be merged directly. During merging, a leaf node may end up holding too many or too few entries, in which case it needs to be split or merged. Those skilled in the art may perform the corresponding merging and splitting operations depending on the particular type of index tree.
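By way of illustration only, the leaf-level merge described above can be sketched as follows. The sketch assumes a B+-tree-like index whose leaves hold key-ordered (key, row offset) pairs; the names `merge_leaves`, `Entry`, and `capacity` are hypothetical and do not come from this description, and repacking into fixed-capacity leaves merely stands in for whatever split or merge rule a particular index type uses.

```python
from heapq import merge
from typing import List, Tuple

Entry = Tuple[int, int]  # (key, row offset in the data file); hypothetical layout


def merge_leaves(receiving: List[List[Entry]],
                 to_insert: List[List[Entry]],
                 capacity: int = 4) -> List[List[Entry]]:
    """Merge two key-ordered leaf levels and repack them into bounded leaves."""
    # Flatten each leaf level; entries are assumed key-ordered within and across leaves.
    left = (entry for leaf in receiving for entry in leaf)
    right = (entry for leaf in to_insert for entry in leaf)
    # Single linear pass over two sorted streams, with no per-row index insertion.
    merged = list(merge(left, right))
    # Repacking stands in for splitting overfull leaves and merging underfull ones.
    return [merged[i:i + capacity] for i in range(0, len(merged), capacity)]
```

In a real index the parent nodes would then have to be rebuilt or adjusted to point at the repacked leaves; that part depends on the index type and is not shown.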
In this embodiment, the step of merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree may generally include: increasing the pointer in each leaf node of the index tree to be inserted by the overall offset of the receiving partition data, wherein each pointer records the offset of a row of data in the partition data to be inserted; and merging the leaf nodes of the index tree to be inserted into the leaf nodes of the receiving index tree. In this step, because the data file corresponding to the partition data to be inserted is merged directly into the data file corresponding to the receiving partition data, the offset corresponding to each leaf node in the index tree of the receiving partition data is unchanged, whereas the index tree of the partition data to be inserted needs to be shifted as a whole by the data length of the receiving partition data, that is, by the overall offset of the receiving partition data. Only a simple addition is therefore needed to update the index of the partition data, which further improves the merging speed.
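The pointer adjustment can be illustrated with a minimal sketch. It assumes each leaf entry stores a byte offset into its own partition's data file; `LeafEntry` and `shift_pointers` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LeafEntry:
    key: int
    offset: int  # offset of the row within its partition's data file (assumed layout)


def shift_pointers(entries: List[LeafEntry], receiving_length: int) -> List[LeafEntry]:
    """Shift every pointer of the index tree to be inserted by the receiving data length."""
    # One addition per entry; entries of the receiving index tree are left untouched.
    return [LeafEntry(e.key, e.offset + receiving_length) for e in entries]
```

For example, if the receiving data file is 48 bytes long, rows that sat at offsets 0 and 16 in the file to be inserted are found at offsets 48 and 64 after the files are spliced.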
In other alternative embodiments, the step of determining the partition data to be inserted and the receiving partition data among the partition data may generally include: screening target data corresponding to the target version from the partition data; counting the data amount of the target data in each partition data; and determining the partition data to be inserted and the receiving partition data according to the data amount. In this step, because the merging speed should be improved as much as possible, the data amount of each partition data is counted to distinguish the partition data to be inserted from the receiving partition data; that is, the partition data with the smaller data amount is chosen as the data to be moved, which improves the merging speed.
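As a rough sketch of the screening and counting just described, the following assumes each row can be summarized as a (partition name, version) pair; this summary and the name `count_target_rows` are hypothetical simplifications, not structures defined in this description.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Row = Tuple[str, int]  # (partition name, version); hypothetical row summary


def count_target_rows(rows: Iterable[Row], target_version: int) -> Dict[str, int]:
    """Count, per partition, the rows that belong to the target version."""
    # Rows of other versions are screened out before counting.
    counts = Counter(part for part, version in rows if version == target_version)
    return dict(counts)
```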
An alternative way also exists, for example: directly calculating the size of the data file corresponding to each partition data, or the size of the file corresponding to its index, and then selecting the partition data with the smaller file size as the partition data to be inserted. Those skilled in the art can choose whichever way is fastest in the actual situation to distinguish the partition data to be inserted from the receiving partition data.
In this embodiment, the step of determining the partition data to be inserted and the receiving partition data according to the data amount may generally include: sorting the partition data according to the data amount to obtain a data amount list of the partition data; and, according to the data amount list, determining the partition data with the smaller data amount as the partition data to be inserted and the partition data with the larger data amount as the receiving partition data. In this step, when many pieces of partition data are to be merged, they may be sorted by data amount, so that the partition data to be inserted and the receiving partition data are determined from the counted data amounts.
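The selection by sorted data amount might look like the sketch below. Treating the single largest partition as the receiving partition and every smaller one as partition data to be inserted is an assumption made here for the multi-partition case; the description only states that the smaller amount is inserted into the larger. The name `choose_roles` is hypothetical.

```python
from typing import Dict, List, Tuple


def choose_roles(data_amounts: Dict[str, int]) -> Tuple[List[str], str]:
    """Return (partitions to be inserted, receiving partition) from per-partition amounts."""
    if len(data_amounts) < 2:
        raise ValueError("at least two partitions are needed for a merge")
    # Data amount list: partition names ordered from smallest to largest amount.
    ordered = sorted(data_amounts, key=lambda name: data_amounts[name])
    # Assumption: the largest partition receives, so the least data is moved.
    return ordered[:-1], ordered[-1]
```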
In some alternative embodiments, the step of acquiring the data files of both the partition data to be inserted and the receiving partition data may generally include: obtaining the target data corresponding to the target version in the partition data to be inserted and in the receiving partition data; adding a deletion mark to the data other than the target data in the partition data to be inserted and in the receiving partition data; freezing the target data; and acquiring the data file corresponding to the target data. In this step, most current databases use an MVCC (Multi-Version Concurrency Control) mechanism, so different versions of the same data may exist in each partition data, and the merging process usually uses the data of only one version. Once the target version is determined, the data of the other versions may therefore be given deletion marks, so that only the data corresponding to the target version is visible in the partition data. The partition data to be merged is then frozen, which prevents database operations received during merging from causing errors in the merge or in query results. After the processed data files are obtained, the corresponding merging operation can be performed.
In other alternative embodiments, the step of freezing the target data may generally include: changing the insertion transaction ID in the target data to a frozen transaction ID, wherein the frozen transaction ID indicates that the target data is in a frozen state. In this step, the insertion transaction ID refers to the ID (XID) of the transaction that inserted the corresponding version (row version) of the row data. A row version is a specific state of a data row, and each update operation creates a new row version for the same logical row. Because the data is being merged at this time, the insertion transaction ID is changed to the frozen transaction ID, which indicates that the target data is in a frozen state and ensures normal execution of the merging process.
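The deletion marking and freezing described above can be sketched on a simplified in-memory row model. The `RowVersion` fields and the function name are hypothetical; the constant value 2 is an assumption that mirrors PostgreSQL's `FrozenTransactionId` and is used purely as an example, since this description does not name a specific value.

```python
from dataclasses import dataclass
from typing import List

FROZEN_XID = 2  # assumption: illustrative value, mirrors PostgreSQL's FrozenTransactionId


@dataclass
class RowVersion:
    key: int
    version: int      # hypothetical tag used to pick the target version
    insert_xid: int   # ID of the transaction that inserted this row version
    deleted: bool = False


def prepare_for_merge(rows: List[RowVersion], target_version: int) -> List[RowVersion]:
    """Delete-mark non-target row versions, freeze target ones, and return the targets."""
    targets = []
    for row in rows:
        if row.version == target_version:
            row.insert_xid = FROZEN_XID  # marks the target row version as frozen
            targets.append(row)
        else:
            row.deleted = True  # other versions are hidden from the merge
    return targets
```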
In still other alternative embodiments, the step of merging the data file corresponding to the partition data to be inserted into the data file corresponding to the receiving partition data may generally include: appending the data file corresponding to the partition data to be inserted to the rear of the data file corresponding to the receiving partition data. In this step, once the partition data to be inserted and the receiving partition data have been determined and the data processing of both is complete, the formal merging operation is performed. To increase the merging speed, the files are spliced directly, which avoids row-by-row insertion.
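A minimal sketch of the file splicing follows, assuming each partition's data lives in a single flat file; real database files are typically page-structured, which this sketch ignores. The function name `append_data_file` is hypothetical. The returned length is the overall offset by which pointers of the index tree to be inserted would be shifted.

```python
import os
import shutil


def append_data_file(receiving_path: str, to_insert_path: str) -> int:
    """Append the file to be inserted to the rear of the receiving file.

    Returns the length of the receiving file before the append.
    """
    base_offset = os.path.getsize(receiving_path)
    with open(receiving_path, "ab") as dst, open(to_insert_path, "rb") as src:
        shutil.copyfileobj(src, dst)  # bulk byte copy instead of row-by-row insertion
    return base_offset
```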
Fig. 2 is a flow chart of a method for merging partition data of a database according to another embodiment of the present invention. In some alternative embodiments, the present process may generally include:
In this embodiment, database partitioning is a physical database design technique whose main purpose is to reduce the total amount of data read and written in a specific SQL operation, and thereby reduce response time. Partitioning takes two main forms: horizontal partitioning and vertical partitioning. Horizontal partitioning partitions the rows of a table, so that different groups of rows are stored in different partitions, either individually (a single partition) or collectively (one or more partitions). All columns defined in the table can be found in each data set, so the properties of the table are still maintained. Vertical partitioning partitions the columns of a table, with some columns placed in one partition and other columns in another. It typically reduces the width of the target table: certain columns are assigned to certain partitions, and each partition contains the rows corresponding to its columns. This partitioning mode can reduce the amount of data that a query needs to scan and improve query efficiency. However, after data is partitioned as above, several pieces of partition data may need to be merged. In that case the partition data is generally merged row by row once it has been determined, but when the data amount is large this merging manner executes very slowly and affects normal use of the database. A method capable of improving the merging speed of partition data is therefore needed.
Step S201, partition data to be merged in the database is obtained.
Step S202, target data corresponding to the target version is screened from the partition data. In this step, most current databases use an MVCC mechanism, so different versions of the same data may exist in each partition data, and the merging process usually uses the data of a particular version. Once the target version is determined, screening is therefore needed to determine the data that actually needs to be merged.
Step S203, the data amount of the target data in each partition data is counted. In this step, because the merging speed should be improved as much as possible, the data amount of each partition data is counted to distinguish the partition data to be inserted from the receiving partition data; that is, the partition data with the smaller data amount is chosen as the data to be moved, which improves the merging speed.
Step S204, the partition data is sorted according to the data amount to obtain a data amount list of the partition data.
Step S205, according to the data amount list, the partition data with the smaller data amount is determined as the partition data to be inserted, and the partition data with the larger data amount is determined as the receiving partition data. In this step, when many pieces of partition data are to be merged, they may be sorted by data amount, so that the partition data to be inserted and the receiving partition data are determined from the counted data amounts.
An alternative way also exists, for example: directly calculating the size of the data file corresponding to each partition data, or the size of the file corresponding to its index, and then selecting the partition data with the smaller file size as the partition data to be inserted. Those skilled in the art can choose whichever way is fastest in the actual situation to distinguish the partition data to be inserted from the receiving partition data.
Step S206, the target data corresponding to the target version in the partition data to be inserted and in the receiving partition data is obtained.
Step S207, a deletion mark is added to the data other than the target data in the partition data to be inserted and in the receiving partition data. In this step, once the target version is determined, the data of the other versions may be given deletion marks, so that only the data corresponding to the target version is visible in the partition data.
Step S208, the target data is frozen. In this step, the partition data to be merged is frozen, which prevents database operations received during merging from causing errors in the merge or in query results; after the processed data files are obtained, the corresponding merging operation can be performed. One optional operation is, for example, changing the insertion transaction ID in the target data to a frozen transaction ID, wherein the frozen transaction ID indicates that the target data is in a frozen state. The insertion transaction ID refers to the ID (XID) of the transaction that inserted the corresponding version of the row data. A row version is a specific state of a data row, and each update operation creates a new row version for the same logical row. Because the data is being merged at this time, the insertion transaction ID is changed to the frozen transaction ID, which indicates that the target data is in a frozen state and ensures normal execution of the merging process.
Step S209, the data file corresponding to the partition data to be inserted is appended to the rear of the data file corresponding to the receiving partition data. In this step, once the partition data to be inserted and the receiving partition data have been determined and the data processing of both is complete, the formal merging operation is performed. To increase the merging speed, the files are spliced directly, which avoids row-by-row insertion.
Step S210, the index trees corresponding to the data files are synchronously merged to obtain merged partition data. In this step, after the data files are merged, the corresponding index trees need to be synchronously updated. Because the data file corresponding to the partition data to be inserted is merged directly into the data file corresponding to the receiving partition data, the offset corresponding to each leaf node in the index tree of the receiving partition data is unchanged, whereas the index tree of the partition data to be inserted needs to be shifted as a whole by the data length of the receiving partition data, that is, by the overall offset of the receiving partition data. Only a simple addition is therefore needed to update the index of the partition data, which improves the merging speed. Then, to further increase the merging speed, the leaf nodes of the index tree to be inserted may be merged into the leaf nodes of the receiving index tree. The index tree corresponding to partition data generally stores each piece of data and its specific position in the leaf nodes, while the parent node of a leaf node generally records the position of that leaf node, so the leaf nodes of several data partitions can be merged directly. During merging, a leaf node may end up holding too many or too few entries, in which case it needs to be split or merged. Those skilled in the art may perform the corresponding merging and splitting operations depending on the particular type of index tree.
The method omits row-by-row insertion and row-by-row index merging when merging partition data, thereby greatly improving the merging speed.
This embodiment also provides a machine-readable storage medium and a computer device. Fig. 3 is a schematic diagram of a machine-readable storage medium 301 according to one embodiment of the invention, and Fig. 4 is a schematic diagram of a computer device 403 according to one embodiment of the invention.
The machine-readable storage medium 301 has stored thereon a machine-executable program 302, which when executed by a processor, implements the partition data merging method of a database of any of the embodiments described above.
The computer device 403 may include a memory 401, a processor 402, and a machine executable program 302 stored on the memory 401 and running on the processor 402, and the processor 402 implements the partition data merging method of the database of any of the embodiments described above when executing the machine executable program 302.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, e.g., merging data files, may be embodied in any machine-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
For the purposes of this description of embodiments, a machine-readable storage medium 301 can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the machine-readable storage medium 301 include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the machine-readable storage medium 301 may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
The computer device 403 may be, for example, a server, a desktop computer, a notebook computer, a tablet computer, or a smartphone. In some examples, computer device 403 may be a cloud computing node. Computer device 403 may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer device 403 may be implemented in a distributed cloud computing environment in which remote processing devices that are linked through a communications network perform tasks. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Computer device 403 may include a processor 402 adapted to execute stored instructions and a memory 401 that provides temporary storage space for those instructions during operation. Processor 402 may be a single-core processor, a multi-core processor, a computing cluster, or any number of other configurations. Memory 401 may include random access memory (RAM), read-only memory, flash memory, or any other suitable storage system.
The processor 402 may be connected via a system interconnect (e.g., PCI-Express) to an I/O interface (input/output interface) adapted to connect the computer device 403 to one or more I/O devices (input/output devices). The I/O devices may include, for example, a keyboard and a pointing device, where the pointing device may include a touch pad or a touch screen, among others. An I/O device may be a built-in component of the computer device 403 or may be externally connected to the computer device 403.
The processor 402 may also be linked through the system interconnect to a display interface adapted to connect the computer device 403 to a display device. The display device may include a display screen that is a built-in component of the computer device 403. The display device may also include a computer monitor, television, projector, or the like that is externally connected to the computer device 403. Further, a network interface controller (NIC) may be adapted to connect the computer device 403 to a network through the system interconnect. In some embodiments, the NIC may use any suitable interface or protocol (such as the Internet Small Computer System Interface, iSCSI) to transfer data. The network may be a cellular network, a radio network, a wide area network (WAN), a local area network (LAN), or the Internet, among others. A remote device may be connected to the computer device 403 through the network.
The flowcharts provided by this embodiment are not intended to indicate that the operations of the method are to be performed in any particular order, or that all of the operations of the method are to be included in every case. Furthermore, the method may include additional operations. Additional variations may be made to the above-described methods within the scope of the technical ideas provided by the methods of the present embodiments.
By now it should be appreciated by those skilled in the art that while a number of exemplary embodiments of the invention have been shown and described herein in detail, many other variations or modifications of the invention consistent with the principles of the invention may be directly ascertained or inferred from the present disclosure without departing from the spirit and scope of the invention. Accordingly, the scope of the present invention should be understood and deemed to cover all such other variations or modifications.

Claims (10)

1. A method of merging partition data of a database, comprising:
acquiring partition data to be merged in the database;
determining, among the partition data, partition data to be inserted and inserted partition data, the inserted partition data being the partition data that receives the insertion;
acquiring data files of both the partition data to be inserted and the inserted partition data;
merging the data file corresponding to the partition data to be inserted into the data file corresponding to the inserted partition data;
and synchronously merging the index trees corresponding to the data files to obtain merged partition data.
2. The method for merging partition data of a database according to claim 1, wherein
the step of synchronously merging the index trees corresponding to the data files comprises:
acquiring an index tree to be inserted corresponding to the partition data to be inserted and an inserted index tree corresponding to the inserted partition data;
and merging the leaf nodes of the index tree to be inserted into the leaf nodes of the inserted index tree.
3. The method for merging partition data of a database according to claim 2, wherein
the step of merging the leaf nodes of the index tree to be inserted into the leaf nodes of the inserted index tree comprises:
increasing the pointer in each leaf node of the index tree to be inserted by the overall offset of the inserted partition data, wherein the pointer records the offset of each piece of data in the partition data to be inserted;
and merging the leaf nodes of the index tree to be inserted into the leaf nodes of the inserted index tree.
4. The method for merging partition data of a database according to claim 1, wherein
the step of determining, among the partition data, the partition data to be inserted and the inserted partition data comprises:
screening target data corresponding to a target version from the partition data;
counting the data amount of the target data in each partition data;
and determining the partition data to be inserted and the inserted partition data according to the data amount.
5. The method for merging partition data of a database according to claim 4, wherein
the step of determining the partition data to be inserted and the inserted partition data according to the data amount comprises:
sorting the partition data according to the data amount to obtain a data amount list of the partition data;
and, according to the data amount list, determining the partition data with the small data amount as the partition data to be inserted and the partition data with the large data amount as the inserted partition data.
6. The method for merging partition data of a database according to claim 4, wherein
the step of acquiring the data files of both the partition data to be inserted and the inserted partition data comprises:
acquiring the target data corresponding to the target version in the partition data to be inserted and in the inserted partition data;
adding a deletion mark to the data other than the target data in the partition data to be inserted and in the inserted partition data;
freezing the target data;
and acquiring a data file corresponding to the target data.
7. The method for merging partition data of a database according to claim 6, wherein,
the step of freezing the target data includes:
changing the insert transaction ID in the target data to a frozen transaction ID, wherein the frozen transaction ID is used for indicating that the target data is in a frozen state.
8. The method for merging partition data of a database according to claim 7, wherein
the step of merging the data file corresponding to the partition data to be inserted into the data file corresponding to the inserted partition data comprises:
merging the data file corresponding to the partition data to be inserted at the rear of the data file corresponding to the inserted partition data.
9. A machine-readable storage medium having stored thereon a machine-executable program which, when executed by a processor, implements the partition data merging method of a database according to any one of claims 1 to 8.
10. A computer device comprising a memory, a processor, and a machine-executable program stored on the memory and runnable on the processor, wherein the processor implements the partition data merging method of a database according to any one of claims 1 to 8 when executing the machine-executable program.
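
For illustration only, the flow recited in claims 1 to 8 can be sketched in a few lines of Python. This is a minimal in-memory model and not the claimed implementation: a list of rows stands in for a data file, a sorted list of (key, offset) pairs stands in for the leaf level of an index tree, and the names Row, Partition, FROZEN_XID, prepare and merge_partitions are all hypothetical. The sketch also assumes that the whole source file, including rows carrying a deletion mark, is appended so that existing offsets stay valid.

```python
import heapq
from dataclasses import dataclass, field

FROZEN_XID = 2  # hypothetical reserved ID meaning "frozen" (assumption)

@dataclass
class Row:
    key: int
    xmin: int             # ID of the transaction that inserted the row
    version: int
    deleted: bool = False

@dataclass
class Partition:
    rows: list                                 # stand-in for the data file; list position = offset
    index: list = field(default_factory=list)  # sorted (key, offset) leaf entries

def target_count(part, version):
    """Data amount of the target data in one partition (claim 4)."""
    return sum(1 for r in part.rows if r.version == version)

def prepare(part, version):
    """Mark non-target rows deleted, freeze target rows, rebuild leaf entries (claims 6, 7)."""
    for r in part.rows:
        if r.version == version:
            r.xmin = FROZEN_XID
        else:
            r.deleted = True
    part.index = sorted((r.key, off) for off, r in enumerate(part.rows) if not r.deleted)

def merge_partitions(parts, version):
    if len(parts) < 2:
        raise ValueError("need at least two partitions to merge")
    # Sort by data amount: smallest is inserted into largest (claim 5).
    ranked = sorted(parts, key=lambda p: target_count(p, version))
    src, dst = ranked[0], ranked[-1]
    prepare(src, version)
    prepare(dst, version)
    base = len(dst.rows)        # overall offset of the inserted partition data
    dst.rows.extend(src.rows)   # append the source file at the rear (claim 8)
    # Shift every source pointer by the overall offset, then merge leaves (claim 3).
    shifted = [(key, off + base) for key, off in src.index]
    dst.index = list(heapq.merge(dst.index, shifted))
    return dst
```

Because the source rows are appended rather than interleaved, only the source pointers need adjusting: each one grows by the length of the receiving file, and the two sorted leaf sequences can then be merged in a single pass without rebuilding the index from the data.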
CN202311774895.XA 2023-12-21 2023-12-21 Partition data merging method, medium and computer equipment of database Pending CN117725066A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311774895.XA CN117725066A (en) 2023-12-21 2023-12-21 Partition data merging method, medium and computer equipment of database

Publications (1)

Publication Number Publication Date
CN117725066A (en) 2024-03-19

Family

ID=90205055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311774895.XA Pending CN117725066A (en) 2023-12-21 2023-12-21 Partition data merging method, medium and computer equipment of database

Country Status (1)

Country Link
CN (1) CN117725066A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination