CN118113706A - Statistical information updating method, database system and computing device cluster - Google Patents
- Publication number
- CN118113706A (application CN202211474724.0A)
- Authority
- CN
- China
- Prior art keywords
- statistical information
- node
- database system
- identification information
- redo log
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Fuzzy Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application provides a statistical information updating method, a database system and a computing device cluster, and belongs to the technical field of databases. The method is applied to a master/standby cluster database system with a shared storage architecture and comprises the following steps: a master node of the database system updates the statistical information of a table; the master node writes the updated statistical information of the table to a system table, the system table being stored in a shared storage system of the database system; the master node provides identification information of the table to one or more standby nodes of the database system; and the standby node acquires the statistical information of the table from the system table based on the identification information of the table. The method can increase the speed at which statistical information is updated, so that statistical information is updated more promptly when the number of pending statistical information update tasks is large, and it enables efficient synchronization of statistical information between the master node and the standby nodes in a master/standby cluster database system with a shared storage architecture.
Description
Technical Field
The present application relates to the field of database technologies, and in particular, to a statistical information updating method, a database system, a computing device cluster, a computer readable storage medium, and a computer program product.
Background
The statistical information of a database describes the distribution of data in a table, for example the number of data rows and the index selectivity. While the database is running, its query optimization engine uses the statistical information to compute the lowest-cost execution plan for each query. Inaccurate statistical information can prevent the query optimization engine from finding the optimal execution plan, which lowers query efficiency.
When a user inserts, deletes or modifies data in a table, the statistical information of the table changes as well, and the database must recalculate and update it so that the current statistical information does not become stale or inaccurate. In a database cluster with a high-availability architecture, a user query may be sent to either the master node or a standby node. The cluster must therefore keep the statistical information of the standby nodes consistent with that of the master node in a timely manner; otherwise, the same query can run with different efficiency on different nodes under identical conditions, which can cause abnormal behavior in client applications.
Disclosure of Invention
The application provides a statistical information updating method that increases the speed at which statistical information is updated, so that statistical information is updated more promptly when the number of table statistical information update tasks is large. The method also enables efficient synchronization of statistical information between the master node and the standby nodes in a master/standby cluster database system with a shared storage architecture. The application further provides a corresponding database system, computing device cluster, computer readable storage medium and computer program product.
In a first aspect, the present application provides a statistical information updating method. The method can be applied to a master/standby cluster database system with a shared storage architecture, where the database system comprises a shared storage system and a plurality of nodes. The nodes comprise a master node and a plurality of standby nodes. The master node can provide read and write services to users, and the standby nodes can provide read-only services to users. The nodes share stored information such as data, redo logs and system tables through the shared storage system.
Specifically, the method comprises the following steps: first, the master node of the database system updates the statistical information of a table, writes the updated statistical information to a system table in the shared storage system, and at the same time provides the identification information of the table to the standby nodes of the database system; then, each standby node acquires the statistical information of the table from the system table in the shared storage system based on the identification information of the table.
In this method, the master node provides the standby node with the identification information of the table whose statistical information has been updated, and the standby node can acquire the latest statistical information of the table directly from the system table in the shared storage system based on that identification information. This achieves active synchronization of statistical information between the master node and the standby nodes in a database system with a shared storage architecture.
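The flow of the first aspect can be illustrated with a minimal in-memory sketch. All class and field names here (`SharedStorage`, `MasterNode`, `StandbyNode`, the `stats_updated` record type) are hypothetical and chosen only for illustration; the application does not define any concrete data structures. In this sketch the standby node fetches the statistics as soon as it replays the record, which is only one of the behaviors the application describes.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Stand-in for the shared storage system (hypothetical structure)."""
    system_table: dict = field(default_factory=dict)  # table_id -> statistics
    redo_log: list = field(default_factory=list)      # records carrying table IDs


@dataclass
class MasterNode:
    storage: SharedStorage

    def update_statistics(self, table_id: int, stats: dict) -> None:
        # Persist the recomputed statistics to the shared system table.
        self.storage.system_table[table_id] = dict(stats)
        # Publish only the table ID, not the statistics themselves.
        self.storage.redo_log.append({"type": "stats_updated", "table_id": table_id})


@dataclass
class StandbyNode:
    storage: SharedStorage
    cache: dict = field(default_factory=dict)  # table_id -> statistics
    applied: int = 0                           # redo records already replayed

    def replay(self) -> None:
        # Read new redo records and pull the named tables' statistics.
        for record in self.storage.redo_log[self.applied:]:
            if record["type"] == "stats_updated":
                tid = record["table_id"]
                self.cache[tid] = dict(self.storage.system_table[tid])
        self.applied = len(self.storage.redo_log)


storage = SharedStorage()
master = MasterNode(storage)
standby = StandbyNode(storage)
master.update_statistics(42, {"row_count": 1000})
standby.replay()
print(standby.cache[42])
```

The point of the design is that the notification carries only the table identifier, while the statistics travel through the shared system table.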
In the scheme of the application, the master node updates the statistical information of the table as follows: the master node updates the statistical information of the table at intervals of a target time, where the target time is updated as the table statistical information update task amount changes.
In one possible implementation, the target time is updated based on the table statistical information update task amount as follows: the table statistical information update task amount is input into a target time calculation function to obtain a target time suited to the current task amount. The target time calculation function is a strictly decreasing function over its whole domain, or a piecewise decreasing function that contains constant segments.
In this way, statistical information is updated adaptively: when the table statistical information update task amount is large, a shorter target time is selected and table statistical information is updated more frequently, so that statistical information is updated more promptly.
In the scheme of the application, one possible way for the master node to provide the identification information of the table to the standby node is as follows: the master node writes a redo log to the shared storage system of the database system, and the standby node obtains the redo log from the shared storage system, where the redo log records the identification information of the table.
In another possible implementation, the master node sends a redo log to one or more standby nodes of the database system, where the redo log records the identification information of the table.
In the scheme of the application, the identification information of the table may be the table identifier (ID) of the table.
In the scheme of the application, the standby node acquires the statistical information of the table from the system table in the shared storage system based on the identification information of the table as follows: according to the identification information of the table, the standby node marks the table that has this identification information in the memory of the standby node as an old table, where the mark indicates that the statistical information of the table needs to be acquired from the system table.
In one possible implementation, the marking of the table with the identification information in memory as an old table may be completed asynchronously by a background thread of the standby node.
In this way, statistical information is synchronized efficiently between the master node and the standby nodes in a master/standby cluster database system with a shared storage architecture: the synchronization efficiency of statistical information is improved, the probability of lock conflicts on the table is reduced, and the master/standby delay caused by locks is alleviated.
In a second aspect, the present application provides a database system comprising a shared storage system and a plurality of nodes, the plurality of nodes comprising a master node and a plurality of standby nodes, wherein:
the master node is configured to update the statistical information of a table;
the master node is configured to write the updated statistical information of the table to a system table, the system table being stored in the shared storage system of the database system;
the master node is configured to provide the identification information of the table to one or more standby nodes of the database system; and
the standby node is configured to acquire the statistical information of the table from the system table based on the identification information of the table.
In one possible implementation, the master node is specifically configured to provide a redo log to the shared storage system of the database system; the standby node is specifically configured to obtain the redo log from the shared storage system, where the redo log records the identification information of the table.
In one possible implementation, the master node is specifically configured to send a redo log to one or more standby nodes of the database system, where the redo log records the identification information of the table.
In one possible implementation, the master node is specifically configured to update the statistical information of the table at intervals of the target time.
In one possible implementation, the master node is further configured to update the target time based on the table statistical information update task amount.
In one possible implementation, the standby node is further configured to mark, according to the identification information of the table, the table that has this identification information in the memory of the standby node as an old table, where the mark indicates that the statistical information of the table needs to be acquired from the system table.
In a third aspect, the present application provides a cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform a method of statistical information updating as provided by the first aspect or any of the possible implementations of the first aspect.
In a fourth aspect, the application provides a computer program product comprising instructions which, when executed by a cluster of computing devices, cause the cluster of computing devices to perform a method of updating statistical information as provided by the above-described first aspect or any of the possible implementations of the first aspect.
In a fifth aspect, the application provides a computer readable storage medium comprising computer program instructions which, when executed by a cluster of computing devices, perform a method of statistical information updating as provided by the first aspect or any of the possible implementations of the first aspect.
Drawings
Fig. 1 is a schematic diagram of a database system according to an embodiment of the present application.
Fig. 2 is a flowchart of a statistical information updating method according to an embodiment of the present application.
Fig. 3 is a flowchart of step S201 in a statistical information updating method according to an embodiment of the present application.
Fig. 4 is a flowchart of step S204 in a statistical information updating method according to an embodiment of the present application.
FIG. 5 is a schematic diagram of a computing device provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of a computing device cluster provided by an embodiment of the application.
FIG. 7 is a schematic diagram of an implementation of a computing device cluster according to an embodiment of the present application.
Detailed Description
The technical scheme provided by the application will be described in detail below with reference to the accompanying drawings.
In order to make the technical scheme provided by the application clearer, before the technical scheme provided by the application is specifically described, explanation of related terms is firstly carried out.
(1) Statistical information: information describing the distribution of data in a table, such as the number of rows, the number of blocks, the average row size and the index selectivity. It is the input from which the query optimization engine in the database generates query execution plans. Statistical information that has not been collected or is stale often degrades the query execution plans generated by the query optimization engine, making database queries inefficient. When a user modifies the data in a table through insert, delete or update operations, the statistical information of the table changes as well, and the database needs to recalculate and update it so that the query optimization engine generates an optimal query execution plan and database performance is maintained.
(2) Redo log: a transaction log in the database that records the changes made by transaction operations, that is, the updated content of data pages. When a database fails, the redo log can be used to restore the database to its pre-failure state.
The application scenario according to the embodiment of the present application is briefly described below.
Fig. 1 is a schematic diagram of a database system according to an embodiment of the present application. As shown in fig. 1, the database system 100 provided in an embodiment of the present application is a master/standby cluster database system with a shared storage architecture. It includes a shared storage system 103 and a plurality of nodes, where the plurality of nodes includes a master node 101 and a plurality of standby nodes. The master node 101 may provide read and write services to users, and the standby nodes may provide read-only services to users. Two standby nodes, 102-1 and 102-2, are shown in the figure by way of example.
The nodes are used to deploy database instances. One database instance corresponds to one database, and a user can operate the database through the database instance. A primary instance 111 of the database instance is deployed on the master node 101, a standby instance 112-1 is deployed on the standby node 102-1, and a standby instance 112-2 is deployed on the standby node 102-2.
The nodes share stored information, including table data, the redo log and the system table, through the shared storage system 103. On the one hand, after updating the statistical information of a table, the master node 101 writes the statistical information to the system table in the shared storage system 103 to persist it, and the standby nodes can acquire the updated statistical information of the table from the system table. On the other hand, after the master node 101 generates a redo log, the redo log is stored in the shared storage system 103; the standby nodes can read the redo log from the shared storage system 103 and replay it to recover data, thereby staying synchronized with the master node.
In an embodiment of the present application, the database system may be a cloud-native database system or a conventional database system. The nodes may be physical nodes such as computers or logical nodes on the computers. Computers include, but are not limited to, devices such as terminals or servers. Logical nodes on a computer may be obtained by virtualization, e.g., logical nodes on a computer may be virtual machines or containers on a computer.
In current database systems, statistical information is typically updated at a fixed frequency. When a user thread finds that the statistical information of a table needs to be updated, it submits a task to the task pool of a background thread and wakes the background thread. The background thread takes one table statistical information update task from the task pool at fixed time intervals (for example, every 10 seconds) and recalculates and updates the statistical information of the table. However, when the data of many tables in the database changes heavily at the same time, this approach cannot update the statistical information in time, and the query efficiency of the database system drops.
In addition, when data in a table changes, the synchronization of statistical information between the master node and the standby node usually depends on passive triggering by logical log replay during data synchronization; the update of the statistical information itself is not actively synchronized. When the master/standby cluster database system uses a shared storage architecture, the master node and the standby nodes share a single copy of the data, so this approach cannot synchronize statistical information between the master node and the standby nodes.
The application therefore provides a statistical information updating method that can be applied to the master/standby cluster database system with a shared storage architecture shown in fig. 1. The method adjusts the frequency at which table statistical information is updated based on the table statistical information update task amount. This adaptive updating increases the update speed, so that statistical information is updated more promptly when a large number of update tasks have accumulated. In addition, after the master node updates the statistical information of a table, it notifies all standby nodes through the redo log stored in the shared storage system of the database system. This synchronizes statistical information between the master node and the standby nodes in a master/standby cluster database system with a shared storage architecture, improves the synchronization efficiency, reduces the probability of lock conflicts on the table, and alleviates the master/standby delay caused by locks.
Next, the statistical information updating method provided by the present application is described with reference to the flowchart shown in fig. 2, and the method is based on the active/standby cluster database system using the shared storage architecture shown in fig. 1, and specifically includes the following steps S201 to S204.
Step S201, the main node of the database system updates the statistical information of the table.
In one possible implementation, step S201 may be implemented by the following steps S2011 and S2012.
Step S2011, the master node updates the target time based on the table statistical information update task amount.
In the conventional technology, when a user thread finds that the statistical information of a table needs to be updated, the user thread submits the task to a task pool of a background thread and wakes the background thread, and the background thread periodically acquires a table statistical information updating task from the task pool at fixed time intervals, and recalculates and updates the statistical information of the table. When a user thread submits a large number of table statistical information updating tasks in a short time, the conventional technology can cause accumulation of the table statistical information updating tasks, and the statistical information of the table cannot be updated in time, so that the subsequent query efficiency of the user is low.
In the application, after the user modifies the data of the table in the database, the table statistical information updating task pool of the main node is added with a table statistical information updating task corresponding to the table, and all the table statistical information updating tasks which are not processed currently are stored in the table statistical information updating task pool. The table statistics update task amount is the total amount of table statistics update tasks in the task pool.
In practice, the task pool usually holds several table statistical information update tasks at any given moment, and the master node processes them at a certain frequency: after finishing one table statistical information update task, it waits for a period of time before starting the next one. The target time is the interval between finishing the previous table statistical information update task and starting the current one, and it is updated as the table statistical information update task amount changes.
Specifically, the target time may be updated as follows:
The table statistical information update task amount is input into a target time calculation function to obtain a target time suited to the current table statistical information update task amount. The target time calculation function may be expressed as:
y = f(x)
where x is the table statistical information update task amount, y is the target time, and f() is the functional relationship between the target time and the table statistical information update task amount.
The target time calculation function is a decreasing function. It may be a strictly decreasing function over its whole domain, or a piecewise decreasing function that contains constant segments; this is not specifically limited here.
For example, the target time calculation function may take a form under which the target time is updated as follows: if the table statistical information update task amount is 10, the target time is 10 seconds, and the master node starts the current table statistical information update task 10 seconds after finishing the previous one; if the task amount is 100, the target time is 1 second, and the master node starts the current task 1 second after finishing the previous one; if the task amount is 1000, the target time is 0.1 second, and the master node starts the current task 0.1 second after finishing the previous one. The target time is at most 10 seconds and at least 0.1 second.
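One function consistent with the three example values and the stated bounds is f(x) = 100/x clamped to the range [0.1, 10] seconds. The sketch below assumes that form; the constant 100, the clamping expression and the handling of an empty task pool are assumptions made for illustration and are not stated in the application.

```python
def target_time(task_count: int) -> float:
    """Return the wait, in seconds, before the next statistics update task.

    Assumed form: 100 / task_count, clamped to [0.1, 10]. This is a
    piecewise decreasing function with constant segments at both ends.
    """
    if task_count <= 0:
        # Assumption: an empty pool falls back to the maximum interval.
        return 10.0
    return min(10.0, max(0.1, 100.0 / task_count))


for n in (10, 100, 1000):
    print(n, target_time(n))
```

With this form, task amounts below 10 all map to the 10-second ceiling and task amounts above 1000 all map to the 0.1-second floor, which matches the stated maximum and minimum.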
Step S2012, the master node updates the statistical information of the table at intervals of the target time.
Specifically, after waiting for the target time, the master node takes a table statistical information update task from the table statistical information update task pool, updates the statistical information of the table corresponding to that task, and, once the update is complete, updates the table statistical information update task amount.
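Steps S2011 and S2012 together can be sketched as a loop that recomputes the interval from the current pool size before each task. This is a simplified illustration: `run_update_loop`, `update_stats` and `interval_fn` are hypothetical names, the interval function is an assumed example, and the loop exits when the pool is empty, whereas a real background thread would instead block until woken.

```python
import time
from collections import deque


def run_update_loop(task_pool, update_stats, interval_fn, sleep=time.sleep):
    """Drain the task pool, waiting interval_fn(pending) before each task."""
    waits = []
    while task_pool:
        wait = interval_fn(len(task_pool))  # S2011: refresh the target time
        waits.append(wait)
        sleep(wait)                         # S2012: wait for the target time
        table_id = task_pool.popleft()      # taking a task shrinks the task amount
        update_stats(table_id)              # recompute this table's statistics
    return waits


pool = deque([1, 2, 3])
updated = []
waits = run_update_loop(
    pool,
    update_stats=updated.append,
    interval_fn=lambda n: min(10.0, max(0.1, 100.0 / n)),  # assumed example form
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(updated, waits)
```

Passing `sleep` as a parameter keeps the demonstration instant; in a running system the default `time.sleep` (or a condition-variable wait) would apply the interval.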
Step S202, the main node updates the statistical information of the table to the system table.
Wherein the system table is stored in a shared storage system of the database system.
Step S203, the master node provides the table identification information to one or more backup nodes of the database system.
The identification information of the table may be the table identifier (ID) of the table.
In one possible implementation, the master node provides the identification information of the table to one or more standby nodes of the database system as follows: the master node writes a redo log to the shared storage system of the database system, and the standby node obtains the redo log from the shared storage system, where the redo log records the identification information of the table.
Specifically, the master node may record the identification information of the table in the redo log and store the redo log in the shared storage system of the database system. The standby node obtains the identification information of the table by reading and parsing the redo log in the shared storage system.
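As an illustration of recording a table ID in a redo log and parsing it back, the sketch below uses a made-up fixed-size binary record. The record layout (a 1-byte type code followed by an 8-byte table ID) and the names `STATS_UPDATED`, `encode_stats_record` and `parse_table_ids` are assumptions; the application does not specify the redo log format, and real redo logs are considerably more complex.

```python
import struct

STATS_UPDATED = 1              # hypothetical record type code
RECORD = struct.Struct("<BQ")  # 1-byte type, 8-byte little-endian table ID


def encode_stats_record(table_id: int) -> bytes:
    """Master side: build a redo record that carries only the table ID."""
    return RECORD.pack(STATS_UPDATED, table_id)


def parse_table_ids(log_bytes: bytes) -> list:
    """Standby side: scan the log and collect IDs of tables with new statistics."""
    ids = []
    for offset in range(0, len(log_bytes), RECORD.size):
        rec_type, table_id = RECORD.unpack_from(log_bytes, offset)
        if rec_type == STATS_UPDATED:
            ids.append(table_id)
    return ids


log = encode_stats_record(7) + encode_stats_record(42)
print(parse_table_ids(log))
```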
In another possible implementation, the master node provides the identification information of the table to one or more standby nodes of the database system as follows: the master node sends a redo log to a standby node of the database system, where the redo log records the identification information of the table.
Step S204, the standby node acquires the statistical information of the table from the system table based on the identification information of the table.
In a specific implementation, the process by which the standby node acquires the statistical information of the table from the system table based on the identification information of the table may include the following steps S2041 to S2043:
Step S2041, the standby node determines, based on the identification information of the table, whether the table is cached in the memory of the standby node. If the table is cached in the memory of the standby node, step S2042 is executed; otherwise, step S2043 is executed.
Step S2042, the standby node marks the table that has the identification information in memory as an old table, where the mark indicates that the statistical information of the table needs to be acquired from the system table. When a user reads the table through the standby node, the standby node reads the statistical information of the table from the system table according to the mark and updates it in the memory of the standby node.
Specifically, the marking of the table with the identification information in memory as an old table may be completed asynchronously by a background thread of the standby node, as follows: the standby node first creates a table marking task for the table that has the identification information in memory, puts the task into the table marking task pool of the standby node, and wakes the background thread of the standby node; once woken, the background thread takes the table marking task from the table marking task pool and marks the table that has the identification information in memory as an old table.
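The asynchronous marking described above can be sketched with a queue and a worker thread. `StandbyCache`, `submit_mark` and the `stale` dictionary are hypothetical names used for illustration; here the blocking `queue.Queue.get` call plays the role of waking the background thread, which is one possible realization rather than the one the application prescribes.

```python
import queue
import threading


class StandbyCache:
    def __init__(self):
        self.stale = {}             # table_id -> bool, for cached tables only
        self.tasks = queue.Queue()  # table marking task pool
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit_mark(self, table_id):
        # Called from the redo replay path; returns without touching the table.
        self.tasks.put(table_id)

    def _run(self):
        while True:
            table_id = self.tasks.get()  # blocks until a task arrives
            if table_id in self.stale:
                self.stale[table_id] = True  # mark the cached table as old
            self.tasks.task_done()


cache = StandbyCache()
cache.stale[42] = False  # table 42 is cached with fresh statistics
cache.submit_mark(42)
cache.tasks.join()       # wait for the background thread (demonstration only)
print(cache.stale[42])
```

Because the replay path only enqueues a task, it does not have to wait on the table while the mark is applied.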
In one possible implementation, after determining that the table is cached in the memory based on the identification information of the table, the standby node may immediately read the statistical information of the table from the system table and update the statistical information into the memory of the standby node.
Step S2043, when the user reads the table through the standby node, the standby node directly reads the statistical information of the table from the system table and updates the statistical information into the memory of the standby node.
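Steps S2041 to S2043 can be sketched as follows. This is a minimal illustration only: the class `StandbyNode`, the method names, and the plain dictionary standing in for the system table in shared storage are assumptions made for the example and do not come from the application.

```python
class StandbyNode:
    """Hypothetical standby node with an in-memory statistics cache."""

    def __init__(self, system_table):
        # A dict standing in for the system table in shared storage:
        # table_id -> statistics.
        self.system_table = system_table
        # table_id -> {"stats": ..., "old": bool}
        self.cache = {}

    def on_table_id(self, table_id):
        """S2041/S2042: if the table is cached, mark it as an old table."""
        entry = self.cache.get(table_id)
        if entry is not None:
            entry["old"] = True
        # If the table is not cached, nothing is done until it is read (S2043).

    def read_stats(self, table_id):
        """Return statistics, reloading from the system table when needed."""
        entry = self.cache.get(table_id)
        if entry is None or entry["old"]:
            # Marked as old (S2042) or not cached (S2043): read the
            # statistics from the system table and update the memory.
            entry = {"stats": dict(self.system_table[table_id]), "old": False}
            self.cache[table_id] = entry
        return entry["stats"]
```

For example, a cached table keeps serving its in-memory statistics until `on_table_id` marks it; the next `read_stats` call then picks up the new values from the system table.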
The present application also provides a database system, which may be the database system shown in fig. 1. The database system includes a shared storage system and a plurality of nodes, the plurality of nodes including a master node and a plurality of standby nodes, wherein:
the master node is configured to update statistical information of a table;
the master node is configured to update the statistical information of the table to a system table, where the system table is stored in the shared storage system of the database system;
the master node is configured to provide identification information of the table to one or more standby nodes of the database system;
the standby node is configured to acquire the statistical information of the table from the system table based on the identification information of the table.
In one possible implementation, the master node is specifically configured to provide a redo log to the shared storage system of the database system; the standby node is specifically configured to obtain the redo log from the shared storage system, where the redo log records the identification information of the table.
In one possible implementation, the master node is specifically configured to send a redo log to one or more standby nodes of the database system, where the redo log records the identification information of the table.
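The shared-storage variant of the redo log path can be sketched as follows. This is an illustration under assumptions: `SharedStorage`, `MasterNode`, `StandbyReplayer`, and the record layout `{"type": ..., "table_id": ...}` are hypothetical names chosen for the example, and a Python list stands in for the redo log.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Hypothetical shared storage holding the system table and the redo log."""
    system_table: dict = field(default_factory=dict)  # table_id -> statistics
    redo_log: list = field(default_factory=list)      # append-only records


class MasterNode:
    def __init__(self, storage):
        self.storage = storage

    def update_statistics(self, table_id, stats):
        # Write the new statistics to the system table in shared storage.
        self.storage.system_table[table_id] = stats
        # Record the table's identification information in the redo log.
        self.storage.redo_log.append({"type": "stats_update", "table_id": table_id})


class StandbyReplayer:
    def __init__(self, storage):
        self.storage = storage
        self.applied = 0  # number of redo records already consumed

    def poll(self):
        """Return the table IDs recorded since the last poll."""
        records = self.storage.redo_log[self.applied:]
        self.applied += len(records)
        return [r["table_id"] for r in records if r["type"] == "stats_update"]

    def fetch(self, table_id):
        """Acquire the statistics of the table from the system table."""
        return self.storage.system_table[table_id]
```

Only the table identifier travels through the redo log in this sketch; the statistics themselves are read from the system table, which matches the division of work described in the text.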
In one possible implementation, the master node is specifically configured to update the statistical information of the table at intervals of a target time.
In one possible implementation, the master node is further configured to update the target time based on the volume of table statistical information update tasks.
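The text does not state how the target time is derived from the task volume. The sketch below shows one possible rule purely as an assumption: the interval is halved when many update tasks are pending, doubled when few are pending, and clamped to a fixed range. The function name `next_interval`, the thresholds, and the bounds are all hypothetical.

```python
def next_interval(current_s, pending_tasks, low=10, high=100,
                  min_s=60.0, max_s=3600.0):
    """Return a new update interval in seconds (hypothetical policy).

    current_s     -- current target time in seconds
    pending_tasks -- number of pending table statistics update tasks
    low, high     -- assumed thresholds for a "light" and a "heavy" backlog
    min_s, max_s  -- assumed bounds on the target time
    """
    if pending_tasks > high:
        proposed = current_s / 2   # heavy backlog: update more often
    elif pending_tasks < low:
        proposed = current_s * 2   # light backlog: update less often
    else:
        proposed = current_s       # backlog in the normal range: keep interval
    # Clamp so the interval never leaves the configured range.
    return max(min_s, min(max_s, proposed))
```

The opposite direction (lengthening the interval under heavy load to protect the master node) would be an equally plausible reading; the clamping step is what keeps either policy bounded.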
In one possible implementation, the standby node is further configured to mark, according to the identification information of the table, the table having the identification information in the memory of the standby node as an old table, where the old-table mark indicates that the statistical information of the table needs to be obtained from the system table.
For the specific implementation of the operations of the statistical information updating method performed by the database system provided in the foregoing embodiment, reference may be made to the description of the foregoing statistical information updating method embodiment; details are not repeated here.
In the embodiment of the present application, the database system may be a cloud-native database system or a conventional database system. Nodes in the database system may be implemented in software or in hardware. The implementation of the node is described below.
The node may include code that runs on a computing instance. The computing instance may be at least one of a physical host (computing device), a virtual machine, or a container, and there may be one or more such computing instances. For example, a node may include code running on multiple hosts, virtual machines, or containers. It should be noted that the multiple hosts, virtual machines, or containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple geographically close data centers. They may likewise be distributed in the same region or in different regions. Typically, a region may include multiple AZs.
Similarly, the multiple hosts, virtual machines, or containers used to run the code may be distributed in the same virtual private cloud (VPC) or in multiple VPCs. Typically, a VPC is set up within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway needs to be set up in each VPC, and the VPCs are interconnected through the communication gateways.
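A node may include at least one computing device, such as a server. Alternatively, the node may be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.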
The multiple computing devices included in a node may be distributed in the same AZ or in different AZs, in the same region or in different regions, and in the same VPC or in multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
The present application also provides a computing device 500; fig. 5 is a schematic structural diagram of the computing device 500. The computing device 500 may be the master node or a standby node in the above embodiments, and includes a bus 502, a processor 504, a memory 506, and a communication interface 508. The processor 504, the memory 506, and the communication interface 508 communicate with one another via the bus 502. The computing device 500 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 500.
The bus 502 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one line is shown in fig. 5, but this does not mean that there is only one bus or only one type of bus. The bus 502 may include a path for transferring information between the components of the computing device 500 (e.g., the memory 506, the processor 504, and the communication interface 508).
The processor 504 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
The memory 506 may include volatile memory, such as random access memory (RAM). The memory 506 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The memory 506 stores executable program codes, and the processor 504 executes the executable program codes to implement the functions of the plurality of nodes in the database system, respectively, so as to implement the statistical information updating method. That is, the memory 506 has stored thereon instructions for performing the statistical information updating method.
The communication interface 508 uses a transceiver module, such as, but not limited to, a network interface card or a transceiver, to enable communication between the computing device 500 and other devices or communication networks.
The present application also provides a computing device cluster, and fig. 6 is a schematic diagram of the computing device cluster, where the computing device cluster may implement the database system in the foregoing embodiments. As shown in fig. 6, the computing device cluster includes at least one computing device 500 as shown in fig. 5, where the computing device 500 may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop, notebook, or smart phone.
Instructions for performing the statistical information updating method may be stored in memory 506 in one or more computing devices 500 in the computing device cluster. The instructions, when executed by at least one computing device in a cluster of computing devices, may cause the cluster of computing devices to implement a method of statistical information updating as described in the embodiments.
In some possible implementations, portions of instructions for performing the statistical information updating method may also be stored separately in the memory 506 of one or more computing devices 500 in the computing device cluster. In other words, a combination of one or more computing devices 500 may collectively execute instructions for performing a statistical information updating method.
It should be noted that the memory 506 in different computing devices 500 in the computing device cluster may store different instructions for performing part of the functions of the database system, respectively. That is, the instructions stored by the memory 506 in the different computing devices 500 may implement the functionality of one or more of the master node and the plurality of standby nodes.
In some possible implementations, one or more computing devices in a cluster of computing devices may be connected through a network. Wherein the network may be a wide area network or a local area network, etc. Fig. 7 shows one possible implementation. As shown in fig. 7, two computing devices 500A and 500B are connected by a network. Specifically, the connection to the network is made through a communication interface in each computing device. In this type of possible implementation, the memory 506 in the computing device 500A has stored therein instructions for performing the functions of the master node 101 of the database system. Meanwhile, instructions to perform the functions of standby node 102-1 and standby node 102-2 are stored in memory 506 in computing device 500B.
The connection manner of the computing device cluster shown in fig. 7 takes into account that the statistical information updating method provided by the present application needs to store a large amount of data; the functions implemented by the standby node 102-1 and the standby node 102-2 are therefore assigned to the computing device 500B.
It should be appreciated that the functionality of computing device 500A shown in fig. 7 may also be performed by multiple computing devices 500. Likewise, the functionality of computing device 500B may also be performed by multiple computing devices 500.
The application also provides a computer program product containing instructions. The computer program product may be software or a program product that contains instructions and is capable of running on a computing device or of being stored in any usable medium. The computer program product, when run on at least one computing device, causes the at least one computing device to perform the statistical information updating method.
The application also provides a computer readable storage medium. The computer readable storage medium may be any usable medium that a computing device can store, or a data storage device, such as a data center, that contains one or more usable media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), etc. The computer readable storage medium includes instructions that instruct a computing device to perform the statistical information updating method.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present invention.
Claims (13)
1. A method of updating statistical information, the method comprising:
a master node of a database system updating statistical information of a table;
the master node updating the statistical information of the table to a system table, wherein the system table is stored in a shared storage system of the database system;
the master node providing identification information of the table to one or more standby nodes of the database system; and
the standby node acquiring the statistical information of the table from the system table based on the identification information of the table.
2. The method according to claim 1, wherein the master node providing the identification information of the table to one or more standby nodes of the database system specifically comprises:
the master node providing a redo log to the shared storage system of the database system, after which the standby node obtains the redo log from the shared storage system, wherein the redo log records the identification information of the table;
or the master node sending a redo log to one or more standby nodes of the database system, wherein the redo log records the identification information of the table.
3. The method according to claim 1 or 2, wherein the master node of the database system updating the statistical information of the table comprises:
the master node updating the statistical information of the table at intervals of a target time.
4. The method according to claim 3, wherein the method further comprises:
the master node updating the target time based on the volume of table statistical information update tasks.
5. The method according to any one of claims 1 to 4, wherein the method further comprises:
the standby node marking, according to the identification information of the table, the table having the identification information in the memory of the standby node as an old table, wherein the old-table mark indicates that the statistical information of the table needs to be acquired from the system table.
6. A database system comprising a shared storage system and a plurality of nodes, the plurality of nodes comprising a master node and a plurality of standby nodes, wherein:
the master node is configured to update statistical information of a table;
the master node is configured to update the statistical information of the table to a system table, where the system table is stored in the shared storage system of the database system;
the master node is configured to provide identification information of the table to one or more standby nodes of the database system;
the standby node is configured to acquire the statistical information of the table from the system table based on the identification information of the table.
7. The database system of claim 6, wherein:
the master node is specifically configured to provide a redo log to the shared storage system of the database system, or to send a redo log to one or more standby nodes of the database system, where the redo log records the identification information of the table;
the standby node is specifically configured to obtain the redo log from the shared storage system, where the redo log records the identification information of the table.
8. The database system according to claim 6 or 7, wherein the master node is specifically configured to update the statistical information of the table at intervals of a target time.
9. The database system of claim 8, wherein the master node is further configured to update the target time based on the volume of table statistical information update tasks.
10. The database system according to any one of claims 6 to 9, wherein the standby node is further configured to mark, according to the identification information of the table, the table having the identification information in the memory of the standby node as an old table, where the old-table mark indicates that the statistical information of the table needs to be acquired from the system table.
11. A cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory;
the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform the statistical information updating method of any one of claims 1 to 5.
12. A computer program product containing instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the statistical information updating method of any one of claims 1 to 5.
13. A computer readable storage medium comprising computer program instructions which, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the statistical information updating method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474724.0A CN118113706A (en) | 2022-11-23 | 2022-11-23 | Statistical information updating method, database system and computing device cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118113706A (en) | 2024-05-31 |
Family
ID=91216399
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474724.0A Pending CN118113706A (en) | 2022-11-23 | 2022-11-23 | Statistical information updating method, database system and computing device cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118113706A (en) |
- 2022-11-23 CN CN202211474724.0A patent/CN118113706A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||