CN117009346A - Database table structure changing method, device, equipment and storage medium - Google Patents


Info

Publication number
CN117009346A
Authority
CN
China
Prior art keywords
data
data ranges
ranges
database
parallel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211221538.6A
Other languages
Chinese (zh)
Inventor
叶盛
潘安群
雷海林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202211221538.6A priority Critical patent/CN117009346A/en
Publication of CN117009346A publication Critical patent/CN117009346A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a database table structure changing method, a device, equipment and a storage medium, which can be applied to various scenes such as cloud technology, artificial intelligence, intelligent traffic and the like, and the method comprises the following steps: determining a first table of which the table structure is to be changed in the database in response to a table structure change instruction, wherein the table structure change instruction is used for indicating to change the table structure of the first table into the table structure of a second table; determining a storage mode of data in a first table in a database; based on a storage mode, N first data ranges corresponding to the first table are determined, data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1; and parallelly migrating the data in the N first data ranges to a second table through P preset parallel resources, wherein P is a positive integer greater than 1. Namely, the embodiment of the application transfers the data in the N first data ranges of the first table to the second table in parallel through the P parallel resources, thereby realizing the rapid change of the table structure and further ensuring the normal operation of the database service.

Description

Database table structure changing method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of databases, in particular to a method, a device, equipment and a storage medium for changing a database table structure.
Background
A table (TABLE) is a database object used to store data; it is a collection of structured data and the foundation of the entire database system. Table structure change is a basic database function: it changes a table of one structure into a table of another structure.
Table structure changes take time and can be very time-consuming for large tables. For some large tables, a table structure change usually takes hours or even days, and this period is very likely to overlap with a service peak, which affects the normal operation of the service. Therefore, how to speed up table structure changes has long been a focus of research for database practitioners.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a storage medium for changing a table structure of a database, which are used for improving the change speed of the table structure.
In a first aspect, an embodiment of the present application provides a method for changing a database table structure, including:
determining a first table of which the table structure is to be changed in a database in response to a table structure change instruction, wherein the table structure change instruction is used for indicating to change the table structure of the first table into the table structure of a second table;
determining a storage mode of the data in the first table in the database;
determining, based on the storage mode, N first data ranges corresponding to the first table, wherein the data included in the N first data ranges constitutes the data in the first table, and N is a positive integer greater than 1;
and migrating the data in the N first data ranges to the second table in parallel through P preset parallel resources, wherein P is a positive integer greater than 1.
In a second aspect, an embodiment of the present application provides a database table structure changing device, including:
a table determining unit, configured to determine a first table of a table structure to be changed in a database in response to a table structure changing instruction, where the table structure changing instruction is configured to instruct to change the table structure of the first table to a table structure of a second table;
a storage mode determining unit, configured to determine a storage mode of the data in the first table in the database;
a processing unit, configured to determine, based on the storage mode, N first data ranges corresponding to the first table, wherein the data included in the N first data ranges constitutes the data in the first table, and N is a positive integer greater than 1;
and a changing unit, configured to migrate the data in the N first data ranges to the second table in parallel through P preset parallel resources, wherein P is a positive integer greater than 1.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory. The memory is used for storing a computer program, and the processor is used for calling and running the computer program stored in the memory to execute the method in the first aspect.
In a fourth aspect, embodiments of the present application provide a chip for implementing the method in any one of the first aspect or each implementation manner thereof. Specifically, the chip includes: a processor for calling and running a computer program from a memory, causing a device on which the chip is mounted to perform the method as in the first aspect described above.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute the method in the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product comprising computer program instructions for causing a computer to perform the method of the first aspect described above.
In a seventh aspect, embodiments of the present application provide a computer program which, when run on a computer, causes the computer to perform the method of the first aspect described above.
In summary, in an embodiment of the present application, a computing device determines, in response to a table structure change instruction, a first table of a table structure to be changed in a database, where the table structure change instruction is used to instruct to change the table structure of the first table to a table structure of a second table; determining a storage mode of data in a first table in a database; based on a storage mode, N first data ranges corresponding to the first table are determined, data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1; and parallelly migrating the data in the N first data ranges to a second table through P preset parallel resources, wherein P is a positive integer greater than 1. Namely, the embodiment of the application transfers the data in the N first data ranges of the first table to the second table in parallel through the P parallel resources, thereby realizing the rapid change of the table structure and further ensuring the normal operation of the database service.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a database system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for changing a database table structure according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a TDSQL-3.0 system;
FIG. 5 is a schematic diagram illustrating a determination of a first data range according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating a method for changing a database table structure according to an embodiment of the present application;
FIG. 7 is a schematic diagram of information exchange according to an embodiment of the present application;
FIG. 8 is a schematic diagram of another information exchange according to an embodiment of the present application;
FIG. 9 is a schematic diagram of fault detection according to an embodiment of the present application;
FIG. 10 is a schematic diagram of another fault detection according to an embodiment of the present application;
FIG. 11 is a schematic block diagram of a database table structure changing apparatus according to an embodiment of the present application;
FIG. 12 is a schematic block diagram of a computing device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be understood that in embodiments of the present application, "B corresponding to A" means that B is associated with A. In one implementation, B may be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
In the description of the present application, unless otherwise indicated, "a plurality" means two or more than two.
In addition, to describe the technical solutions of the embodiments of the present application clearly, the words "first", "second", etc. are used to distinguish identical or similar items that have substantially the same function and effect. Those skilled in the art will appreciate that the words "first", "second", and the like do not limit quantity or order of execution, and do not necessarily denote different items.
The database table structure changing method provided by the embodiment of the application can be applied to a database, and is particularly used for changing the structure of the database table.
In order to facilitate understanding of the embodiments of the present application, the following brief description will be first given of related concepts related to the embodiments of the present application:
A database (Database) can be regarded as an electronic filing cabinet that stores electronic files; users can add, query, update, and delete the data in the files. A "database" is a collection of data that is stored together in a way that can be shared by multiple users, has as little redundancy as possible, and is independent of the application.
A database system (DBS) consists of a database and its management software. It is a data processing system developed to meet data processing needs, and it is also a software system that stores and maintains data and supplies it to running applications; it combines a storage medium, processing objects, and a management system. As shown in fig. 1, the database system mainly includes a database, a database management system, and a database application system. The database stores data and is managed in a unified way by the database management system; data is inserted, modified, and retrieved through the database management system. The data manager is responsible for creating, monitoring, and maintaining the entire database so that the data can be used effectively by anyone with access rights. The database application system provides database applications, that is, applications that can access the database. In actual use, an object may send a database read or write request to the database management system through an application in the database application system. The database management system then performs the corresponding operation on the database: for example, it reads data from the database based on a read request sent by the application and returns the data to the application, or it writes data into the database based on a write request sent by the application. Optionally, the application may also perform other operations on the database through the database management system, which is not limited by the embodiment of the present application.
The database management system (Database Management System, DBMS) is a computer software system designed to manage databases, and generally provides basic functions such as storage, retrieval, security, and backup. Database management systems may be classified by the database model they support, for example relational or XML (Extensible Markup Language); by the type of computer they support, for example server cluster or mobile phone; by the query language they use, for example SQL (Structured Query Language) or XQuery; by their performance emphasis, for example maximum scale or maximum running speed; or in other ways. Regardless of the classification used, some DBMSs span categories, for example by supporting multiple query languages at the same time.
In some embodiments, the database stores data in a cloud storage manner. Cloud storage (cloud storage) is a new concept that extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system that integrates a large number of storage devices (storage devices are also referred to as storage nodes) of various types in a network to work cooperatively through application software or application interfaces through functions such as cluster application, grid technology, and a distributed storage file system, so as to provide data storage and service access functions for the outside.
At present, the storage method of the storage system is as follows: when logical volumes are created, each logical volume is allocated a physical storage space, which may consist of disks of one storage device or of several storage devices. A client stores data on a logical volume, that is, on a file system. The file system divides the data into a plurality of parts, each of which is an object; an object contains not only data but also additional information such as a data identification (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location of each object, so that when the client requests access to the data, the file system can let the client access the data according to the storage location of each object.
The process by which the storage system allocates physical storage space to a logical volume is specifically as follows: the physical storage space is divided into stripes in advance, according to an estimate of the capacity of the objects to be stored on the logical volume (this estimate tends to leave a large margin over the capacity of the objects actually stored) and according to the redundant array of independent disks (RAID, Redundant Array of Independent Disks) group. A logical volume can be understood as a stripe, and physical storage space is thereby allocated to the logical volume.
A table (TABLE) is a database object used to store data; it is a collection of structured data and the foundation of the entire database system. Table structure change is a basic database function: it changes a table of one structure into a table of another structure.
Table structure changes are time-consuming, especially for large tables: a change usually takes several hours or even days, and this period is very likely to overlap with a service peak and affect the normal operation of the service. In other words, conventional table structure changes take a long time and are slow. In addition, a table is generally inaccessible while its structure is being changed. In some cases, if the service layer accesses the table during the change, the access may fail, or the access may have to wait until the change is complete, so that the service layer waits for a long time and service performance is affected.
To solve the above technical problems, that is, to speed up table structure changes in a database and thereby keep access to the table reliable, the embodiment of the application provides a method for changing a database table structure. In response to a table structure change request, the method determines a first table whose table structure is to be changed and determines the storage mode of the data in the first table in the database. For example, different database types may use different data storage modes, and the same database may include different data storage modes. Then, based on the storage mode of the data in the first table in the database, N first data ranges corresponding to the first table are determined; the first data ranges can be understood as the result of dividing the data in the first table. In addition, P parallel resources are preset in the embodiment of the application; the P parallel resources may be P parallel threads, P parallel processes, or P machines. The data in the N first data ranges corresponding to the first table is migrated to the second table in parallel through the P parallel resources, which achieves parallel migration of the data in the first table. The table structure is thus changed quickly, which in turn ensures the normal operation of the database service. The method for changing the database table structure provided by the embodiment of the application can be applied in various application scenarios, which may include, but are not limited to, any one or more of the following: cloud technology, artificial intelligence, intelligent transportation, and the like.
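As a minimal sketch of the parallel migration described above, the following Python example migrates N = 4 first data ranges with P = 2 parallel threads. The in-memory dictionaries standing in for the first and second tables, the added `age` column, and the helper names `convert_row`, `migrate_range`, and `migrate_in_parallel` are assumptions made for illustration; they are not taken from the embodiment.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins: primary key -> row.
first_table = {k: {"id": k, "name": f"user{k}"} for k in range(1, 101)}
second_table = {}  # target table with the new structure (assumed extra "age" column)

def convert_row(row):
    """Rewrite one row from the old structure to the new one (assumed change: add a column)."""
    return {"id": row["id"], "name": row["name"], "age": None}

def migrate_range(key_range):
    """Copy every row whose key falls in [start, end) into the second table."""
    start, end = key_range
    moved = 0
    for key in range(start, end):
        if key in first_table:
            second_table[key] = convert_row(first_table[key])
            moved += 1
    return moved

def migrate_in_parallel(ranges, p):
    """Migrate all ranges using p worker threads; return the number of rows moved."""
    with ThreadPoolExecutor(max_workers=p) as pool:
        return sum(pool.map(migrate_range, ranges))

# N = 4 first data ranges that together cover the whole first table; P = 2 workers.
ranges = [(1, 26), (26, 51), (51, 76), (76, 101)]
moved = migrate_in_parallel(ranges, p=2)
```

Because the ranges do not overlap, each worker writes a disjoint set of keys, so the workers need no coordination beyond collecting their row counts. Threads are only one of the resource types named above; processes or separate machines would follow the same pattern.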
The related concepts related to the embodiments of the present application are described above, and the system architecture related to the embodiments of the present application is described below.
Fig. 2 is a schematic diagram of an application scenario according to an embodiment of the present application, including an application client 110, a computing device 120, and a database 130, where the application client 110 is communicatively connected to the computing device 120, and the computing device 120 is communicatively connected to the database 130.
In a specific implementation of an embodiment of the present application, the computing device 120 in fig. 2 may be used to perform the database table structure modification method provided in the embodiment of the present application. The computing device 120 may include, but is not limited to, a terminal device and a server, that is, the computing device 120 may be a terminal device, a server, or a computing system formed by a terminal device and a server, which is not limited in this embodiment of the present application.
In some embodiments, the application client 110 runs on a terminal device, and the application client 110 may be understood as an application client 110 corresponding to the database 130, and a user may implement access to the database 130 through the application client 110.
In the embodiment of the present application, the terminal device may include, but is not limited to: smart phones, tablet computers, notebook computers, desktop computers, vehicle terminals, intelligent voice interaction equipment, intelligent home appliances, aircrafts and the like. In a specific embodiment, a terminal device may further run a wide variety of Applications (APP) and/or clients, such as: multimedia play clients, social clients, browser clients, information flow clients, educational clients, image processing clients, and so forth.
In some embodiments, the terminal device may be further configured with a display device, which may be a display, a display screen, a touch screen, or the like; the touch screen may also be referred to as a touch panel or the like.
In some embodiments, if the computing device 120 is a terminal device, the application client 110 may be directly installed on the computing device 120.
In embodiments of the present application, the server may be one or more servers. If there are multiple servers, at least two servers may provide different services and/or at least two servers may provide the same service, for example in a load balancing manner; embodiments of the present application are not limited in this respect. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms. The server may also be a node of a blockchain.
It can be appreciated that, in the embodiment of the present application, the computing device 120 used in executing the database table structure changing method is not particularly limited, and the related devices mentioned above can be flexibly combined and utilized according to the actual application scenario.
The specific type of the database 130 is not limited in the embodiments of the present application. For example, the database 130 may be a relational database (i.e., an SQL database) or a non-relational database (i.e., a NoSQL, or Not Only SQL, database).
The principles of the database table structure modification method provided by the embodiments of the present application are generally described below based on the above description, so that a relevant person may more clearly understand the relevant implementation manner of the embodiments of the present application.
In the embodiment of the present application, the user sends the table structure changing instruction through the application client 110, and after receiving the table structure changing instruction, the computing device 120 may determine a first table to be changed in the table structure in the database 130, and further determine a storage manner of data in the first table in the database 130. After determining the storage manner of the data in the first table in the database 130, the computing device 120 determines N first data ranges corresponding to the first table based on the storage manner. The N first data ranges may be understood as being obtained by dividing data in the first table, for example, when storing, the data in the first table is stored in blocks based on different storage modes, so that when the data in the first table is obtained, the data in the first table is not directly read from the database 130, but N first data ranges corresponding to the first table are determined based on the storage modes corresponding to the first table, and the data included in the N first data ranges form the data in the first table. In the embodiment of the present application, when the storage manners of the data in the first table in the database 130 are different, the manner in which the computing device 120 determines N first data ranges corresponding to the first table is also different. In the embodiment of the present application, P parallel resources are preset, and after N first data ranges corresponding to the first table are determined by the computing device 120, data in the N first data ranges are migrated in parallel to the second table through the P parallel resources, so that parallel migration of data is realized, and further, the speed of table structure change is improved.
It should be noted that, the application scenario of the embodiment of the present application includes, but is not limited to, that shown in fig. 2.
The following describes the technical scheme of the embodiments of the present application in detail through some embodiments. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 3 is a flowchart illustrating a method for changing a database table structure according to an embodiment of the application. The method of the embodiment of the present application may be performed by an apparatus having a database table structure changing function, for example, by a database table structure changing apparatus, which may be the computing device shown in fig. 2 described above or a part of the computing device shown in fig. 2. For ease of description, embodiments of the present application will be described with respect to a computing device as an example of an execution body.
As shown in fig. 3, the method of the embodiment of the present application includes:
S301, determining, in response to a table structure change instruction, a first table whose table structure is to be changed in a database.
The table structure changing instruction is used for indicating that the table structure of the first table is changed into the table structure of the second table.
In the embodiment of the present application, the table structure of the first table is different from the table structure of the second table. The second table may be an empty table or a non-empty table; the embodiment of the present application does not limit whether the second table already includes data. The table structure of the second table is the target table structure of the first table. For example, because of business requirements, data reduction, or similar reasons, the table structure of the first table is expected to be changed into the table structure of the second table, which involves a table structure change in the database.
In the embodiment of the application, after receiving the table structure change instruction sent by the application client, the computing device responds to the instruction by determining the first table whose table structure is to be changed in the database. Illustratively, the table structure change instruction includes an identification of the first table, which uniquely identifies the first table, so that the computing device can accurately determine the first table in the database through the identification.
The embodiment of the application does not limit the specific expression form of the identification of the first table.
In one example, the identification of the first table is the name of the first table. In this case, no other table in the database has a name identical or similar to that of the first table, so the computing device can locate the first table by its name.
In one example, the identification of the first table is an ID or a number of the first table, that is, when the database stores the tables, a unique ID or a number is set for each table, where the IDs or numbers corresponding to different tables are different, so that the first table can be uniquely identified by the ID or the number. Based on the above, the computing device may parse the table structure change instruction to obtain an ID or number corresponding to the first table, and accurately determine the first table in the database.
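The lookup described in these examples can be sketched as follows. The catalog contents, the `TableMeta` type, the instruction layout, and the function `find_first_table` are hypothetical names chosen for illustration; the embodiment does not prescribe the form of the identification.

```python
from dataclasses import dataclass

@dataclass
class TableMeta:
    table_id: int  # unique ID or number assigned when the table is stored
    name: str      # table name, assumed unique within the database

# Hypothetical catalog of tables known to the database.
CATALOG = [TableMeta(1, "orders"), TableMeta(2, "users")]

def find_first_table(instruction):
    """Return the table selected by the instruction's identification (an ID or a name)."""
    ident = instruction["first_table"]
    matches = [t for t in CATALOG if ident in (t.table_id, t.name)]
    if len(matches) != 1:
        # The identification must select exactly one table.
        raise LookupError(f"identification {ident!r} does not select exactly one table")
    return matches[0]
```

For instance, `find_first_table({"first_table": "users"})` and `find_first_table({"first_table": 2})` both resolve to the same catalog entry in this sketch.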
In some embodiments, if the database includes the second table, the table structure change instruction further includes an identification of the second table. The computing device first needs to determine the second table before writing the data in the first table to the second table. The computing device then looks up the second table in the database based on the identification of the second table. The computing device determines whether the free space of the second table can store data in the first table below. For example, assuming that the database metadata of the embodiment of the present application includes the spatial resource sizes of the tables, the spatial resource sizes have been used, and the spatial resource sizes may be used, where the usable spatial resource sizes may be understood as free resource sizes. Based on this, the computing device may obtain the data size of the data in the first table and the available space resource size of the second table from the metadata of the database, and if the available space resource size of the second table is greater than or equal to the data size of the data in the first table, it may determine that the data in the first table may be stored in the second table, at which time the computing device does not recreate the second table. If the size of the available space resources of the second table is smaller than the size of the data amount of the data in the first table, it may be determined that the data in the first table cannot be stored completely in the existing second table of the current database, and in one example of this case, the computing device expands the second table in the current database to increase the capacity of the current second table, so that the increased capacity second table may store the data in the first table. 
In another example of this case, the computing device recreates the second table; e.g., the computing device parses the table structure of the existing second table in the database, creates a new second table in the database based on that table structure, and migrates the data in the first table into the new second table.
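The capacity check described above can be sketched as follows. This is a minimal illustration only: the `TableSpace` record, the `can_expand` flag, and the returned action names are hypothetical and are not part of the described database metadata.

```python
from dataclasses import dataclass


@dataclass
class TableSpace:
    """Hypothetical metadata record: total and used space of a table, in bytes."""
    total: int
    used: int

    @property
    def free(self) -> int:
        # Usable (free) space resource size of the table.
        return self.total - self.used


def plan_second_table(first_data_size: int, second: TableSpace, can_expand: bool) -> str:
    """Choose how to prepare an existing second table for the migration."""
    if second.free >= first_data_size:
        return "reuse"      # enough free space: the second table is not recreated
    if can_expand:
        return "expand"     # increase the capacity of the existing second table
    return "recreate"       # create a new second table from the same table structure


print(plan_second_table(100, TableSpace(total=500, used=300), can_expand=True))
```

Here the first table needs 100 bytes and the second table has 200 bytes free, so the existing second table is reused.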
In some embodiments, if the database does not include the second table, the table structure change instruction further includes table structure information of the second table. In this way, the computing device obtains the table structure information of the second table by analyzing the table structure change instruction, and then creates the second table in the database based on the table structure information of the second table, so as to migrate the data in the first table to the second table, thereby realizing the change of the table structure.
The computing device, in response to the table structure change instruction, after determining the first table in the database in which the table structure is to be changed, performs the following step S302.
S302, determining a storage mode of the data in the first table in the database.
In the embodiment of the application, the data storage modes corresponding to the databases of different types may be different, and the data storage modes in the databases of the same type may be different.
In the embodiment of the present application, when the table structure is changed, data in N first data ranges corresponding to the first table are migrated in parallel to the second table, where the N first data ranges are determined based on the storage mode of the data in the first table in the database, and therefore, in the embodiment of the present application, it is necessary to determine the storage mode of the data in the first table in the database.
In some embodiments, the database of the embodiment of the present application is a TDSQL database. TDSQL is a MySQL-compatible, autonomous and controllable distributed database system with high consistency. It supports automatic horizontal splitting: the business sees a complete logical table, while the data is evenly split across a plurality of physical shards, which effectively addresses OLTP scenarios with very high concurrency, very high performance, and very large capacity.
TDSQL has many versions, among which TDSQL-3.0 is a distributed database system with a new architecture in which computing and storage are separated. FIG. 4 is a schematic diagram of a TDSQL-3.0 system, which, as shown in FIG. 4, mainly includes, from bottom to top, storage nodes, control nodes, computing nodes, and application clients.
The application client can be understood as an application client corresponding to the database, and a user realizes the operation on the database through the application client on the terminal device, so that the operation of all database administrators (Database Administrator, abbreviated as DBA) can be completed on a user interface without logging in the background.
The application client can connect to the computing node directly through Java Database Connectivity (JDBC) or Open Database Connectivity (ODBC), or through a load balancer such as F5 or Linux Virtual Server (LVS), so as to balance traffic.
The computing node is the TDSQL computing engine, which separates the computing layer from the storage layer. The computing layer mainly performs the coordination work related to distributed transactions, such as distributed optimization, distributed plan generation, distributed transaction control, storage node load balancing, user authentication, and the like. In addition, the TDSQL computing node also has Online Analytical Processing (OLAP) capability and can algorithmically optimize some complex computations.
The management nodes are divided into a cluster management part and an operation management platform. The cluster management part includes a metadata server (Metadata Server), a cluster manager (Cluster Manager), and a proxy manager (Proxy Manager) for metadata management; this part manages and maintains the metadata, data nodes, and computing nodes of the whole cluster under the distributed database architecture.
The storage node is used for storing data. For example, TDSQL has two storage modalities: one is a non-sharded (Noshard) database, and the other is a distributed database (also called the Shard version of TDSQL). Noshard is the single-instance version of TDSQL, whereas the Shard version is a distributed database with horizontal scalability. In TDSQL-3.0, storage nodes store data in a distributed manner.
In one data storage mode of TDSQL, data is stored in a storage node according to a logical range, which is referred to as range, for example, data in a table is divided into a plurality of ranges and stored in the storage node.
That is, in TDSQL, data in a first table is divided into a plurality of data ranges stored in a storage node.
In some embodiments, the database of the embodiments of the present application organizes data based on a B+ tree. In a B+ tree, non-leaf nodes store no data and serve only as an index; only leaf nodes store data. For example, the root node includes at least 2 child nodes, and each node has at most m child nodes. All leaf nodes are on the same level and are linked into a doubly linked list, with their keys ordered from small to large. Nodes other than leaf nodes store pointers and key values, while leaf nodes store key values and data.
That is, in a database that is organized based on b+ trees, the data in the first table is stored in leaf nodes.
In some embodiments, the database stores data in other manners, and the specific manner of storing the data in the database according to the embodiments of the present application is not limited.
In the embodiment of the application, the computing device can determine the storage mode of the data in the first table in the database from the metadata of the database. For example, the computing device obtains the type of the database from the database metadata, and then determines the storage mode of the data in the first table in the database based on the type of the database. By way of example, assuming that the database of the embodiment of the present application is the database in the TDSQL-3.0 system shown in fig. 4, it may be determined that the data in the first table is divided into a plurality of data ranges, that is, into a plurality of ranges, and stored in the storage nodes. Assuming that the database of the embodiment of the present application is a MySQL database, it may be determined that the data in the first table is stored in leaf nodes of a B+ tree.
The computing device determines, based on the above method, a storage manner of the data in the first table in the database, and then performs the following step S303.
S303, determining N first data ranges corresponding to the first table based on a storage mode.
The data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1.
In the embodiment of the application, the data in the N first data ranges form the data of the first table, that is, the data in the N first data ranges are spliced and combined to obtain the data in the first table.
In the embodiment of the application, the storage modes of the data in the first table in the database are different, and the modes of the computing equipment for determining N first data ranges corresponding to the first table are also different. That is, in the embodiment of the present application, N first data ranges corresponding to the first table are determined based on the storage manner of the data in the first table in the database.
In some embodiments, a primary key is set for the first table when the data in the first table is stored in the database. A primary key is a column or a group of columns in a table whose values uniquely identify each row in the table. The primary key enforces the entity integrity of the table. The primary key may be created by defining a primary key (PRIMARY KEY) constraint when creating or modifying the table. A table can have only one PRIMARY KEY constraint, and the columns in the PRIMARY KEY constraint cannot accept null values. Because a PRIMARY KEY constraint guarantees unique data, it is often defined on an identity column. It follows that the primary key may uniquely identify the first table. At this time, the above step S303 includes the following steps S303-A and S303-B:
S303-A, determining a main key corresponding to the first table;
S303-B, determining N first data ranges corresponding to the first table based on the storage mode and the primary key.
In this embodiment, the computing device receives the table structure change instruction and, based on that instruction, determines the first table in the database whose table structure is to be changed, e.g., determines an identification or name of the first table. In the embodiment of the application, the database metadata includes the primary keys corresponding to the different tables, so the computing device queries the database metadata for the primary key corresponding to the first table based on the identification or the name of the first table.
After determining the primary key corresponding to the first table, the computing device determines N first data ranges corresponding to the first table based on the storage mode of the data in the first table in the database and the primary key corresponding to the first table. For example, based on the storage mode of the data in the first table in the database, the data range, the data entry, the leaf node, etc. corresponding to the primary key are queried in the database through the primary key, and further based on the data range, the data entry, the leaf node, etc. corresponding to the primary key, the N data ranges corresponding to the first table are constructed.
The following describes the specific implementation procedure of S303-B in an exemplary manner, but in the embodiment of the present application, the implementation procedure of S303-B includes, but is not limited to, the following.
In one aspect, if the storage method is to divide the data in the first table into a plurality of data ranges for storage, the step S303-B includes the steps of S303-B-11 and S303-B-12 as follows:
S303-B-11, acquiring N second data ranges comprising a primary key from a storage node of a database;
S303-B-12, determining the N first data ranges based on the N second data ranges.
In the first mode, the database in the embodiment of the present application is a database in the TDSQL-3.0 system shown in fig. 4, and the data in the first table is stored in the database in a mode of dividing the data in the first table into a plurality of data ranges, that is, into a plurality of ranges, and storing the data in the storage node.
For example, the first table includes 10 pieces of data numbered 1 to 10. In the storage node, the 10 pieces of data may be divided into a plurality of data ranges in logical order; assume that they are divided into 2 data ranges, [1,5) and [5,10], respectively.
In one example, in the database, each piece of data is organized in a KV structure. The table structure is associated with the KV data by allocating an ID to the primary key of each table, and this ID is used as a prefix of the Key; that is, all the data in the first table have this prefix, so that the data belonging to the first table are naturally stored together and contiguously. For example, the first table has a table structure of (a INT PRIMARY KEY, b INT), where a INT PRIMARY KEY represents the primary key column of the first table and b INT is the data column. Assuming that id=10001 is assigned to the primary key, this id becomes the common prefix of all data in the first table when stored, and each KV data entry to be stored is assembled with this prefix.
Illustratively, in the first table, one data entry prefixed may be represented as: key [ id+a value+reserved information (noted …) ] Value [ b Value ]. Where id can be understood as the key value of the primary key of the first table. Assuming that the first table has data (1, 1), (2, 2), (3, 3), the 3 data entries in the first table are denoted as { [100011 … ] [1] }, { [100012 … ] [2] }, and { [100013 … ] [3] }, respectively. In the storage of the storage node, since 3 data entries in the first table have the same prefix 10001, they can be physically stored in a continuous area, and logically divided into a plurality of ranges.
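The assembly of prefixed KV entries can be illustrated with a short sketch. It assumes keys are formed by plain string concatenation of the id and the value of column a, and it omits the reserved information (the "…" part); the function name `encode_row` is hypothetical.

```python
def encode_row(table_id: int, a: int, b: int) -> tuple[str, str]:
    """Assemble one KV entry: Key = id + a value (reserved info omitted), Value = b value."""
    key = f"{table_id}{a}"
    return key, str(b)


rows = [(1, 1), (2, 2), (3, 3)]
entries = [encode_row(10001, a, b) for a, b in rows]
print(entries)  # [('100011', '1'), ('100012', '2'), ('100013', '3')]

# Every key shares the prefix 10001, so sorting by key keeps the table's rows adjacent.
assert all(key.startswith("10001") for key, _ in entries)
```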
As can be seen from the foregoing, in the first embodiment, each data in the first table is assembled into data entries, for example, into data entries in KV structure, and a prefix corresponding to the first table is added to each data entry, and the data entries with the prefixes are divided into a plurality of data ranges according to a logical order, that is, into a plurality of ranges, and stored in the storage nodes, where different ranges are not logically related. In this way, the computing device queries, in the table structure change operation of the first table, N second data ranges including the primary key corresponding to the first table from the storage nodes of the database.
In some embodiments, if one data range (i.e., range) includes only the data of one table, that is, each of the determined N second data ranges includes only data of the first table and no other data, the obtained N second data ranges are determined as the N first data ranges corresponding to the first table.
In some embodiments, if at least one of the N second data ranges includes data other than the data in the first table, then the data entries other than the data entries including the primary key are culled from the data entries included in the N second data ranges, resulting in N first data ranges.
In the embodiment of the present application, the storage node neither knows nor needs to know which data belong to the same table when performing range division, so the N second data ranges determined in S303-B-11 are likely to carry data belonging to other tables.
For example, assuming that the ID (i.e., key value) of the primary key corresponding to the first table is 10001, the computing device scans the storage nodes of the database for all data ranges that include the prefix 10001. Assume that 3 second data ranges are found, denoted as second data range 1 (i.e., Range1), second data range 2 (i.e., Range2), and second data range 3 (i.e., Range3), respectively. By way of example, these 3 ranges are as follows:
Range1->[100001,100011]
[100001…]
[100002…]
[100010…]
[100011…]
Range2->[100012,100014]
[100012…]
[100013…]
[100014…]
Range3->[100015,100021]
[100015…]
[100016…]
[100020…]
[100021…]
Wherein the second data Range1 (i.e. Range 1) includes data in the table with the primary key 10000 in addition to data in the first table. The second data Range2 (i.e., range 2) includes data that is all of the data in the first table. The second data Range3 (i.e., range 3) includes data in the table with the primary key 10002 in addition to data in the first table. It is clear that data that do not belong to the first table cannot be written to the new second table and therefore it is necessary to discard the data in these non-first tables.
Specifically, the computing node removes the data of other tables from the 3 second data ranges and keeps only the data in the first table; that is, among the data entries included in the 3 second data ranges, it removes every data entry that does not include the primary key of the first table, so as to obtain the N first data ranges corresponding to the first table.
Illustratively, the data not belonging to the first table is removed from the above 3 second data ranges by taking intersections. For example, as shown in fig. 5, the intersection of the primary key id (i.e., 10001) corresponding to the first table with each second data range is computed: the portion where the primary key corresponding to the first table intersects second data range 1 is determined as first data range 1, the entire second data range 2 is determined as first data range 2, and the portion where the primary key corresponding to the first table intersects second data range 3 is determined as first data range 3, so that 3 first data ranges are obtained.
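The intersection step can be sketched as below. The sketch treats keys as integers and ranges as closed intervals, and assumes (as in the example keys above) that a key is the table id followed by a single-digit suffix; the helper names `table_key_span` and `clip_to_table` are hypothetical.

```python
from typing import Optional

Range = tuple[int, int]  # closed interval [lo, hi] over integer keys


def table_key_span(table_id: int, suffix_digits: int = 1) -> Range:
    """Key interval covered by one table prefix, e.g. 10001 -> [100010, 100019]."""
    base = table_id * 10 ** suffix_digits
    return base, base + 10 ** suffix_digits - 1


def clip_to_table(rng: Range, span: Range) -> Optional[Range]:
    """Intersect a second data range with the table's key span."""
    lo, hi = max(rng[0], span[0]), min(rng[1], span[1])
    return (lo, hi) if lo <= hi else None


second_ranges = [(100001, 100011), (100012, 100014), (100015, 100021)]
span = table_key_span(10001)
clipped = (clip_to_table(r, span) for r in second_ranges)
first_ranges = [r for r in clipped if r is not None]
print(first_ranges)  # [(100010, 100011), (100012, 100014), (100015, 100019)]
```

Range1 and Range3 are trimmed to the part that carries the 10001 prefix, while Range2 is kept whole.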
In the first aspect, the description is made in such a manner that N first data ranges corresponding to the first table are determined when the data in the first table is stored in the database in such a manner that the data in the first table is divided into the plurality of data ranges. In some embodiments, the computing device may also determine N first data ranges corresponding to the first table in the following manner.
In a second mode, if the storage mode is to store the data in the first table in the leaf nodes of a tree, the step S303-B includes the following steps:
S303-B-21, obtaining M leaf nodes corresponding to a primary key from a database, wherein M is a positive integer greater than 1;
S303-B-22, merging the data included in the M leaf nodes to obtain N first data ranges.
In this second mode, it is assumed that the database according to the embodiment of the present application is a database that performs data organization based on the b+ tree, and at this time, the data in the first table are all stored in leaf nodes of the b+ tree. Thus, after determining the primary key corresponding to the first table based on the steps, the computing device queries the b+ tree based on the primary key, and further queries M leaf nodes corresponding to the primary key from the database.
In some embodiments, at least one of the M leaf nodes includes only one data in the first table.
In some embodiments, at least one of the M leaf nodes includes a plurality of data in the first table.
Because less data is stored in the leaf nodes, after the computing equipment acquires M leaf nodes corresponding to the primary key from the database, the data contained in the M leaf nodes are combined to obtain N first data ranges.
In one example, the data included in every preset number of leaf nodes among the M leaf nodes is combined into one first data range, so as to obtain N first data ranges. In this example, if M is an integer multiple of the preset number and each of the M leaf nodes includes the same amount of data, each of the determined N first data ranges also includes the same amount of data. If the amount of data included in at least one of the M leaf nodes differs from that of the others, the amount of data included in at least one of the determined N first data ranges may differ from that of the other first data ranges.
In another example, the data included in the M leaf nodes are combined to form N first data ranges, each of which includes the same amount of data. For example, if the size of data included in one of the M leaf nodes is large, the data in the leaf node may be used as a first data range without merging with the data in other leaf nodes. For another example, if the amount of data included in one of the M leaf nodes is small, the data included in the several leaf nodes may be combined into a first data range. In this example, the data amount included in each of the N first data ranges is the same, so that load balancing can be achieved when data in the N first data ranges are migrated in parallel.
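As a rough sketch of the first example (a preset number of leaf nodes per range), the following groups consecutive leaf nodes into first data ranges. Leaf nodes are modeled as non-empty sorted lists of keys, which is an assumption made for illustration; `merge_leaves` and `per_range` are hypothetical names.

```python
def merge_leaves(leaves: list[list[int]], per_range: int) -> list[tuple[int, int]]:
    """Combine every `per_range` consecutive leaf nodes into one closed key range."""
    ranges = []
    for i in range(0, len(leaves), per_range):
        group = leaves[i:i + per_range]
        # First key of the first leaf and last key of the last leaf bound the range.
        ranges.append((group[0][0], group[-1][-1]))
    return ranges


leaves = [[1, 2], [3, 4], [5, 6], [7, 8]]      # M = 4 leaf nodes in key order
print(merge_leaves(leaves, per_range=2))        # [(1, 4), (5, 8)]
```

When M is not a multiple of the preset number, the last range in this sketch simply covers fewer leaf nodes.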
The first and second modes describe the procedure of determining the N first data ranges corresponding to the first table according to the embodiment of the present application. In the embodiment of the present application, in addition to the first and second modes, N first data ranges corresponding to the first table may be determined in other modes, for example, the first table is stored in a database in a form of a complete table, so that the first table may be divided into N first data ranges.
The computing device determines N first data ranges corresponding to the first table based on the above method, and then executes the following step S304.
S304, migrating the data in the N first data ranges to the second table in parallel through the preset P parallel resources.
Wherein P is a positive integer greater than 1.
In the embodiment of the application, the computing device determines the N first data ranges corresponding to the first table through the method, so that the computing device parallelly migrates the data in the determined N first data ranges to the second table through the preset P parallel resources.
The embodiment of the application does not limit the specific mode that the computing equipment transfers the data in the N first data ranges to the second table in parallel through the preset P parallel resources.
In some embodiments, if N is greater than P, that is, the number N of the first data ranges corresponding to the first table is greater than the number P of the preset parallel resources, and the N first data ranges are expected to be processed in a single pass, the step S304 includes the following steps S304-A1 and S304-A2:
S304-A1, merging the N first data ranges into P third data ranges, wherein the P third data ranges are not overlapped with each other;
S304-A2, the P third data ranges are allocated to the P parallel resources one by one, so that the P parallel resources migrate the data in the P third data ranges to the second table in parallel.
In this embodiment, the computing device merges the N first data ranges to obtain P third data ranges, so that the number P of the third data ranges is the same as the number P of the parallel resources, and further the P third data ranges may be allocated to the P parallel resources one by one, so that one parallel resource processes data in one third data range. Therefore, the data in the first table can be migrated to the second table in parallel at one time, and the speed of changing the table structure is improved.
The embodiment of the application does not limit the specific way of merging the N first data ranges into the P third data ranges.
In one possible implementation manner, at least 2 first data ranges adjacent to the logic range in the N first data ranges are combined to obtain P third data ranges.
For example, the left and right sections of the two first data ranges are integrated, such as the first data ranges [1,3] and the first data ranges [4,5] are combined, to obtain the combined third data ranges [1,5].
As another example, let N=3, and let the 3 first data ranges be the first data range 1, the first data range 2, and the first data range 3 in fig. 5, respectively. Assuming that P=2, the 3 first data ranges are combined into 2 third data ranges. Specifically, the first data range 1 and the first data range 2 may be combined into one third data range, with the first data range 3 taken as a separate third data range. Alternatively, the first data range 2 and the first data range 3 may be combined into one third data range, with the first data range 1 taken as a separate third data range. It should be noted that the first data range 1 and the first data range 3 cannot be merged, because the third data range obtained by merging them would cover the first data range 2, and the data of the first data range 2 would then be written into the second table repeatedly.
In another possible implementation, the step S304-A1 includes the steps of S304-A11 and S304-A12 as follows:
S304-A11, sequencing the N first data ranges;
S304-A12, merging at least two adjacent first data ranges in the N sequenced first data ranges to obtain P third data ranges.
In this implementation, the computing device sorts the N first data ranges corresponding to the first table by their intervals, e.g., from small to large or from large to small. For example, when the N first data ranges are sorted from small to large, the left endpoint of each first data range is not smaller than the right endpoint of the preceding first data range. For another example, when the N first data ranges are sorted from large to small, the left endpoint of each first data range is not larger than the right endpoint of the preceding first data range.
After the computing equipment sorts the N first data ranges, merging at least two adjacent first data ranges in the sorted N first data ranges to obtain P third data ranges with non-overlapping intervals.
The embodiment of the application does not limit the specific mode of merging the N first data ranges after sequencing.
In one example, at least two arbitrary adjacent first data ranges in the N sorted first data ranges are combined to obtain P third data ranges.
For example, n=10 and p=2, and then, from the 10 first data ranges after sorting, any adjacent first data ranges are combined to obtain 2 third data ranges. For example, the first 8 first data ranges of the 10 first data ranges after sorting are merged into one third data range, and the second 2 first data ranges are merged into one third data range.
In one possible implementation of this example, the N sorted first data ranges are combined to obtain P third data ranges based on the size of the data amount. For example, adjacent first data ranges with smaller data volume in the N sorted first data ranges are combined, and the first data ranges with larger data volume are not combined, so that the data volume included in each third data range in the generated P third data ranges is the same or basically the same, and load balancing is further achieved.
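One way to merge by data volume is a greedy pass over the sorted ranges, closing a group once it reaches its share of the remaining volume. The sketch below is an assumed heuristic rather than the method of the embodiment; it takes the per-range data sizes in sorted order, requires N >= P, and returns the indices of the first data ranges that form each third data range. The name `merge_balanced` is hypothetical.

```python
def merge_balanced(sizes: list[int], p: int) -> list[list[int]]:
    """Split indices 0..n-1 into p contiguous groups with roughly equal total size."""
    n = len(sizes)
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= number of ranges")
    groups, current, acc = [], [], 0
    remaining_total = sum(sizes)
    for i, size in enumerate(sizes):
        current.append(i)
        acc += size
        groups_left = p - len(groups)        # groups still to produce, including this one
        ranges_left = n - i - 1              # ranges not yet visited
        target = remaining_total / groups_left
        # Close the group when it reached its share, or when every remaining
        # range is needed so that each remaining group gets at least one range.
        if groups_left > 1 and (acc >= target or ranges_left == groups_left - 1):
            groups.append(current)
            remaining_total -= acc
            current, acc = [], 0
    groups.append(current)
    return groups


print(merge_balanced([1, 1, 1, 1, 4, 4], p=3))  # [[0, 1, 2, 3], [4], [5]]
```

Only adjacent ranges are ever grouped, so the resulting third data ranges do not overlap.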
In one possible implementation of this example, the N sorted first data ranges may be combined based on the performance of the parallel resources. For example, if P=2, the performance of parallel resource 1 is weaker, and the performance of parallel resource 2 is stronger, then the third data range obtained by combining more of the N sorted first data ranges may be assigned to parallel resource 2, and the third data range obtained by combining fewer first data ranges, or a first data range that is not combined, may be assigned to parallel resource 1.
In another example, the computing device determines a first value based on the number of the first data ranges and the number of the parallel resources, and merges every first-value number of first data ranges among the N sorted first data ranges into one third data range, to obtain P third data ranges.
In this example, every group of the same number of first data ranges, namely the first value, among the N sorted first data ranges is merged into one third data range.
The embodiment of the application does not limit the specific mode of determining the first numerical value based on the number of the first data range and the number of the parallel resources.
For example, a value obtained by rounding down the ratio of the number of the first data ranges to the number of the parallel resources is determined as a first numerical value.
Illustratively, the first value is determined based on the following formula:
M=Floor(N/P) (1)
wherein M is a first value, and Floor is rounded down.
For example, let n=10, p=3, and let the value rounded down based on the ratio of the number of first data ranges 10 and the number of parallel resources 3 be 3, i.e. the first value is 3. At this time, the computing device merges every 3 first data ranges of the 10 first data ranges after sorting into one third data range, and finally, 1 first data range remains, and merges the remaining 1 first data range into the last third data range, so as to finally obtain 3 third data ranges.
For another example, a value obtained by rounding up the ratio of the number of the first data ranges to the number of the parallel resources is determined as the first numerical value.
Illustratively, the first value is determined based on the following formula:
M=ceil(N/P) (2)
wherein M is a first value and ceil is rounded up.
For example, let n=10, p=3, and let 4 be the value rounded up based on the ratio of the number of first data ranges 10 and the number of parallel resources 3, i.e. the first value is 4. At this time, the computing device merges every 4 first data ranges of the 10 first data ranges after sorting into one third data range, and finally, the remaining 2 first data ranges are in one third data range, so as to finally obtain 3 third data ranges.
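Both rounding variants can be checked with a few lines of code. The sketch below works on indices of the sorted first data ranges and assumes N >= P; folding every leftover range into the last group when rounding down generalizes the N=10, P=3 example and is an assumption. The name `merge_by_count` is hypothetical.

```python
import math


def merge_by_count(n: int, p: int, round_up: bool) -> list[list[int]]:
    """Group indices 0..n-1 into chunks of M = floor(n/p) or M = ceil(n/p)."""
    m = math.ceil(n / p) if round_up else n // p
    groups = [list(range(i, min(i + m, n))) for i in range(0, n, m)]
    # Rounding down can leave extra chunks; fold them into the last of the p groups.
    while len(groups) > p:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


print([len(g) for g in merge_by_count(10, 3, round_up=False)])  # [3, 3, 4]
print([len(g) for g in merge_by_count(10, 3, round_up=True)])   # [4, 4, 2]
```

Note that with rounding up, some inputs (for example N=4, P=3) yield fewer than P groups in this sketch, in which case some parallel resources would stay unused.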
The above embodiment describes a process of merging N first data ranges into P third data ranges when N is greater than P, so that P parallel resources migrate data in the P third data ranges to the second table in parallel.
In some embodiments, if N is greater than P, the parallel migration may be further performed by the following manner, where S304 includes the following steps:
S304-B1, obtaining P first data ranges from N first data ranges;
S304-B2, distributing the P first data ranges to the P parallel resources one by one, so that the P parallel resources migrate the data in the P first data ranges to the second table in parallel;
S304-B3, when the existence of idle parallel resources in the P parallel resources is detected, the rest first data ranges in the N first data ranges are allocated to the idle parallel resources, so that the idle parallel resources migrate the data in the rest first data ranges to the second table, and the rest first data ranges are the first data ranges except the P first data ranges in the N first data ranges.
In this implementation, no additional processing such as sorting or merging is required; instead, each first data range serves as a unit of parallelism. Specifically, P first data ranges are obtained from the N first data ranges while the other first data ranges wait, and the P first data ranges are allocated to the P parallel resources one by one, so that the P parallel resources migrate the data in the P first data ranges to the second table in parallel. When an idle parallel resource is detected among the P parallel resources, for example because the data amount of first data range 1 is small and its parallel resource finishes and is returned first, a remaining first data range can be allocated to the idle parallel resource, so that the idle parallel resource migrates the data in that remaining first data range into the second table. In this implementation, once a first data range with a smaller data amount finishes using a parallel resource, the resource is released for use by a subsequent first data range, which avoids wasting parallel resources as far as possible and further improves the efficiency of changing the table structure.
The above describes the specific process by which, when N is greater than P, the computing device migrates the data in the N first data ranges to the second table in parallel through the preset P parallel resources.
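The scheduling in steps S304-B1 to S304-B3 resembles a fixed-size worker pool: P ranges start at once and each worker picks up the next waiting range when it becomes idle. The following sketch uses Python threads as the parallel resources and in-memory dictionaries as the two tables, both of which are stand-ins chosen for illustration; `migrate_all` is a hypothetical name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def migrate_all(source: dict[int, int], ranges: list[tuple[int, int]], p: int) -> dict[int, int]:
    """Copy rows of `source` into a new table, one closed key range per task, p at a time."""
    target: dict[int, int] = {}
    lock = threading.Lock()

    def migrate_range(rng: tuple[int, int]) -> int:
        lo, hi = rng
        rows = {k: v for k, v in source.items() if lo <= k <= hi}
        with lock:                      # serialize writes to the shared target table
            target.update(rows)
        return len(rows)

    # The pool runs at most p tasks concurrently; waiting ranges are handed
    # to whichever worker becomes idle first.
    with ThreadPoolExecutor(max_workers=p) as pool:
        list(pool.map(migrate_range, ranges))
    return target


first_table = {k: k * 10 for k in range(1, 11)}
second_table = migrate_all(first_table, [(1, 2), (3, 3), (4, 7), (8, 10)], p=2)
print(second_table == first_table)  # True
```

Because the ranges are non-overlapping and together cover the first table, the target ends up with exactly the rows of the source regardless of the order in which the workers finish.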
In some embodiments, if N is less than P, the computing device may obtain N parallel resources from the P parallel resources, and allocate the N first data ranges to the N parallel resources one by one, so that the N parallel resources migrate the data in the N first data ranges to the second table in parallel. For example, when selecting parallel resources, selection may be performed based on the data amount of the first data range, for example, when the data amount of the first data range is large, parallel resources with relatively strong performance may be selected, and when the data amount of the first data range is small, parallel resources with general performance may be selected, so that N parallel resources may synchronously migrate data in N first data ranges into the second table, thereby avoiding the problem that when parallel resources with poor performance migrate data in the first data range with relatively large data amount, a great amount of time is spent, and the efficiency of the whole table structure change is affected.
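A simple way to pair ranges with resources when N does not exceed P is to give the largest range to the strongest resource, the second largest to the second strongest, and so on. The sketch below assumes that a data size per range and a relative performance score per resource are available; the names `assign_ranges`, `range_sizes`, and `resource_speed` are hypothetical.

```python
def assign_ranges(range_sizes: dict[str, int], resource_speed: dict[str, float]) -> dict[str, str]:
    """Map each first data range to one parallel resource, largest range to fastest resource."""
    if len(range_sizes) > len(resource_speed):
        raise ValueError("this sketch expects N <= P")
    ranges = sorted(range_sizes, key=range_sizes.get, reverse=True)
    resources = sorted(resource_speed, key=resource_speed.get, reverse=True)
    # zip stops after N pairs, so the P - N slowest resources stay unused.
    return dict(zip(ranges, resources))


plan = assign_ranges({"range1": 10, "range2": 500}, {"w1": 1.0, "w2": 3.0, "w3": 2.0})
print(plan)  # {'range2': 'w2', 'range1': 'w3'}
```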
According to the database table structure changing method provided by the embodiment of the application, the computing equipment responds to the table structure changing instruction, and determines a first table of which the table structure is to be changed in the database, wherein the table structure changing instruction is used for indicating to change the table structure of the first table into the table structure of a second table; determining a storage mode of data in a first table in a database; based on a storage mode, N first data ranges corresponding to the first table are determined, data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1; and parallelly migrating the data in the N first data ranges to a second table through P preset parallel resources, wherein P is a positive integer greater than 1. Namely, the embodiment of the application transfers the data in the N first data ranges of the first table to the second table in parallel through the P parallel resources, thereby realizing the rapid change of the table structure and further ensuring the normal operation of the database service.
The above embodiment describes the database table structure changing method provided by the embodiment of the present application from an overall perspective. The method for changing the table structure of the database provided by the embodiment of the application is described below by taking a TDSQL database as an example, where the data in the first table is stored in the database by dividing the data in the first table into a plurality of data ranges.
Fig. 6 is a flowchart of a database table structure changing method according to an embodiment of the present application, and the embodiment shown in fig. 6 can be understood as a specific implementation manner of the embodiment shown in fig. 3. The execution subject of this embodiment is a computing device.
As shown in fig. 6, the method of the embodiment of the present application includes:
S401, the client sends a table structure change instruction to the computing device.
The table structure change instruction is for instructing to change the table structure of the first table to the table structure of the second table.
S402, the computing device, in response to the table structure change instruction, determines a first table in the database whose table structure is to be changed.
S403, the computing device determines a primary key corresponding to the first table.
For specific descriptions of S401 to S403, refer to the descriptions of the above embodiments; they are not repeated here.
S404, the computing device acquires N second data ranges comprising the primary key from the storage nodes of the database.
S405, the computing device determines N first data ranges based on the N second data ranges.
For example, if each of the N second data ranges does not include data other than the data in the first table, the N second data ranges are determined to be N first data ranges.
For another example, if at least one of the N second data ranges includes data other than the data in the first table, then the data entries other than the data entries including the primary key are removed from the data entries included in the N second data ranges, resulting in N first data ranges.
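The filtering in S405 can be illustrated with a short sketch. It assumes, purely for illustration, that every stored key begins with a prefix identifying the table and primary key it belongs to; this key layout and the name `to_first_ranges` are hypothetical and are not stated in the application.

```python
def to_first_ranges(second_ranges, table_prefix):
    """Drop entries that do not belong to the first table from every range."""
    first_ranges = []
    for entries in second_ranges:
        kept = [(key, row) for key, row in entries if key.startswith(table_prefix)]
        first_ranges.append(kept)
    return first_ranges


# Hypothetical key layout: "<table>/<primary key value>"
second_ranges = [
    [("t1/001", "a"), ("t1/002", "b")],
    [("t1/003", "c"), ("t2/001", "x")],  # this range also holds a row of another table
]
first_ranges = to_first_ranges(second_ranges, "t1/")
```

The number of ranges is unchanged; only the foreign entry in the second range is removed.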
S406, the computing device determines whether N is greater than P.
If N is greater than P, i.e., the number of first data ranges N is greater than the number of parallel resources P, the step of S407 or S408 is performed.
If N is less than or equal to P, the following step S409 is performed.
S407, the computing device merges the N first data ranges into P third data ranges, and allocates the P third data ranges to the P parallel resources one by one, so that the P parallel resources migrate the data in the P third data ranges to the second table in parallel.
The specific description of S407 may refer to the description of the above embodiment, and will not be repeated here.
S408, the computing device acquires P first data ranges from the N first data ranges; distributing the P first data ranges to the P parallel resources one by one, so that the P parallel resources migrate the data in the P first data ranges to the second table in parallel; and when the idle parallel resources exist in the P parallel resources, the rest first data ranges in the N first data ranges are allocated to the idle parallel resources, so that the idle parallel resources migrate the data in the rest first data ranges to the second table.
The remaining first data ranges are first data ranges except for the P first data ranges in the N first data ranges.
S409, the computing device acquires N parallel resources from the P parallel resources; and allocating the N first data ranges to the N parallel resources one by one, so that the N parallel resources migrate the data in the N first data ranges to the second table in parallel.
The specific description of S409 may refer to the description of the foregoing embodiments, and will not be repeated here.
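The branching in S406 to S409 can be summarized in a few lines. This sketch follows S407 and S408, both of which handle more ranges than resources, so the N > P branch leads to them; the function name `choose_step` and the `merge` flag that picks between S407 and S408 are assumptions made only for illustration.

```python
def choose_step(n_ranges, p_resources, merge=True):
    """Return the label of the migration step to run for N ranges and P resources."""
    if n_ranges > p_resources:
        # More ranges than resources: merge into P ranges (S407)
        # or hand ranges to resources as they become idle (S408).
        return "S407" if merge else "S408"
    # N <= P: use N of the P resources, one range each (S409).
    return "S409"


steps = [choose_step(10, 4), choose_step(10, 4, merge=False), choose_step(3, 4), choose_step(4, 4)]
```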
In the embodiment of the application, if the data in the first table is stored in the database by dividing it into a plurality of data ranges, for example if the database is a TDSQL database, the computing device determines the N first data ranges corresponding to the first table and then selects the parallel migration mode based on the number P of parallel resources and the number N of first data ranges. This ensures that the data in the N first data ranges corresponding to the first table is quickly migrated to the second table in parallel, realizing a quick change of the table structure.
In some embodiments, the table structure change operation may also be canceled during the table structure change.
In one example, a computing device receives a first request, the first request being used to indicate cancellation of the table structure change operation of the first table; in response to the first request, the computing device controls the P parallel resources to stop migrating data of the first table into the second table. For example, as shown in fig. 7, a user may operate the parallel resources through the application client. If, during the table structure change of the first table, the user sends the first request through the application client to request cancellation of the table structure change operation of the first table, the computing device responds to the first request by controlling the P parallel resources to stop migrating data of the first table to the second table. Optionally, the application client may run on the computing device or on another terminal device.
In another example, the present application may configure one signal processing apparatus for each of the P parallel resources, where the signal processing apparatus may be hardware, software, or a module that combines both. The client may communicate directly with the signal processing apparatus without going through the computing device. At this time, as shown in fig. 8, the user sends a first request to the signal processing device in the parallel resource through the client, and the signal processing device stops migrating the data of the first table to the second table based on the first request.
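A minimal sketch of cancellation follows, assuming the parallel resources are Python threads and the cancel signal is a shared `threading.Event` that each worker checks between rows. The names `migrate_range` and `run_migration` are hypothetical, and the application does not prescribe this mechanism.

```python
import threading


def migrate_range(rows, target, lock, cancel):
    """Copy rows one at a time, checking the cancel flag before each row."""
    for key, value in rows:
        if cancel.is_set():
            return False  # stop migrating as soon as cancellation is seen
        with lock:
            target[key] = value
    return True


def run_migration(ranges, cancel):
    target, lock = {}, threading.Lock()
    results = [None] * len(ranges)

    def work(i, rows):
        results[i] = migrate_range(rows, target, lock, cancel)

    threads = [threading.Thread(target=work, args=(i, rows)) for i, rows in enumerate(ranges)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return target, results


ranges = [[(1, "a"), (2, "b")], [(3, "c")]]
completed, ok = run_migration(ranges, threading.Event())  # no cancel request

cancelled = threading.Event()
cancelled.set()                                           # cancel requested up front
partial, stopped = run_migration(ranges, cancelled)
```

In a real system the event would be set while the workers are running; it is set before the start here only so that the outcome is deterministic.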
In some embodiments, fault detection may also be performed.
In one example, a computing device, upon detecting a failure of a first parallel resource of the P parallel resources, sends failure information to a client, the failure information being used to indicate the failure of the first parallel resource. For example, as shown in fig. 9, the computing device monitors the parallel resources in real time, and when a parallel resource failure is detected, for example a system failure or a data failure, the failure information is reported to the application client. The user or scheduler then decides, based on the type of error, whether to ignore the error or to send a signal to all parallel resources to cancel the change task.
In another example, the present application may configure an error handling device for each of the P parallel resources, which may be hardware or software or a combination of both. The client may communicate directly with the error handling apparatus without going through the computing device. At this time, as shown in fig. 10, each parallel resource may face various failures in processing its own task, including but not limited to a system failure, a data failure, and the like. The parallel resources need to rely on error handling means for error handling. For example, if the parallel resource 1 encounters a system failure, the error handling device captures the failure and reports it to the application client. Illustratively, the user or scheduler decides whether to ignore the error or signal to cancel the change task to all parallel resources by the signal processing means depending on the type of error.
In some embodiments, if the user or scheduler never receives an error report and never actively sends a signal to cancel the task, then once all of the parallel resources complete their tasks, the parallel change is complete and the user or scheduler records the success status of the parallel change. If any one of the parallel resources fails to complete its task because of an exception, all of the parallel resources stop working, because the migrated data would be incomplete. Likewise, if the scheduler does not record the success status of the change due to a system factor such as a power failure, the whole table structure change fails, and the system rolls back the data to the state before the table structure change.
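The all-or-nothing outcome described above can be sketched with Python's standard `concurrent.futures` module. The function `change_table`, the staging dictionary standing in for the second table, and the simulated failure are assumptions for illustration; a real rollback would act on the database itself.

```python
from concurrent.futures import ThreadPoolExecutor


def change_table(ranges, migrate, num_workers):
    """Return (success, second_table); any worker failure discards all staged rows."""
    staged = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(migrate, r) for r in ranges]
        try:
            for f in futures:
                staged.update(f.result())  # re-raises the worker's exception, if any
        except Exception:
            for f in futures:
                f.cancel()                 # drop tasks that have not started yet
            return False, {}               # roll back: keep nothing that was staged
    return True, staged                    # success status recorded by the caller


def copy_ok(r):
    return {k: f"row-{k}" for k in range(*r)}


def copy_flaky(r):
    if r == (2, 4):
        raise RuntimeError("simulated system failure")
    return copy_ok(r)


success_a, table_a = change_table([(0, 2), (2, 4)], copy_ok, 2)
success_b, table_b = change_table([(0, 2), (2, 4)], copy_flaky, 2)
```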
It should be understood that fig. 3-10 are only examples of the present application and should not be construed as limiting the present application.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments, and various simple modifications can be made to the technical solution of the present application within the scope of the technical concept of the present application, and all the simple modifications belong to the protection scope of the present application. For example, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described further. As another example, any combination of the various embodiments of the present application may be made without departing from the spirit of the present application, which should also be regarded as the disclosure of the present application.
The method embodiment of the present application is described in detail above with reference to fig. 3 to 10, and the apparatus embodiment of the present application is described in detail below with reference to fig. 11 to 12.
Fig. 11 is a schematic block diagram of a database table structure changing apparatus according to an embodiment of the present application. The apparatus 10 may be the computing device described above or be part of a computing device.
As shown in fig. 11, the database table structure changing apparatus 10 includes:
a table determining unit 11 configured to determine a first table of which a table structure is to be changed in a database in response to a table structure changing instruction for instructing to change the table structure of the first table to a table structure of a second table;
a storage mode determining unit 12 configured to determine a storage mode of the data in the first table in the database;
a processing unit 13, configured to determine N first data ranges corresponding to the first table based on the storage manner, where data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1;
and the changing unit 14 is configured to migrate, in parallel, data in the N first data ranges to the second table through P preset parallel resources, where P is a positive integer greater than 1.
In some embodiments, the processing unit 13 is specifically configured to determine a primary key corresponding to the first table; and determining the N first data ranges corresponding to the first table based on the storage mode and the primary key.
In some embodiments, the processing unit 13 is specifically configured to obtain N second data ranges including the primary key from a storage node of the database if the storage manner is to divide the data in the first table into a plurality of data ranges for storage; the N first data ranges are determined based on the N second data ranges.
In some embodiments, the processing unit 13 is specifically configured to determine the N second data ranges as the N first data ranges if each of the N second data ranges does not include data other than the data in the first table.
In some embodiments, the processing unit 13 is specifically configured to, if at least one second data range of the N second data ranges includes data other than the data in the first table, discard data entries other than the data entries including the primary key from the data entries included in the N second data ranges, and obtain the N first data ranges.
In some embodiments, the processing unit 13 is specifically configured to obtain, from the database, M leaf nodes corresponding to the primary key if the storage manner is to store the data in the first table in the leaf nodes of the tree, where M is a positive integer greater than 1; and merging the data included by the M leaf nodes to obtain the N first data ranges.
In some embodiments, the changing unit 14 is specifically configured to combine the N first data ranges into P third data ranges if the N is greater than the P, where the P third data ranges are not overlapped with each other; and allocating the P third data ranges to the P parallel resources one by one, so that the P parallel resources migrate the data in the P third data ranges to the second table in parallel.
In some embodiments, the changing unit 14 is specifically configured to sort the N first data ranges; and merging at least two adjacent first data ranges in the N sequenced first data ranges to obtain the P third data ranges.
In some embodiments, the changing unit 14 is specifically configured to determine a first value based on the number of the first data ranges and the number of the parallel resources; and merge each group of adjacent first data ranges, the group size being the first value, among the N sorted first data ranges into one third data range to obtain the P third data ranges.
In some embodiments, the changing unit 14 is specifically configured to determine, as the first value, a value obtained by rounding up the ratio of the number of the first data ranges to the number of the parallel resources.
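The sort-and-merge step can be illustrated as follows. This sketch assumes that the first value is the number of first data ranges divided by the number of parallel resources, rounded up, since that is the group size that keeps the number of merged ranges at or below P when N is greater than P. It also assumes the ranges are contiguous half-open key intervals; the name `merge_ranges` is hypothetical.

```python
import math


def merge_ranges(first_ranges, p):
    """Merge sorted, adjacent (start, end) ranges into at most p larger ranges."""
    ordered = sorted(first_ranges)           # sort by start key
    group = math.ceil(len(ordered) / p)      # the "first value" (assumed ceil(N / P))
    merged = []
    for i in range(0, len(ordered), group):
        chunk = ordered[i:i + group]
        # adjacent ranges: the merged range spans the first start to the last end
        merged.append((chunk[0][0], chunk[-1][1]))
    return merged


third_ranges = merge_ranges([(40, 50), (0, 10), (10, 20), (30, 40), (20, 30)], p=2)
```

With N = 5 and P = 2 the group size is 3, giving the ranges `(0, 30)` and `(30, 50)`. For some inputs, such as N = 9 and P = 4, this grouping yields fewer than P merged ranges.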
In some embodiments, the changing unit 14 is specifically configured to obtain P first data ranges from the N first data ranges if the N is greater than the P; distributing the P first data ranges to the P parallel resources one by one, so that the P parallel resources parallelly migrate the data in the P first data ranges to the second table; and when detecting that the idle parallel resources exist in the P parallel resources, distributing the rest first data ranges in the N first data ranges to the idle parallel resources so that the idle parallel resources migrate the data in the rest first data ranges into the second table, wherein the rest first data ranges are the first data ranges except the P first data ranges in the N first data ranges.
In some embodiments, the changing unit 14 is specifically configured to obtain N parallel resources from the P parallel resources if the N is less than or equal to the P; and distributing the N first data ranges to the N parallel resources one by one, so that the N parallel resources migrate the data in the N first data ranges to the second table in parallel.
In some embodiments, the processing unit 13 is further configured to receive a first request, where the first request is used to instruct cancellation of a table structure change operation of the first table; and in response to the first request, controlling the P parallel resources to stop migrating the data of the first table into the second table.
In some embodiments, the processing unit 13 is further configured to send failure information to the client when a failure of a first parallel resource of the P parallel resources is detected, where the failure information is used to indicate the failure of the first parallel resource.
It should be understood that apparatus embodiments and method embodiments may correspond with each other and that similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the apparatus shown in fig. 11 may perform an embodiment of a database table structure changing method, and the foregoing and other operations and/or functions of each module in the apparatus are respectively for implementing the foregoing method embodiment, which is not repeated herein for brevity.
The apparatus of the embodiments of the present application is described above in terms of functional modules with reference to the accompanying drawings. It should be understood that the functional module may be implemented in hardware, or may be implemented by instructions in software, or may be implemented by a combination of hardware and software modules. Specifically, each step of the method embodiment in the embodiment of the present application may be implemented by an integrated logic circuit of hardware in a processor and/or an instruction in a software form, and the steps of the method disclosed in connection with the embodiment of the present application may be directly implemented as a hardware decoding processor or implemented by a combination of hardware and software modules in the decoding processor. Alternatively, the software modules may be located in a well-established storage medium in the art such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and the like. The storage medium is located in a memory, and the processor reads information in the memory, and in combination with hardware, performs the steps in the above method embodiments.
FIG. 12 is a schematic block diagram of a computing device for performing the above-described method embodiments provided by an embodiment of the present application.
As shown in fig. 12, the computing device 30 may include:
a memory 31 and a processor 32, the memory 31 being arranged to store a computer program 33 and to transmit the program code 33 to the processor 32. In other words, the processor 32 may call and run the computer program 33 from the memory 31 to implement the method in an embodiment of the application.
For example, the processor 32 may be configured to perform the above-described method steps according to instructions in the computer program 33.
In some embodiments of the present application, the processor 32 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 31 includes, but is not limited to:
volatile memory and/or nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
In some embodiments of the present application, the computer program 33 may be divided into one or more modules that are stored in the memory 31 and executed by the processor 32 to perform the database table structure changing method provided by the present application. The one or more modules may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program 33 in the computing device.
As shown in fig. 12, the computing device 30 may further include:
a transceiver 34, the transceiver 34 being connectable to the processor 32 or the memory 31.
The processor 32 may control the transceiver 34 to communicate with other devices, and in particular, may send information or data to other devices or receive information or data sent by other devices. The transceiver 34 may include a transmitter and a receiver. The transceiver 34 may further include antennas, the number of which may be one or more.
It should be appreciated that the various components in the computing device 30 are connected by a bus system that includes a power bus, a control bus, and a status signal bus in addition to a data bus.
According to an aspect of the present application, there is provided a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments.
The present application also provides a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of the method embodiment described above.
According to another aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the electronic device to perform the method of the above-described method embodiments.
In other words, when implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions in accordance with embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, functional modules in various embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (15)

1. A database table structure changing method, comprising:
determining a first table of which the table structure is to be changed in a database in response to a table structure change instruction, wherein the table structure change instruction is used for indicating to change the table structure of the first table into the table structure of a second table;
determining a storage mode of the data in the first table in the database;
based on the storage mode, N first data ranges corresponding to the first table are determined, data included in the N first data ranges form data in the first table, and N is a positive integer greater than 1;
and parallelly migrating the data in the N first data ranges to the second table through P preset parallel resources, wherein P is a positive integer greater than 1.
2. The method of claim 1, wherein determining N first data ranges corresponding to the first table based on the storage manner comprises:
determining a primary key corresponding to the first table;
and determining the N first data ranges corresponding to the first table based on the storage mode and the primary key.
3. The method of claim 2, wherein determining the N first data ranges corresponding to the first table based on the storage manner and the primary key comprises:
if the storage mode is that the data in the first table is divided into a plurality of data ranges for storage, acquiring N second data ranges comprising the primary key from a storage node of the database;
and determining the N first data ranges based on the N second data ranges.
4. The method of claim 3, wherein the determining the N first data ranges based on the N second data ranges comprises:
and if each of the N second data ranges does not include data other than the data in the first table, determining the N second data ranges as the N first data ranges.
5. The method of claim 3, wherein the determining the N first data ranges based on the N second data ranges comprises:
and if at least one second data range of the N second data ranges comprises data other than the data in the first table, removing data entries other than the data entries comprising the primary key from the data entries included in the N second data ranges to obtain the N first data ranges.
6. The method of claim 2, wherein determining the N first data ranges corresponding to the first table based on the storage manner and the primary key comprises:
if the storage mode is that the data in the first table is stored in the leaf nodes of the tree, M leaf nodes corresponding to the primary key are obtained from the database, wherein M is a positive integer greater than 1;
And merging the data included by the M leaf nodes to obtain the N first data ranges.
7. The method according to any one of claims 1-6, wherein if the N is greater than the P, the concurrently migrating the data in the N first data ranges to the second table through the preset P parallel resources includes:
merging the N first data ranges into P third data ranges, wherein the P third data ranges are not overlapped with each other;
and allocating the P third data ranges to the P parallel resources one by one, so that the P parallel resources migrate the data in the P third data ranges to the second table in parallel.
8. The method of claim 7, wherein the merging the N first data ranges into P third data ranges comprises:
sorting the N first data ranges;
and merging at least two adjacent first data ranges in the N sequenced first data ranges to obtain the P third data ranges.
9. The method of claim 8, wherein merging at least two adjacent first data ranges from the N sorted first data ranges to obtain the P third data ranges, comprises:
determining a first numerical value based on the number of the first data ranges and the number of the parallel resources;
and merging each group of adjacent first data ranges, the group size being the first numerical value, among the N sorted first data ranges into one third data range to obtain the P third data ranges.
10. The method of claim 9, wherein the determining a first value based on the number of the first data ranges and the number of parallel resources comprises:
and determining a value obtained by rounding up the ratio of the number of the first data ranges to the number of the parallel resources as the first numerical value.
11. The method according to any one of claims 1-6, wherein if the N is greater than the P, the concurrently migrating the data in the N first data ranges to the second table through the preset P parallel resources includes:
p first data ranges are obtained from the N first data ranges;
distributing the P first data ranges to the P parallel resources one by one, so that the P parallel resources parallelly migrate the data in the P first data ranges to the second table;
And when detecting that the idle parallel resources exist in the P parallel resources, distributing the rest first data ranges in the N first data ranges to the idle parallel resources so that the idle parallel resources migrate the data in the rest first data ranges into the second table, wherein the rest first data ranges are the first data ranges except the P first data ranges in the N first data ranges.
12. The method according to any one of claims 1-6, wherein if the N is less than or equal to the P, the concurrently migrating the data in the N first data ranges to the second table through the preset P parallel resources includes:
acquiring N parallel resources from the P parallel resources;
and distributing the N first data ranges to the N parallel resources one by one, so that the N parallel resources migrate the data in the N first data ranges to the second table in parallel.
13. A database table structure changing apparatus, comprising:
a table determining unit, configured to determine, in response to a table structure changing instruction, a first table in a database whose table structure is to be changed, wherein the table structure changing instruction instructs changing the table structure of the first table to the table structure of a second table;
a storage mode determining unit, configured to determine a storage mode of the data of the first table in the database;
a processing unit, configured to determine, based on the storage mode, N first data ranges corresponding to the first table, wherein the data included in the N first data ranges together form the data in the first table, and N is a positive integer greater than 1;
and a changing unit, configured to migrate the data in the N first data ranges to the second table in parallel through P preset parallel resources, wherein P is a positive integer greater than 1.
14. An electronic device comprising a processor and a memory;
the memory is used for storing a computer program;
the processor is used for executing the computer program to implement the method of any one of claims 1 to 12.
15. A computer readable storage medium for storing a computer program for causing a computer to perform the method of any one of the preceding claims 1 to 12.
CN202211221538.6A 2022-09-30 2022-09-30 Database table structure changing method, device, equipment and storage medium Pending CN117009346A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211221538.6A CN117009346A (en) 2022-09-30 2022-09-30 Database table structure changing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117009346A true CN117009346A (en) 2023-11-07

Family

ID=88573373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211221538.6A Pending CN117009346A (en) 2022-09-30 2022-09-30 Database table structure changing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117009346A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination