WO2024108994A1 - Method for verifying database objects and related device - Google Patents

Method for verifying database objects and related device

Info

Publication number
WO2024108994A1
Authority
WO
WIPO (PCT)
Prior art keywords
target database
database object
data
file
database
Prior art date
Application number
PCT/CN2023/100966
Other languages
English (en)
French (fr)
Inventor
李志学
李建峰
潘迪亚•维沙尔•纳维尼特
乔杜里•苏米亚•兰詹
Original Assignee
华为云计算技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为云计算技术有限公司
Publication of WO2024108994A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/22: Indexing; Data structures therefor; Storage structures

Definitions

  • the present application relates to the field of databases, and in particular to a data processing method, system, computing device cluster, computer-readable storage medium, and computer program product.
  • Database migration includes the migration of database objects (such as stored procedures/functions) and object verification after migration.
  • Object verification can include data verification. Data verification first synchronizes the table data that the database object to be verified depends on from the source database to the target database, creating an environment consistent with the source database. The database object is then run in the target database to obtain its return value and running time, which are compared with the return value and running time obtained on the source database to determine whether the migrated database object remains consistent with the source database in both function and performance.
  • The data synchronization process is generally divided into three stages: data export, data conversion, and data import. Object verification with data can be carried out only after all three stages have executed successfully.
  • Exporting data from the source database and then writing it to the target database through the database interface is inefficient. When the amount of data is large, synchronization may take several hours or longer.
  • The data preparation phase therefore takes too long, which greatly reduces the efficiency of object verification.
  • the present application provides a database object verification method, which exports table data from a source database into a file in advance, and object verification can be initiated at any time without waiting for the table data to be written to the target database, thereby improving the verification efficiency of the database object.
  • the present application also provides a database migration system, a computing device cluster, a computer-readable storage medium, and a computer program product corresponding to the above method.
  • the present application provides a method for verifying a database object.
  • the method can be performed by a database migration system.
  • the database migration system includes a source database and a target database.
  • the source database exports table data on which at least one database object depends to a file.
  • the target database reads the file to obtain table data on which the target database object depends in the at least one database object. Then, the target database verifies the target database object according to the table data on which the target database object depends.
  • the table data can be exported from the source database into a file in advance. Object verification can be initiated at any time without waiting for the table data to be written to the target database, thereby improving the verification efficiency of database objects.
  • The file includes a file header and data. Accordingly, the target database can read the data in the file and then perform format conversion on that data to obtain the table data on which the target database object in the at least one database object depends.
  • When verifying the target database object, the target database reads the data in the file corresponding to the target database object and performs format conversion to obtain the table data that the target database object depends on.
  • files can achieve fast backup, and the data in the files can be reused, achieving the effect of exporting data once and reusing it any number of times.
  • The target database may construct a foreign table according to the data in the file and the structure of the table data on which the target database object depends, thereby reconstructing that table data. Accordingly, when verifying the target database object, the target database can verify it according to the foreign table.
  • This method obtains the foreign table by converting the data format after reading the data in the file. On the one hand, this provides the data needed to verify database objects; on the other hand, it reduces the resource usage of the target database.
  • When verifying the target database object based on the foreign table, the target database may run the target database object to perform one or more of insert, delete, query, update, join, truncate, row-level lock, or transaction operations on the foreign table, obtaining a first return value and a first running time. The target database then verifies the target database object based on the first return value, the first running time, and a second return value and a second running time obtained by running the target database object on the source database.
  • the target database can compare the first return value and the second return value, and compare the first run time and the second run time, so as to verify the target database object. For example, if the first return value is equal to the second return value, it indicates that after the migration of the target database object, the target database and the source database maintain functional consistency. If the difference between the first run time and the second run time is within a preset range, it indicates that after the migration of the target database object, the target database and the source database maintain performance consistency. If the first return value and the second return value are equal, and the difference between the first run time and the second run time is within a preset range, it indicates that the target database object has passed the verification.
  • This method constructs a foreign table and runs the target database object in the target database to perform corresponding operations on the foreign table. This can achieve the same effect as performing corresponding operations in the internal table of the target database. This can improve verification efficiency on the one hand and accuracy on the other.
  • the target database may call a foreign table access plug-in according to the data in the file, and construct a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  • This method constructs a foreign table through a foreign table access plug-in, which can shield differences in database system architectures, has no direct dependency on system architectures, can be applied to various system architectures, and has high availability.
  • the target database includes one or more of PostgreSQL or GaussDB.
  • PostgreSQL or GaussDB supports the development of extension plug-ins, so that the target database can access external files in a foreign table mode based on the developed foreign table access plug-in, providing assistance for verifying database objects based on foreign tables.
  • the data in the file is stored in a binary byte stream format, which can improve the performance of table data export and reduce the space occupied by data storage.
  • the target database can respond to the verification request for the target database object by reading the file and obtaining the table data on which the target database object depends in the at least one database object. That is, the method can verify the corresponding database object on demand, thereby avoiding resource waste and improving resource utilization.
  • The database migration system can be integrated into Database and Application Migration (hereinafter referred to as UGO).
  • UGO can synchronize objects in the source database to a heterogeneous target database, and supports exporting table data that at least one database object in the source database depends on to a file.
  • the target database reads the file to obtain the table data to implement data verification for the synchronized database objects.
  • This method redesigns the data preparation phase of object verification with data, which effectively shortens that phase and thereby improves the efficiency of database object verification.
  • the exported file is usually read-only, and the target database can read the data part in the file multiple times, and perform corresponding format conversion according to the read data part, thereby realizing verification of different database objects.
  • This method does not need to repeat the entire data synchronization process, greatly shortens the synchronization time, improves the efficiency of database object verification, and meets business needs.
  • the present application provides a database migration system, the system comprising:
  • a source database used to export table data on which at least one database object depends to a file
  • the target database is used to read the file, obtain table data on which the target database object in the at least one database object depends, and verify the target database object according to the table data on which the target database object depends.
  • The file includes a file header and data.
  • The target database is specifically used to:
  • read the data in the file and perform format conversion on it to obtain the table data on which the target database object in the at least one database object depends.
  • To perform format conversion on the data in the file, the target database is specifically used to construct a foreign table according to the data in the file and the structure of the table data on which the target database object in the at least one database object depends.
  • To verify the target database object according to the table data on which it depends, the target database is specifically used to:
  • verify the target database object according to the foreign table.
  • The target database is specifically used to:
  • run the target database object to perform one or more of insert, delete, query, update, join, truncate, row-level lock, or transaction operations on the foreign table, and obtain a first return value and a first running time;
  • verify the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
  • The target database is specifically used to:
  • call a foreign table access plug-in according to the data in the file, and construct a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  • The data in the file is stored in a binary byte stream format.
  • The target database is specifically used to:
  • read the file, in response to a verification request for the target database object, to obtain the table data on which the target database object in the at least one database object depends.
  • the present application provides a computing device cluster.
  • the computing device cluster includes at least one computing device, and the at least one computing device includes at least one processor and at least one memory.
  • the at least one processor and the at least one memory communicate with each other.
  • the at least one processor is used to execute instructions stored in the at least one memory, so that the computing device or the computing device cluster executes the database object verification method as described in the first aspect or any implementation of the first aspect.
  • the present application provides a computer-readable storage medium storing instructions, wherein the instructions instruct a computing device or a computing device cluster to execute the database object verification method described in the first aspect or any one of the implementations of the first aspect.
  • the present application provides a computer program product comprising instructions, which, when executed on a computing device or a computing device cluster, enables the computing device or the computing device cluster to execute the database object verification method described in the first aspect or any one of the implementations of the first aspect.
  • FIG. 1 is a schematic diagram of a process of performing database migration using a database migration tool provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a process of performing data synchronization using a database migration tool provided in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the architecture of a database migration system provided in an embodiment of the present application.
  • FIG. 4 is a flowchart of a method for verifying a database object provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a data export provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of the structure of a computing device provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the structure of another computing device cluster provided in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the structure of another computing device cluster provided in an embodiment of the present application.
  • The terms “first” and “second” in the embodiments of the present application are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, features defined as “first” and “second” may explicitly or implicitly include one or more of those features.
  • a database is a collection of data that is stored together in a certain way, can be shared by multiple users, has as little redundancy as possible, and is independent of the application program (referred to as application).
  • a database can include multiple tablespaces.
  • the database management system (DBMS) is used to control and implement operations such as adding (i.e. inserting), deleting, querying or modifying (i.e. updating) data in the database.
  • data and DBMS are collectively referred to as a database system, usually referred to as a database.
  • Database migration refers to the configured or implemented process of transferring data from a source database to a target database, and the data may be converted during the transfer.
  • the transferred data in the source database can be converted into data that is compatible with the target database during the transfer.
  • Database migration usually includes migration of database objects, verification of objects after migration, data migration, data comparison, and application migration.
  • Database objects are the components of a database.
  • Database objects include but are not limited to tables, default values, indexes, stored procedures, triggers, functions, users, or views.
  • A table is a special data structure in a database that is used to store data objects and the relationships between objects. It consists of rows and columns. A default value is the value assigned, when a column is created or data is inserted into a table, to a column or data item that has not been given a specific value.
  • An index is a structure that sorts the values of one or more columns in a database table. Using an index, you can quickly access specific information in a database table.
  • a stored procedure is a set of SQL statements that complete a specific function. It is generally used for report statistics, data migration, etc.
  • a trigger is a special type of stored procedure that is triggered by a specified event. It is generally used for data auditing, data backup, etc.
  • a function is an encapsulation of some business logic to complete a specific function. After the function is executed, the execution result can be returned.
  • a user is usually someone who has access to a database.
  • a view is a virtual table derived from one or more basic tables that can be used to control user access to data.
  • a view also has a set of data items and named fields that appear when a user performs a query, but do not actually exist in the database. By controlling user access rights to data, the data is simplified and only the data items that the user needs are displayed.
  • Database migration can usually be implemented by a database migration tool.
  • FIG. 1 shows a flowchart of a database migration tool performing database migration.
  • the source database is connected to upper-layer applications, specifically application 1 to application n.
  • The database migration tool can extract objects from the source database (also called the source) to obtain database objects, and then parse and convert the database objects.
  • The SQL syntax of the source database and that of the target database (also called the target) may differ. Therefore, during migration, the SQL statements of the source database need to be reconstructed into equivalent SQL for the target database and then executed in the target database.
  • the database migration tool also synchronizes the table data of the source database to the target database. Similarly, after the data migration, the database migration tool also compares the data of the source database and the target database to see if they are consistent. Furthermore, the customer's application will use a large number of SQL statements to access the database, and these SQL statements also need to be converted into SQL that matches the target database. This process first requires the ability to discover the SQL statements in the application, then convert these SQL statements, and finally rewrite the application system code.
  • the object verification process after migration is relatively complicated, especially the verification of programming objects such as stored procedures and functions, which takes a lot of time. These objects usually have complex business logic and rely on other database objects (such as tables, sequences, views, etc.) and data. If you want to fully verify them and ensure that the functions and performance are consistent with the source database, then data verification is an indispensable part.
  • Data verification first synchronizes the table data that the database object to be verified (such as a stored procedure or function) depends on from the source database to the target database, creating an environment consistent with the source database. The database object is then run in the target database to obtain its return value and running time, which are compared with the return value and running time obtained on the source database to determine whether the migrated object remains consistent with the source database in function and performance.
  • the data synchronization process of database migration tools is generally divided into three stages: data export, data conversion, and data import.
  • data verification can only be carried out after all three stages are successfully executed.
  • Exporting data from the source database and then writing it to the target database through the database interface is inefficient. When the amount of data is large, synchronization may take several hours or longer.
  • the data preparation stage takes too long, which greatly affects the efficiency of object verification.
  • the present application provides a method for verifying a database object.
  • a source database exports table data on which at least one database object depends to a file
  • a target database reads the file, obtains table data on which the target database object depends in the at least one database object, and then verifies the target database object based on the table data on which the target database object depends.
  • This method shortens the data synchronization process.
  • the table data can be exported from the source database into files in advance.
  • Object verification can be initiated at any time; there is no need to wait for table data to be written to the target database, which improves the verification efficiency of database objects.
  • this method has no direct dependence on the system architecture and can be applied to various system architectures. For example, this method can be applied to scenarios where the target database is PostgreSQL or GaussDB, with high availability.
  • the system includes a source database 10 and a target database 20.
  • the database objects of the source database 10 can be synchronized to the target database 20.
  • the source database 10 is also used to export the stored table data, such as the table data on which at least one database object depends, to a file.
  • the target database is used to read the file to obtain the table data on which the target database object depends in the at least one database object.
  • The target database can read the file through a foreign table access plug-in to obtain the table data on which the target database object in the at least one database object depends, without first writing that data into the target database's own storage. The target database can then verify the target database object based on that table data.
  • the three stages of data synchronization are shortened to one stage, retaining only data export.
  • the exported data is stored in files of a specific data format instead of being directly written to the target database. This shortens the data synchronization time, and data can be exported from the source database to files in advance. Object verification can be initiated at any time without waiting for data to be written to the target database.
  • the target database is a database that supports the development of extension plug-ins, including but not limited to PostgreSQL and GaussDB.
  • Developing a foreign table access plug-in for the target database allows the target database to directly access external files in the form of tables through the foreign table access plug-in, that is, the plug-in performs data conversion when reading the data in the file to obtain table data for database object verification.
  • The table data used to verify database objects can be reused, so that data is exported once and reused any number of times. This solves a problem that arises in some extreme test scenarios, where the initial data must be identical for every verification. With a traditional data migration tool, once the synchronized table data has been modified (for example, by another object verification or manually), the data becomes unusable, and the entire data synchronization process, including the re-import, must be repeated, which makes the verification process very long.
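The "export once, reuse any number of times" idea can be sketched as follows. This is only an illustration, not the patent's implementation: the immutable `EXPORTED` tuple stands in for the read-only exported file, and `load_fresh_table`, `verify_object`, `zero_out`, and `total` are hypothetical names introduced here.

```python
# Immutable stand-in for the read-only exported file (assumption: two columns).
EXPORTED = (("id", "balance"), ((1, 100), (2, 250)))


def load_fresh_table(exported):
    """Rebuild a mutable table from the exported data on every call."""
    columns, rows = exported
    return [dict(zip(columns, row)) for row in rows]


def verify_object(db_object, exported):
    """Run one database object against its own fresh copy of the table."""
    table = load_fresh_table(exported)
    return db_object(table)


def zero_out(table):
    """Hypothetical object that modifies the table data."""
    for row in table:
        row["balance"] = 0
    return sum(row["balance"] for row in table)


def total(table):
    """Hypothetical object that only reads the table data."""
    return sum(row["balance"] for row in table)


first = verify_object(zero_out, EXPORTED)   # modifies only its own copy
second = verify_object(total, EXPORTED)     # still sees the original data
```

Because every verification rebuilds its table from the unchanged export, a modification made by one object does not affect the initial data seen by the next.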
  • FIG3 introduces the architecture of the database migration system of the present application.
  • a method for verifying a database object in a database migration process provided in an embodiment of the present application is introduced in conjunction with the accompanying drawings.
  • the method includes the following steps:
  • the source database exports table data on which at least one database object depends to a file.
  • Database objects are components of a database, including but not limited to stored procedures and functions.
  • the operation of database objects such as stored procedures and functions depends on table data.
  • a database object can depend on at least one table data, and different database objects can depend on different table data, for example, database object A depends on table data 1, and database object B depends on table data 2.
  • the data that different database objects depend on may have an intersection, for example, database object A and database object B also depend on table data 3, that is, database object A depends on table data 1 and table data 3, and database object B depends on table data 2 and table data 3.
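The dependency example above can be sketched in a few lines. The mapping and the helper `tables_to_export` are hypothetical and only mirror the objects A and B from the text; the point is that shared table data (table data 3) needs to be exported only once.

```python
# Hypothetical dependency map mirroring the example in the text.
dependencies = {
    "object_A": {"table_1", "table_3"},
    "object_B": {"table_2", "table_3"},
}


def tables_to_export(deps, objects):
    """Return the tables the given objects depend on, each listed once."""
    needed = set()
    for name in objects:
        needed |= deps[name]
    return sorted(needed)


export_list = tables_to_export(dependencies, ["object_A", "object_B"])
print(export_list)  # ['table_1', 'table_2', 'table_3']
```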
  • the source database can export at least one table data that the database object depends on to a file.
  • the file is an external file independent of the target database.
  • The source database can export data using a set of standardized logic, so that the exported data is stored in a unified format.
  • The data can be stored directly as a binary byte stream.
  • The data of each table can be exported as one file.
  • the file includes a file header and data.
  • the file header stores metadata of the table data, such as column names, and the above metadata can be used to characterize the structure of the table data (also referred to as table structure).
  • The file header can also store metadata of the file, such as one or more of the export time, the last modification time, and the creator.
  • the data part may include the value of each column.
  • each row of the data part can store the length of each column and the corresponding value. Considering that the row length may be inconsistent, each row of the data part can also store the row length. That is, each row can store the row length, column 1 length, column 1 value, column 2 length, column 2 value...column N length, column N value. Wherein, N is a positive integer.
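The layout described above (a header with column names, then rows stored as row length followed by column length and column value pairs) might look like the following minimal sketch. The concrete encoding is an assumption made for illustration: 4-byte little-endian lengths, a column count at the start of the header, and every value serialized as UTF-8 text. The function names `encode_table` and `decode_table` are hypothetical.

```python
import struct


def encode_table(column_names, rows):
    """Serialize a table as header (column names) plus length-prefixed rows."""
    out = bytearray()
    # Header (assumed layout): column count, then (name length, name) pairs.
    out += struct.pack("<I", len(column_names))
    for name in column_names:
        raw = name.encode("utf-8")
        out += struct.pack("<I", len(raw)) + raw
    # Data part: row length, then (column length, column value) pairs.
    for row in rows:
        body = bytearray()
        for value in row:
            raw = str(value).encode("utf-8")
            body += struct.pack("<I", len(raw)) + raw
        out += struct.pack("<I", len(body)) + body
    return bytes(out)


def decode_table(blob):
    """Parse the byte stream back into (column names, rows of text values)."""
    offset = 0

    def read_u32():
        nonlocal offset
        (number,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        return number

    def read_bytes(count):
        nonlocal offset
        chunk = blob[offset:offset + count]
        offset += count
        return chunk

    column_count = read_u32()
    columns = [read_bytes(read_u32()).decode("utf-8") for _ in range(column_count)]
    rows = []
    while offset < len(blob):
        row_length = read_u32()
        row_end = offset + row_length
        row = []
        while offset < row_end:
            row.append(read_bytes(read_u32()).decode("utf-8"))
        rows.append(row)
    return columns, rows


blob = encode_table(["id", "name"], [[1, "ann"], [2, "bob"]])
columns, rows = decode_table(blob)
```

Storing the row length lets a reader skip a whole row without parsing its columns, which is one plausible reason for including it when row lengths vary.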
  • The target database can identify the file header and the data part in the file, and then read the data part. Specifically, the target database can read the data part of the file through the foreign table access plug-in.
  • the target database constructs a foreign table according to the data in the file and the structure of the table data on which the target database object in the at least one database object depends.
  • the target database object may be one or more database objects in the database objects synchronized from the source database to the target database.
  • The target database may construct a foreign table based on the data in the file and the structure of the table data on which the target database object depends.
  • The target database can obtain the column names of the table data, initialize the table according to those column names, and, for each row in the data part of the file, assign each column value to the entry for that column in the corresponding row of the initialized table, thereby constructing the foreign table.
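The construction step described above can be illustrated with an in-memory stand-in. This is not how a database engine materializes a foreign table; `build_table` is a hypothetical helper that only shows the mapping from column names and row values to table entries.

```python
def build_table(column_names, data_rows):
    """Initialize a table from column names and fill it row by row."""
    table = []
    for values in data_rows:
        if len(values) != len(column_names):
            raise ValueError("row does not match the table structure")
        # Assign each column value to the entry for its column in this row.
        table.append(dict(zip(column_names, values)))
    return table


table = build_table(["id", "name"], [["1", "ann"], ["2", "bob"]])
```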
  • When constructing the foreign table, the target database can call a foreign table access plug-in according to the data in the file, and construct the foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  • Taking PostgreSQL and GaussDB as example target databases, each can read the data in the file through its own foreign table access plug-in and reconstruct the data in the file into a table-like structure.
  • the above S406 is a specific implementation method for the target database to convert the format of the data in the file to obtain the table data on which the target database object depends in at least one database object.
  • the target database may also obtain the table data on which the target database object depends by other methods.
  • the above S404 and S406 are a specific implementation method for the target database to read the file and obtain the table data on which the target database object depends in at least one database object.
  • the target database can respond to the verification request for the target database object by reading the file and obtaining the table data on which the target database object depends in at least one database object for verifying the target database object. This can reduce resource waste and improve the efficiency of database object verification.
  • The target database runs the target database object, performs one or more of insert, delete, query, update, join, truncate, row-level lock, or transaction operations on the foreign table, and obtains a first return value and a first running time.
  • The target database object can define code for insert, delete, query, update, join, truncate, row-level lock, or transaction operations.
  • When the target database runs the target database object, it runs the code for these operations to perform one or more of them on the foreign table.
  • the target database may count the time of running the target database object in the target database to obtain a first running time.
  • the first running time may be determined according to a timestamp of the start of the running and a timestamp of the end of the running. For example, the first running time may be equal to the difference between the timestamp of the end of the running and the timestamp of the start of the running.
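The timing rule above (running time equals the end timestamp minus the start timestamp) can be sketched as follows. `run_timed` is a hypothetical helper, and a plain Python callable stands in for the database object; a real system would time the execution inside the database.

```python
import time


def run_timed(db_object, *args):
    """Run a callable and return (return value, running time in seconds)."""
    start = time.perf_counter()   # timestamp at the start of the run
    result = db_object(*args)
    end = time.perf_counter()     # timestamp at the end of the run
    return result, end - start


first_return, first_time = run_timed(sum, [1, 2, 3])
```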
  • some operations executed by the target database also include a return value, and the target database may also obtain a first return value.
  • S410 The target database verifies the target database object according to the first return value, the first running time, and the second return value and the second running time obtained by the source database running the target database object.
  • The source database can run the target database object to perform operations on the corresponding table data, such as insert, delete, query, update, join, truncate, row-level lock, or transaction operations.
  • the source database can count the time of running the above target database object in the source database, that is, the second running time. Further, the source database can obtain the return value obtained by running the above target database object in the source database, that is, the second return value.
  • the target database can compare the first return value and the second return value, and compare the first run time and the second run time, so as to verify the target database object. For example, if the first return value is equal to the second return value, it indicates that after the migration of the target database object, the target database and the source database maintain functional consistency. If the difference between the first run time and the second run time is within a preset range, it indicates that after the migration of the target database object, the target database and the source database maintain performance consistency. If the first return value and the second return value are equal, and the difference between the first run time and the second run time is within a preset range, it indicates that the target database object has passed the verification.
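The comparison rule above can be sketched in a few lines. The names `RunResult`, `verify_object` and `max_time_diff` are hypothetical, and treating the "preset range" as an absolute difference in seconds is an assumption of this sketch; the source does not say how the range is defined.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class RunResult:
    return_value: Any
    running_time: float  # seconds


def verify_object(target: RunResult, source: RunResult, max_time_diff: float) -> Dict[str, bool]:
    """Compare the target-side run (first values) with the source-side run (second values)."""
    # Functional consistency: the first return value equals the second return value.
    functional = target.return_value == source.return_value
    # Performance consistency: the running-time difference is within the preset range
    # (assumed here to be an absolute bound in seconds).
    performance = abs(target.running_time - source.running_time) <= max_time_diff
    return {
        "functional": functional,
        "performance": performance,
        "passed": functional and performance,
    }


outcome = verify_object(RunResult(42, 1.10), RunResult(42, 1.00), max_time_diff=0.5)
```

Returning the two checks separately, rather than a single boolean, makes it possible to tell a functional regression from a performance regression.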
  • the above S408 and S410 are a specific implementation method for the target database to verify the target database object based on the table data on which the target database object depends (table data obtained by format conversion of the data in the read file, such as a foreign table).
  • the target database may also verify the target database object in other ways, and the embodiment of the present application does not limit this.
  • based on the above method embodiments, the present application also provides a database migration system. As shown in FIG. 3, the system includes:
  • a source database 10, used to export table data on which at least one database object depends to a file;
  • a target database 20, used to read the file, obtain the table data on which the target database object in the at least one database object depends, and verify the target database object according to the table data on which the target database object depends.
  • the source database 10 and the target database 20 may be implemented by hardware or by software.
  • for ease of description, the target database 20 is used as an example below.
  • when implemented by software, the target database 20 may be an application running on a computing device, such as PostgreSQL or GaussDB.
  • the application may be provided to users in the form of virtualized services.
  • Virtualized services may include virtual machine (VM) services, bare metal server (BMS) services, and container services.
  • VM services may be services that use virtualization technology to virtualize virtual machine (VM) resource pools on multiple physical hosts (such as computing devices) to provide users with VMs on demand for use.
  • BMS services are services that virtualize BMS resource pools on multiple physical hosts to provide users with BMS on demand for use.
  • Container services are services that virtualize container resource pools on multiple physical hosts to provide users with containers on demand for use.
  • VM is a simulated virtual computer, that is, a logical computer.
  • BMS is an elastically scalable, high-performance computing service whose computing performance is no different from that of a traditional physical machine and which offers secure physical isolation.
  • Containers are a kernel virtualization technology that can provide lightweight virtualization to achieve the purpose of isolating user space, processes, and resources. It should be understood that the VM service, BMS service and container service in the above-mentioned virtualization services are merely specific examples. In actual applications, the virtualization service may also be other lightweight or heavyweight virtualization services, which are not specifically limited here.
  • when implemented by hardware, the target database 20 may include at least one computing device, such as a server.
  • alternatively, the target database 20 may be a device implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
  • in some possible implementations, the file includes a file header and data, and the target database 20 is specifically used to: read the data in the file; and perform format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends.
  • in some possible implementations, when performing format conversion on the data in the file to obtain the table data on which the target database object depends, the target database 20 is specifically used to: construct a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends. Correspondingly, when verifying the target database object according to the table data on which it depends, the target database 20 is specifically used to: verify the target database object according to the foreign table.
  • in some possible implementations, the target database 20 is specifically used to: run the target database object, perform one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtain a first return value and a first running time; and verify the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
  • in some possible implementations, the target database 20 is specifically used to: call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  • in some possible implementations, the data in the file is stored in a binary byte stream format.
  • in some possible implementations, the target database 20 is specifically used to: in response to a verification request for the target database object, read the file to obtain the table data on which the target database object in the at least one database object depends.
  • the present application also provides a computing device 600.
  • the computing device 600 includes: a bus 602, a processor 604, a memory 606, and a communication interface 608.
  • the processor 604, the memory 606, and the communication interface 608 communicate with each other through the bus 602.
  • the computing device 600 can be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 600.
  • the bus 602 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus.
  • the bus may be divided into an address bus, a data bus, a control bus, etc.
  • for ease of illustration, FIG. 6 shows the bus as a single line, but this does not mean that there is only one bus or one type of bus.
  • the bus 602 may include pathways for transferring information between the components of the computing device 600 (e.g., the memory 606, the processor 604, and the communication interface 608).
  • Processor 604 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • the memory 606 may include a volatile memory, such as a random access memory (RAM).
  • the memory 606 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid state drive (SSD).
  • the memory 606 stores executable program code
  • the processor 604 executes the executable program code to implement the aforementioned database object verification method.
  • the memory 606 stores instructions for the database migration system to execute the database object verification method.
  • the communication interface 608 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 600 and other devices or communication networks.
  • the embodiment of the present application also provides a computing device cluster.
  • the computing device cluster includes at least one computing device.
  • the computing device can be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smart phone.
  • as shown in FIG. 7, the computing device cluster includes at least one computing device 600.
  • the memory 606 in one or more computing devices 600 in the computing device cluster may store the same instructions for executing the database object verification method by the database migration system.
  • one or more computing devices 600 in the computing device cluster may also be used to execute some instructions of the database migration system for executing the verification method of the database object.
  • a combination of one or more computing devices 600 may jointly execute instructions of the database migration system for executing the verification method of the database object.
  • the memory 606 in different computing devices 600 in the computing device cluster may store different instructions for executing partial functions of the database migration system.
  • FIG. 8 shows a possible implementation.
  • as shown in FIG. 8, two computing devices 600A and 600B are connected via a communication interface 608.
  • the memory in computing device 600A stores instructions for executing the functions of source database 10.
  • the memory in computing device 600B stores instructions for executing the functions of target database 20.
  • the memories 606 of computing devices 600A and 600B jointly store instructions for the database migration system to execute the verification method of database objects.
  • the connection between the computing devices in the cluster shown in FIG. 8 may reflect the fact that the database object verification method provided in this application requires format conversion; the functions implemented by the target database 20 are therefore handed over to the computing device 600B for execution.
  • it should be understood that the functions of the computing device 600A shown in FIG. 8 may also be completed by multiple computing devices 600.
  • the functions of the computing device 600B may also be completed by multiple computing devices 600.
  • one or more computing devices in the computing device cluster can be connected via a network.
  • the network can be a wide area network or a local area network, etc.
  • FIG. 9 shows a possible implementation. As shown in FIG. 9, two computing devices 600C and 600D are connected via a network. Specifically, each computing device connects to the network through its communication interface.
  • the memory 606 in the computing device 600C stores instructions for executing the functions of the source database 10. At the same time, the memory 606 in the computing device 600D stores instructions for executing the functions of the target database 20.
  • the connection between the computing devices in the cluster shown in FIG. 9 may reflect the fact that the database object verification method provided in this application requires format conversion; the functions implemented by the target database 20 are therefore handed over to the computing device 600D for execution.
  • it should be understood that the functions of the computing device 600C shown in FIG. 9 may also be completed by multiple computing devices 600.
  • the functions of the computing device 600D may also be completed by multiple computing devices 600.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium can be any available medium on which a computing device can store data, or a data storage device, such as a data center, that contains one or more available media.
  • the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive).
  • the computer-readable storage medium includes instructions that instruct the computing device to execute the above database object verification method applied to the database migration system.
  • the present application also provides a computer program product including instructions.
  • the computer program product may be software or a program product that includes instructions and can run on a computing device or be stored in any available medium.
  • when the computer program product runs on at least one computing device, it causes the at least one computing device to execute the above database object verification method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

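The present application provides a database object verification method, including: a source database exports table data on which at least one database object depends to a file; a target database reads the file to obtain the table data on which a target database object in the at least one database object depends; and the target database then verifies the target database object according to the table data on which the target database object depends. The method shortens the data synchronization process: table data can be exported from the source database to a file in advance, and object verification can be initiated at any time without waiting for the table data to be written to the target database, which improves the efficiency of database object verification. Moreover, the method has no direct dependency on the system architecture, can be applied to various system architectures, and has high availability.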

Description

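Database object verification method and related device
This application claims priority to Chinese patent application No. 202211476960.6, entitled "Database object verification method and related device", filed with the China National Intellectual Property Administration on November 23, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of databases, and in particular to a data processing method, system, computing device cluster, computer-readable storage medium, and computer program product.
Background
With the rapid development of the database field, more and more database vendors have released a wide variety of database products. Different customers can select a database that meets their needs according to their own strategies and technical requirements. For example, when a service is upgraded, a customer may choose a new database to replace the original one, which requires database migration.
Database migration includes the migration of database objects (such as stored procedures/functions) and post-migration object verification. Object verification may include verification with data. Verification with data first synchronizes the table data on which the database object to be verified depends from the source database to the target database, creating an environment whose conditions are consistent with those of the source database in all respects. The database object is then run in the target database to obtain a return value and a running time, which are compared with the return value and running time from the source database to determine whether the migrated database object remains consistent with the source database in both function and performance.
The data synchronization process is generally divided into three stages: data export, data conversion, and data import. Only after all three stages have succeeded can object verification with data proceed. However, exporting data from the source database and then writing it to the target database through a database interface is inefficient; especially when the data volume is large, synchronization may take several hours or more. The data preparation stage takes too long, which greatly reduces the efficiency of object verification.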
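Summary
The present application provides a database object verification method. The method exports table data from the source database to a file in advance, so that object verification can be initiated at any time without waiting for the table data to be written to the target database, which improves the efficiency of database object verification. The present application also provides a database migration system, a computing device cluster, a computer-readable storage medium, and a computer program product corresponding to the method.
In a first aspect, the present application provides a database object verification method. The method may be executed by a database migration system. The database migration system includes a source database and a target database. The source database exports table data on which at least one database object depends to a file; the target database reads the file to obtain the table data on which a target database object in the at least one database object depends; and the target database then verifies the target database object according to the table data on which the target database object depends.
The method shortens the data synchronization process: table data can be exported from the source database to a file in advance, and object verification can be initiated at any time without waiting for the table data to be written to the target database, which improves the efficiency of database object verification.
In some possible implementations, the file includes a file header and data. Correspondingly, the target database can read the data in the file and then perform format conversion on the data in the file, thereby obtaining the table data on which the target database object in the at least one database object depends.
In this method, when verifying the target database object, the target database reads the data in the file corresponding to the target database object and performs format conversion to obtain the table data on which the target database object depends. Compared with backing up by directly synchronizing table data, a file can be backed up quickly, and the data in the file can be reused, so that data exported once can be used any number of times.
In some extreme test scenarios, the initial data of every verification must be identical. If table data is synchronized directly and the data is then modified (for example, by the verification of another object or manually), the data becomes unusable, and the entire data synchronization process must be repeated and the data re-imported, which makes the verification process extremely long. By exporting the data to a file, the method avoids repeating the entire data synchronization process, shortens the verification time, and improves verification efficiency.
In some possible implementations, the target database can construct a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends, thereby reconstructing the table data on which the target database object depends. Correspondingly, when verifying the target database object, the target database can verify the target database object according to the foreign table.
By performing data format conversion after reading the data in the file to obtain a foreign table, the method on the one hand provides the data needed for database object verification and thus enables verification with data, and on the other hand reduces the resource usage of the target database.
In some possible implementations, when verifying the target database object according to the foreign table, the target database can run the target database object, perform one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtain a first return value and a first running time. The target database then verifies the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
Specifically, the target database can compare the first return value with the second return value, and the first running time with the second running time, to verify the target database object. For example, if the first return value is equal to the second return value, the target database object remains functionally consistent between the target database and the source database after migration. If the difference between the first running time and the second running time is within a preset range, the target database object remains consistent in performance between the target database and the source database after migration. If the first return value is equal to the second return value and the difference between the first running time and the second running time is within the preset range, the target database object passes verification.
By constructing a foreign table and running the target database object in the target database to perform the corresponding operations on the foreign table, the method achieves the same effect as performing those operations on an internal table of the target database. This improves verification efficiency on the one hand and verification accuracy on the other.
In some possible implementations, the target database can call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
By constructing the foreign table through a foreign table access plug-in, the method hides differences between database system architectures, has no direct dependency on the system architecture, can be applied to various system architectures, and has high availability.
In some possible implementations, the target database includes one or more of PostgreSQL or GaussDB. PostgreSQL and GaussDB support the development of extension plug-ins, so the target database can access external files as foreign tables through a purpose-built foreign table access plug-in, which supports verifying database objects based on foreign tables.
In some possible implementations, the data in the file is stored in a binary byte stream format. This improves the performance of table data export and reduces the storage space occupied by the data.
In some possible implementations, the target database can, in response to a verification request for the target database object, read the file to obtain the table data on which the target database object in the at least one database object depends. In other words, the method can verify the corresponding database object on demand, which avoids wasting resources and improves resource utilization.
In some possible implementations, the database migration system can be integrated into Database and Application Migration UGO, hereinafter referred to as UGO. UGO can synchronize objects in the source database to a heterogeneous target database, and supports exporting table data on which at least one database object in the source database depends to a file, with the target database reading the file to obtain the table data, so that the synchronized database objects can be verified with data. The method redesigns the data preparation stage of object verification with data, which effectively shortens that stage and thereby improves the efficiency of database object verification. Moreover, the exported file is usually read-only; the target database can read the data part of the file multiple times and perform the corresponding format conversion on the data read, thereby verifying different database objects. The method does not need to repeat the entire data synchronization process, which greatly shortens the synchronization time, improves the efficiency of database object verification, and meets service requirements.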
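In a second aspect, the present application provides a database migration system, the system including:
a source database, used to export table data on which at least one database object depends to a file;
a target database, used to read the file, obtain the table data on which a target database object in the at least one database object depends, and verify the target database object according to the table data on which the target database object depends.
In some possible implementations, the file includes a file header and data, and the target database is specifically used to:
read the data in the file;
perform format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends.
In some possible implementations, when performing format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends, the target database is specifically used to:
construct a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends;
when verifying the target database object according to the table data on which the target database object depends, the target database is specifically used to:
verify the target database object according to the foreign table.
In some possible implementations, the target database is specifically used to:
run the target database object, perform one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtain a first return value and a first running time;
verify the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
In some possible implementations, the target database is specifically used to:
call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
In some possible implementations, the data in the file is stored in a binary byte stream format.
In some possible implementations, the target database is specifically used to:
in response to a verification request for the target database object, read the file to obtain the table data on which the target database object in the at least one database object depends.
In a third aspect, the present application provides a computing device cluster. The computing device cluster includes at least one computing device, and the at least one computing device includes at least one processor and at least one memory. The at least one processor and the at least one memory communicate with each other. The at least one processor is used to execute instructions stored in the at least one memory, so that the computing device or the computing device cluster executes the database object verification method according to the first aspect or any implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing instructions that instruct a computing device or a computing device cluster to execute the database object verification method according to the first aspect or any implementation of the first aspect.
In a fifth aspect, the present application provides a computer program product including instructions which, when run on a computing device or a computing device cluster, cause the computing device or the computing device cluster to execute the database object verification method according to the first aspect or any implementation of the first aspect.
On the basis of the implementations provided in the above aspects, the present application may further combine them to provide more implementations.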
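Brief Description of the Drawings
To describe the technical methods of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below.
FIG. 1 is a schematic flowchart of database migration performed by a database migration tool according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of data synchronization performed by a database migration tool according to an embodiment of the present application;
FIG. 3 is a schematic architecture diagram of a database migration system according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a database object verification method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of data export according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a computing device according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a computing device cluster according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another computing device cluster according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of yet another computing device cluster according to an embodiment of the present application.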
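Detailed Description
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features.
Some technical terms involved in the embodiments of the present application are introduced first.
A database is a collection of data that is stored together in a certain way, can be shared by multiple users, has as little redundancy as possible, and is independent of application programs (applications for short). A database may include multiple tablespaces. A database is usually controlled by a database management system (DBMS), which implements operations such as adding (inserting), deleting, querying or modifying (updating) data in the database. In practice, the data and the DBMS together are called a database system, usually referred to simply as a database.
Database migration is the configured or implemented process of transferring data from a source database to a target database, during which the data may be converted. For example, when the source database and the target database are heterogeneous databases, the data transferred from the source database can be converted during the transfer into data adapted to the target database.
Database migration usually includes the migration of database objects, post-migration object verification, data migration, data comparison, and application migration. Database objects are the components of a database. They include, but are not limited to, tables, default values, indexes, stored procedures, triggers, functions, users, and views.
A table is a special data structure in a database, used to store data objects and the relationships between them, and consists of rows and columns. A default value is a preset value assigned to a column or column data item for which no specific value is given when a column is created in a table or data is inserted. An index is a structure that sorts the values of one or more columns in a database table; using an index allows fast access to specific information in the table.
A stored procedure is a set of SQL statements for completing a specific function, generally used for report statistics, data migration, and the like. A trigger is a special type of stored procedure whose execution is triggered by a specified event, generally used for data auditing, data backup, and the like. A function encapsulates some business logic to complete a specific function, and can return an execution result after it finishes executing.
A user is usually a person who has permission to access the database. A view is a virtual table derived from one or more base tables and can be used to control users' access to data. A view also has a set of data items and named fields, which appear when a user performs a query but do not actually exist in the database. By controlling users' access permissions, a view simplifies the data and shows only the data items the user needs.
Database migration can usually be implemented by a database migration tool. For ease of understanding, the present application provides a schematic flowchart of database migration performed by a database migration tool.
As shown in FIG. 1, the source database is connected to upper-layer applications, specifically application 1 to application n. When a service upgrade or another reason requires data to be migrated from the source database to the target database, the database migration tool can extract database objects from the source database (also called the source), then parse and convert them. For example, the SQL syntax of the source database and the target database (also called the target) differs, so during migration the SQL statements of the source must be syntactically restructured into equivalent SQL for the target and executed on the target. After database objects are migrated, they usually need to be verified to ensure that the database objects of the source database and the target database are consistent in function, performance, and behavior.
During migration, the database migration tool also synchronizes the table data of the source database to the target database. Similarly, after data migration, the database migration tool checks whether the data of the source database and the target database is consistent. Further, the customer's applications use a large number of SQL statements to access the database, and these SQL statements also need to be converted into SQL that matches the target database. This process first needs to discover the SQL statements present in the applications, then convert them, and finally rewrite the application code.
Post-migration object verification is relatively complex; in particular, verifying programmable objects such as stored procedures and functions takes a great deal of time. These objects usually have complex business logic and depend on other database objects (such as tables, sequences, views, and so on) and on data. To verify them thoroughly and ensure that their function and performance are consistent with the source database, verification with data is indispensable.
Verification with data first synchronizes the table data on which the database object to be verified (such as a stored procedure/function) depends from the source database to the target database, creating an environment whose conditions are consistent with those of the source database in all respects. The database object is then run in the target database to obtain a return value and a running time, which are compared with the return value and running time from the source database to determine whether the migrated object remains consistent with the source database in function and performance.
At present, the data synchronization process of a database migration tool is generally divided into three stages: data export, data conversion, and data import. As shown in FIG. 2, verification with data can proceed only after all three stages have succeeded. However, exporting data from the source database and then writing it to the target database through a database interface is inefficient; especially when the data volume is large, synchronization may take several hours or more. The data preparation stage takes too long, which greatly reduces the efficiency of object verification.
In view of this, the present application provides a database object verification method. In this method, the source database exports table data on which at least one database object depends to a file; the target database reads the file to obtain the table data on which a target database object in the at least one database object depends, and then verifies the target database object according to the table data on which the target database object depends.
The method shortens the data synchronization process: table data can be exported from the source database to a file in advance, and object verification can be initiated at any time without waiting for the table data to be written to the target database, which improves the efficiency of database object verification. Moreover, the method has no direct dependency on the system architecture and can be applied to various system architectures; for example, it can be applied in scenarios where the target database is PostgreSQL or GaussDB, and has high availability.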
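To make the technical solution of the present application clearer and easier to understand, the system architecture of the present application is introduced below with reference to the drawings.
Referring to the schematic architecture diagram of the database migration system shown in FIG. 3, the system includes a source database 10 and a target database 20. The database objects of the source database 10 can be synchronized to the target database 20. The source database 10 is also used to export stored table data, such as the table data on which at least one database object depends, to a file. The target database is used to read the file and obtain the table data on which the target database object in the at least one database object depends. For example, the target database can read the file through a foreign table access plug-in to obtain that table data without going through database storage, and can then verify the target database object according to the table data on which the target database object depends.
The system shortens the three stages of data synchronization to one: only data export is retained, and the exported data is stored in a file with a specific data format instead of being written directly to the target database. This shortens the data synchronization time; data can be exported from the source database to a file in advance, and object verification can be initiated at any time without waiting for the data to be written to the target database.
Further, the target database is a database that supports the development of extension plug-ins, including but not limited to PostgreSQL and GaussDB. Developing a foreign table access plug-in for the target database allows the target database to access external files directly as tables through the plug-in; that is, the plug-in converts the data as it reads it from the file to obtain the table data used for database object verification. The table data used to verify database objects can thus be reused, so that data exported once can be used any number of times. This solves the following problem in some extreme test scenarios: to ensure that the initial data of every verification is identical, once a traditional data migration tool has synchronized the table data, any modification of that data (for example, by the verification of another object or manually) makes it unusable, so the entire data synchronization process must be repeated and the data re-imported, which makes the verification process extremely long.
FIG. 3 introduced the architecture of the database migration system of the present application. Next, a database object verification method in the database migration process provided by the embodiments of the present application is introduced with reference to the drawings.
Referring to the interaction flowchart of the database object verification method shown in FIG. 4, the method includes the following steps:
S402: The source database exports table data on which at least one database object depends to a file.
Database objects are the components of a database and include, but are not limited to, stored procedures and functions. The running of database objects such as stored procedures and functions depends on table data. One database object may depend on at least one piece of table data, and different database objects may depend on different table data; for example, database object A depends on table data 1 and database object B depends on table data 2. In some embodiments, the data on which different database objects depend may overlap; for example, database object A and database object B both also depend on table data 3, that is, database object A depends on table data 1 and table data 3, and database object B depends on table data 2 and table data 3.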
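The dependency example above (object A depends on table data 1 and 3, object B on table data 2 and 3) can be sketched as a simple mapping. The names `dependencies` and `tables_to_export` are hypothetical; the sketch only illustrates that a shared table needs to be exported once even though several objects depend on it.

```python
from typing import Dict, Set

# Hypothetical dependency map, taken from the example in the text.
dependencies: Dict[str, Set[str]] = {
    "object_A": {"table_1", "table_3"},
    "object_B": {"table_2", "table_3"},
}


def tables_to_export(deps: Dict[str, Set[str]]) -> Set[str]:
    """Union of every table that any object depends on; shared tables appear once."""
    result: Set[str] = set()
    for tables in deps.values():
        result |= tables
    return result


export_set = tables_to_export(dependencies)
# Tables that both objects depend on (the intersection of the dependencies).
shared = dependencies["object_A"] & dependencies["object_B"]
```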
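In a database migration scenario, when database objects are synchronized from the source database to the target database, the source database can export the table data on which at least one database object depends to a file. This file is an external file independent of the target database. The source database can follow a standardized export logic so that the exported data is stored in a uniform format. Considering export performance and the storage space occupied by the data, the data can be stored directly as a binary byte stream.
Referring to the schematic diagram of data export shown in FIG. 5, each piece of table data (each table) can be exported as one file. The file includes a file header and data. The file header stores the metadata of the table data, for example column names; this metadata can be used to characterize the structure of the table data (also simply called the table structure). The file header may also store metadata of the file, for example one or more of the export time, the most recent modification time, and the creator. The data part may include the values of the columns. To make lookup easier, each row of the data part may store the length of each column and the corresponding value. Considering that row lengths may differ, each row of the data part may also store the row length. That is, each row may store: row length, column 1 length, column 1 value, column 2 length, column 2 value, ..., column N length, column N value, where N is a positive integer.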
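A minimal sketch of such a length-prefixed binary layout follows. The source specifies only the ordering (a header with column names, then rows of row length and per-column length/value pairs); the 4-byte big-endian length fields, the UTF-8 text values, the column-count field and the function names are all assumptions of this sketch. File-level metadata such as the export time is omitted.

```python
import struct
from typing import Sequence


def _length_prefixed(text: str) -> bytes:
    """Encode a value as: 4-byte big-endian length, then the UTF-8 bytes (assumed encoding)."""
    raw = text.encode("utf-8")
    return struct.pack(">I", len(raw)) + raw


def encode_header(column_names: Sequence[str]) -> bytes:
    """File header: column count followed by each column name (the table structure)."""
    return struct.pack(">I", len(column_names)) + b"".join(
        _length_prefixed(name) for name in column_names
    )


def encode_row(values: Sequence[str]) -> bytes:
    """One data row: row length, then column 1 length, column 1 value, ..., column N value."""
    body = b"".join(_length_prefixed(value) for value in values)
    return struct.pack(">I", len(body)) + body


def export_table(column_names: Sequence[str], rows: Sequence[Sequence[str]]) -> bytes:
    """Serialize one table into a single byte stream: header followed by the data part."""
    return encode_header(column_names) + b"".join(encode_row(row) for row in rows)


blob = export_table(["id", "name"], [["1", "alice"], ["2", "bob"]])
```

Storing the row length ahead of the columns lets a reader skip a whole row without decoding each column, which is one plausible reading of why the source keeps both lengths.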
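S404: The target database reads the data in the file.
The target database can identify the file header and the data part in the file, and then read the data part of the file. Specifically, the target database can read the data part of the file through the foreign table access plug-in.
S406: The target database constructs a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends.
The target database object may be one or more of the database objects synchronized from the source database to the target database. To verify the target database object, the target database can construct a foreign table from the data in the file, according to the structure of the table data on which the target database object depends.
Specifically, the target database can obtain the column names of the table data, initialize the table data according to the column names, and, for each row of data in the data part of the file, assign the value of each column to the corresponding entry in the matching row of the initialized table data, thereby constructing the foreign table.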
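In some possible implementations, when constructing the foreign table, the target database can call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends. Taking PostgreSQL and GaussDB as examples of the target database, PostgreSQL and GaussDB can each read the data in the file through their own foreign table access plug-in and reconstruct that data as a foreign table in tabular form.
It should be noted that S406 above is one specific way for the target database to perform format conversion on the data in the file and obtain the table data on which the target database object in the at least one database object depends. In other possible implementations of the embodiments of the present application, the target database may also obtain the table data on which the target database object depends in other ways.
Further, S404 and S406 above are one specific way for the target database to read the file and obtain the table data on which the target database object in the at least one database object depends. The target database can, in response to a verification request for the target database object, read the file and obtain that table data for use in verifying the target database object. This reduces wasted resources and improves the efficiency of database object verification.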
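On-demand reading can be sketched as follows: nothing is read until a verification request arrives, and each request re-reads the unchanged exported file, so every verification starts from the same initial data. `OnDemandVerifier`, its `reads` counter, and the object and file names are hypothetical; the `check` callback stands in for the actual verification logic.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


class OnDemandVerifier:
    """Reads an exported file only when a verification request for its object arrives."""

    def __init__(self, files: Dict[str, Path]) -> None:
        self._files = files   # object name -> exported file (hypothetical mapping)
        self.reads = 0        # number of file reads performed so far

    def handle_request(self, object_name: str, check: Callable[[bytes], bool]) -> bool:
        data = self._files[object_name].read_bytes()  # read now, not at start-up
        self.reads += 1
        return check(data)    # stand-in for running and verifying the database object


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "orders.bin"
    path.write_bytes(b"\x00\x01")                     # placeholder for exported table data
    verifier = OnDemandVerifier({"proc_orders": path})
    reads_before = verifier.reads
    ok = verifier.handle_request("proc_orders", lambda data: len(data) == 2)
    reads_after = verifier.reads
```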
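S408: The target database runs the target database object, performs one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtains a first return value and a first running time.
The target database object can define the code for insert, delete, query, update, join, truncate, row-level lock or transaction operations. When the target database runs the target database object, it runs the code for these operations and thereby performs one or more of the insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table.
The target database can measure the time taken to run the target database object in the target database to obtain the first running time. The first running time can be determined from the timestamp at the start of the run and the timestamp at the end of the run; for example, the first running time may be equal to the difference between the end timestamp and the start timestamp. Further, some operations executed by the target database also produce a return value, so the target database can also obtain the first return value.
S410: The target database verifies the target database object according to the first return value, the first running time, and the second return value and second running time obtained by the source database running the target database object.
Similarly to the target database, the source database can run the target database object to perform operations on the corresponding table data, such as insert, delete, query, update, join, truncate, row-level lock or transaction operations. The source database can measure the time taken to run the target database object in the source database, that is, the second running time. Further, the source database can obtain the return value produced by running the target database object in the source database, that is, the second return value.
The target database can compare the first return value with the second return value, and the first running time with the second running time, to verify the target database object. For example, if the first return value is equal to the second return value, the target database object remains functionally consistent between the target database and the source database after migration. If the difference between the first running time and the second running time is within a preset range, the target database object remains consistent in performance between the target database and the source database after migration. If the first return value is equal to the second return value and the difference between the first running time and the second running time is within the preset range, the target database object passes verification.
S408 and S410 above are one specific way for the target database to verify the target database object according to the table data on which it depends (the table data obtained by format conversion of the data read from the file, for example a foreign table). In other possible implementations of the embodiments of the present application, the target database may also verify the target database object in other ways, which is not limited by the embodiments of the present application.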
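Based on the above method embodiments, the present application also provides a database migration system. As shown in FIG. 3, the system includes:
a source database 10, used to export table data on which at least one database object depends to a file;
a target database 20, used to read the file, obtain the table data on which a target database object in the at least one database object depends, and verify the target database object according to the table data on which the target database object depends.
For example, the source database 10 and the target database 20 may be implemented by hardware or by software. For ease of description, the target database 20 is used as an example below.
When implemented by software, the target database 20 may be an application running on a computing device, such as PostgreSQL or GaussDB. The application may be provided to users as a virtualized service. Virtualized services may include virtual machine (VM) services, bare metal server (BMS) services, and container services. A VM service may be a service that uses virtualization technology to create a VM resource pool across multiple physical hosts (such as computing devices) and provides VMs to users on demand. A BMS service is a service that creates a BMS resource pool across multiple physical hosts and provides BMSs to users on demand. A container service is a service that creates a container resource pool across multiple physical hosts and provides containers to users on demand. A VM is a simulated virtual computer, that is, a logical computer. A BMS is an elastically scalable, high-performance computing service whose computing performance is no different from that of a traditional physical machine and which offers secure physical isolation. A container is a kernel virtualization technology that provides lightweight virtualization to isolate user space, processes, and resources. It should be understood that the VM service, BMS service, and container service are merely specific examples of virtualized services; in practice, the virtualized service may also be another lightweight or heavyweight virtualized service, which is not specifically limited here.
When implemented by hardware, the target database 20 may include at least one computing device, such as a server. Alternatively, the target database 20 may be a device implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
In some possible implementations, the file includes a file header and data, and the target database 20 is specifically used to:
read the data in the file;
perform format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends.
In some possible implementations, when performing format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends, the target database 20 is specifically used to:
construct a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends;
when verifying the target database object according to the table data on which the target database object depends, the target database 20 is specifically used to:
verify the target database object according to the foreign table.
In some possible implementations, the target database 20 is specifically used to:
run the target database object, perform one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtain a first return value and a first running time;
verify the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
In some possible implementations, the target database 20 is specifically used to:
call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
In some possible implementations, the data in the file is stored in a binary byte stream format.
In some possible implementations, the target database 20 is specifically used to:
in response to a verification request for the target database object, read the file to obtain the table data on which the target database object in the at least one database object depends.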
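The present application also provides a computing device 600. As shown in FIG. 6, the computing device 600 includes a bus 602, a processor 604, a memory 606, and a communication interface 608. The processor 604, the memory 606, and the communication interface 608 communicate with each other through the bus 602. The computing device 600 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 600.
The bus 602 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, FIG. 6 shows the bus as a single line, but this does not mean that there is only one bus or one type of bus. The bus 602 may include pathways for transferring information between the components of the computing device 600 (for example, the memory 606, the processor 604, and the communication interface 608).
The processor 604 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), or other processors.
The memory 606 may include a volatile memory, such as a random access memory (RAM). The memory 606 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD). The memory 606 stores executable program code, and the processor 604 executes the executable program code to implement the aforementioned database object verification method. Specifically, the memory 606 stores the instructions that the database migration system uses to execute the database object verification method.
The communication interface 608 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 600 and other devices or communication networks.
The embodiments of the present application also provide a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
As shown in FIG. 7, the computing device cluster includes at least one computing device 600. The memories 606 in one or more computing devices 600 in the computing device cluster may store the same instructions that the database migration system uses to execute the database object verification method.
In some possible implementations, one or more computing devices 600 in the computing device cluster may also be used to execute part of the instructions that the database migration system uses to execute the database object verification method. In other words, a combination of one or more computing devices 600 may jointly execute the instructions that the database migration system uses to execute the database object verification method.
It should be noted that the memories 606 in different computing devices 600 in the computing device cluster may store different instructions for executing part of the functions of the database migration system.
FIG. 8 shows a possible implementation. As shown in FIG. 8, two computing devices 600A and 600B are connected through the communication interface 608. The memory in the computing device 600A stores instructions for executing the functions of the source database 10. The memory in the computing device 600B stores instructions for executing the functions of the target database 20. In other words, the memories 606 of the computing devices 600A and 600B jointly store the instructions that the database migration system uses to execute the database object verification method.
The connection between the computing devices in the cluster shown in FIG. 8 may reflect the fact that the database object verification method provided in the present application requires format conversion; the functions implemented by the target database 20 are therefore handed over to the computing device 600B for execution.
It should be understood that the functions of the computing device 600A shown in FIG. 8 may also be completed by multiple computing devices 600. Likewise, the functions of the computing device 600B may also be completed by multiple computing devices 600.
In some possible implementations, one or more computing devices in the computing device cluster may be connected through a network. The network may be a wide area network, a local area network, or the like. FIG. 9 shows a possible implementation. As shown in FIG. 9, two computing devices 600C and 600D are connected through a network. Specifically, each computing device connects to the network through its communication interface. In this type of implementation, the memory 606 in the computing device 600C stores instructions for executing the functions of the source database 10, and the memory 606 in the computing device 600D stores instructions for executing the functions of the target database 20.
The connection between the computing devices in the cluster shown in FIG. 9 may reflect the fact that the database object verification method provided in the present application requires format conversion; the functions implemented by the target database 20 are therefore handed over to the computing device 600D for execution.
It should be understood that the functions of the computing device 600C shown in FIG. 9 may also be completed by multiple computing devices 600. Likewise, the functions of the computing device 600D may also be completed by multiple computing devices 600.
The embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium may be any available medium on which a computing device can store data, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state drive), or the like. The computer-readable storage medium includes instructions that instruct a computing device to execute the above database object verification method applied to the database migration system.
The embodiments of the present application also provide a computer program product including instructions. The computer program product may be software or a program product that includes instructions and can run on a computing device or be stored in any available medium. When the computer program product runs on at least one computing device, it causes the at least one computing device to execute the above database object verification method.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present invention.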

Claims (17)

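  1. A database object verification method, characterized in that the method includes:
    a source database exporting table data on which at least one database object depends to a file;
    a target database reading the file to obtain the table data on which a target database object in the at least one database object depends;
    the target database verifying the target database object according to the table data on which the target database object depends.
  2. The method according to claim 1, characterized in that the file includes a file header and data, and the target database reading the file to obtain the table data on which the target database object in the at least one database object depends includes:
    the target database reading the data in the file;
    the target database performing format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends.
  3. The method according to claim 2, characterized in that the target database performing format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends includes:
    the target database constructing a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends;
    the target database verifying the target database object according to the table data on which the target database object depends includes:
    the target database verifying the target database object according to the foreign table.
  4. The method according to claim 3, characterized in that the target database verifying the target database object according to the foreign table includes:
    the target database running the target database object, performing one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtaining a first return value and a first running time;
    the target database verifying the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
  5. The method according to claim 3, characterized in that the target database constructing a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends, includes:
    the target database calling a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  6. The method according to any one of claims 1 to 5, characterized in that the data in the file is stored in a binary byte stream format.
  7. The method according to any one of claims 1 to 6, characterized in that the target database reading the file to obtain the table data on which the target database object in the at least one database object depends includes:
    the target database, in response to a verification request for the target database object, reading the file to obtain the table data on which the target database object in the at least one database object depends.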
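  8. A database migration system, characterized in that the system includes:
    a source database, used to export table data on which at least one database object depends to a file;
    a target database, used to read the file, obtain the table data on which a target database object in the at least one database object depends, and verify the target database object according to the table data on which the target database object depends.
  9. The system according to claim 8, characterized in that the file includes a file header and data, and the target database is specifically used to:
    read the data in the file;
    perform format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends.
  10. The system according to claim 9, characterized in that, when performing format conversion on the data in the file to obtain the table data on which the target database object in the at least one database object depends, the target database is specifically used to:
    construct a foreign table from the data in the file, according to the structure of the table data on which the target database object in the at least one database object depends;
    when verifying the target database object according to the table data on which the target database object depends, the target database is specifically used to:
    verify the target database object according to the foreign table.
  11. The system according to claim 10, characterized in that the target database is specifically used to:
    run the target database object, perform one or more of insert, delete, query, update, join, truncate, row-level lock or transaction operations on the foreign table, and obtain a first return value and a first running time;
    verify the target database object according to the first return value, the first running time, and a second return value and a second running time obtained by the source database running the target database object.
  12. The system according to claim 10, characterized in that the target database is specifically used to:
    call a foreign table access plug-in to construct, from the data in the file, a foreign table according to the structure of the table data on which the target database object in the at least one database object depends.
  13. The system according to any one of claims 8 to 12, characterized in that the data in the file is stored in a binary byte stream format.
  14. The system according to any one of claims 1 to 6, characterized in that the target database is specifically used to:
    in response to a verification request for the target database object, read the file to obtain the table data on which the target database object in the at least one database object depends.
  15. A computing device cluster, characterized in that the computing device cluster includes at least one computing device, the at least one computing device includes at least one processor and at least one memory, and the at least one memory stores computer-readable instructions; the at least one processor executes the computer-readable instructions so that the computing device cluster executes the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium, characterized by including computer-readable instructions; the computer-readable instructions are used to implement the method according to any one of claims 1 to 7.
  17. A computer program product, characterized by including computer-readable instructions; the computer-readable instructions are used to implement the method according to any one of claims 1 to 7.
PCT/CN2023/100966 2022-11-23 2023-06-19 Database object verification method and related device WO2024108994A1 (zh)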

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211476960.6 2022-11-23
CN202211476960.6A CN118113679A (zh) 2022-11-23 2022-11-23 Database object verification method and related device

Publications (1)

Publication Number Publication Date
WO2024108994A1 true WO2024108994A1 (zh) 2024-05-30

Family

ID=91195088

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/100966 WO2024108994A1 (zh) 2022-11-23 2023-06-19 Database object verification method and related device

Country Status (2)

Country Link
CN (1) CN118113679A (zh)
WO (1) WO2024108994A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504122A (zh) * 2014-12-29 2015-04-08 浪潮(北京)电子信息产业有限公司 Verification method and system for database migration data
CN107958057A (zh) * 2017-11-29 2018-04-24 苏宁云商集团股份有限公司 Code generation method and device for data migration in heterogeneous databases
US20190079929A1 (en) * 2017-09-12 2019-03-14 Facebook, Inc. Migrating across database deployments
CN113392090A (zh) * 2021-06-29 2021-09-14 未鲲(上海)科技服务有限公司 Data verification method, apparatus, device and medium based on database migration

Also Published As

Publication number Publication date
CN118113679A (zh) 2024-05-31
