CN115129717A - Data writing method, device and computer readable medium for realizing automatic partition - Google Patents

Data writing method, device and computer readable medium for realizing automatic partition Download PDF

Info

Publication number
CN115129717A
CN115129717A CN202210736747.8A CN202210736747A CN115129717A CN 115129717 A CN115129717 A CN 115129717A CN 202210736747 A CN202210736747 A CN 202210736747A CN 115129717 A CN115129717 A CN 115129717A
Authority
CN
China
Prior art keywords
partition
data
sub
target data
updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210736747.8A
Other languages
Chinese (zh)
Inventor
裘春荣
李昕
曾亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Kunyao Network Technology Co ltd
Original Assignee
Shanghai Kunyao Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Kunyao Network Technology Co ltd filed Critical Shanghai Kunyao Network Technology Co ltd
Priority to CN202210736747.8A priority Critical patent/CN115129717A/en
Publication of CN115129717A publication Critical patent/CN115129717A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/252Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The scheme can create a declarative partition main table, a data insertion rule, a data updating rule, an insertion function and an updating function in advance, bind the insertion function with the data insertion rule, bind the updating function with the data updating rule, call the insertion function based on the data insertion rule when a data insertion statement is detected, complete the insertion of target data by executing the insertion function to replace the data insertion statement, call the updating function based on the data updating rule when the data updating statement is detected, and complete the updating of the target data by executing the updating function to replace the data updating statement. Thus, while maintaining the characteristic of the database statement type partition table, the flow of data writing can be changed by using the rule, the execution sequence limitation of the partition table trigger is avoided, and the data writing can be smoothly completed even under the condition that the partition sublist is not created in advance.

Description

Data writing method, device and computer readable medium for realizing automatic partition
Technical Field
The present application relates to the field of information technology, and in particular, to a bilingual corpus detection method, device, and computer-readable medium.
Background
Modern databases all support partitioned tables instead of single tables to store ever-increasing amounts of data. Such a large amount of data that can be partitioned generally follows a certain business rule, for example, a data set is continuously expanded according to time, a self-increment sequence, and the like. If the mass data is stored in the single table, the effective index of the mass data can be continuously increased along with the continuous expansion of the data, so that the index originally created for improving the retrieval efficiency is continuously increased along with the increase of the mass data, and the actual query and use efficiency is reduced. Modern databases, particularly the PostgreSQL database, provide a way to specify how a table is partitioned into segments called partitions. The partitioned table is referred to as a partition table. The data is stored in a table partition mode, and multi-table range parallel query of a partition table can be carried out according to query conditions; the insertion efficiency is accelerated through table partitioning, the single table index volume is reduced, and the data retrieval efficiency is improved.
PostgreSQL declarative partitioning semantically implements many table partitioning functions with many advantageous features, such as partition auto-indexing, if an index is created on a partition table, PostgreSQL automatically creates an index with the same attributes for each partition. Therefore, the maintenance cost of the partition table is greatly reduced; for example, intelligent partition connection, for two partition tables with the same partition mode, if equivalent connection is performed by using a partition key, direct connection operation is allowed to be performed by using a matched partition.
The declarative partition transaction flow does not allow data to be created before it is inserted into a non-existent partition in the same transaction, and the database engine collects the whole partition table information during the beginning of inserting new data; the declarative partition table has a built-in routing behavior that occurs after a pre-insertion trigger at the statement level. It will INSERT and UPDATE the actual INSERT/UPDATE actions into the partition according to the partition column. At this time, the main partition table does not trigger its row-level trigger, but triggers the row-level trigger in the partition table, and does not trigger the statement-level trigger of the partition table. When a new partition cannot be dynamically created and the automatic routing stage is reached, the insertion of the non-existing partition can cause the insertion failure, and the subsequent action is not executed any more. The automatic creation of a partition cannot be achieved by creating a trigger in the partition.
Currently, there are various schemes for using PostgreSQL database data partitioning, such as creating a batch of partition sub-tables in advance, using non-database self-declared partitions, using third-party plug-ins, and the like. These approaches are problematic due to the existence of the above mechanisms:
one, a scheme of creating a batch of partition sub-tables in advance.
The biggest disadvantage of this approach is that it needs to create new partition sub-tables periodically to adapt to the new data range partitions, the partition sub-tables cannot create partition sub-tables according to the data partitions where the actual database data are located, and all data ranges must be created well in advance and exist closely, so that corresponding data cannot be missed into the partitions.
Secondly, manually creating partition sub-tables is not suitable for intensive partitions, for example, partitioning sub-tables by minutes, and manual operation becomes a possible non-persistent implementation, which brings additional downtime overhead for subsequent sustainable maintenance.
And secondly, a scheme of non-database self-declared partition is used.
The characteristics of PostgreSQL are such that it is not possible to determine whether a new partition needs to be created by a partition table trigger while inserting data within the same transaction context. A set of mechanisms needs to be established separately to satisfy the function of automatic partitioning, so that the addition of the native declarative partitioning property of the database cannot be used directly.
Some additional adaptation work needs to be done to adapt to the dynamic creation of different partition sub-tables, the update of the partition sub-table logic depends on the update of the program, the new data can be continuously filled depending on the length of the maintenance time overhead, and the program is required to realize a series of characteristics in the database partition.
Third, scheme using third-party plug-in
The hidden danger of upgrading exists depending on the third-party plug-in, and the production environment can only depend on a specific version. For example, pg _ pathman is a third party plug-in to the PostgreSQL database. The project is currently in a maintenance phase and the authorities expect to stop subsequent support of the PostgreSQL database.
Therefore, in the prior art, no scheme which can smoothly complete data writing without creating a partition sublist in advance and simultaneously retain the characteristics of a database statement type partition table exists.
Disclosure of Invention
An object of the present application is to provide a data writing method for implementing automatic partitioning, so as to solve the problem that the writing of data cannot be successfully completed without creating a partition sublist in advance in the existing scheme, and meanwhile, the database declarative partition table characteristics are retained.
The embodiment of the application provides a data writing method for realizing automatic partitioning, which comprises the following steps:
creating a declarative type partition main table, a data insertion rule, a data updating rule, an insertion function and an updating function in advance, binding the insertion function with the data insertion rule, and binding the updating function with the data updating rule;
when a data insertion statement is detected, calling an insertion function based on the data insertion rule, and completing the insertion of target data by executing the insertion function to replace the data insertion statement;
and when a data updating statement is detected, calling an updating function based on the data updating rule, and replacing the data updating statement by executing the updating function to complete the updating of the target data.
Further, the inserting of the target data is completed by executing the inserting function instead of the data inserting statement, including:
judging whether a sub-table of a partition corresponding to the target data exists or not;
if yes, inserting the target data into the inquired partition sub-table;
and if the target data does not exist, creating a partition sub-table, and inserting the target data into the created partition sub-table.
Further, when a data update statement is detected, an update function is called based on the data update rule, and the update of the target data is completed by executing the update function to replace the data update statement, including:
judging whether the updated field in the target data is a partition key;
if the updated field is not a partition key, updating the target data in the corresponding partition sub-table;
if the updated field is a partition key, judging whether a partition sublist corresponding to the target data exists or not;
if yes, deleting the target data in an original partition sub-table, and inserting the target data into the inquired partition sub-table;
and if the target data does not exist, deleting the target data in the original partition sub-table, creating a partition sub-table, and inserting the target data into the created partition sub-table.
Further, the determining whether the partition sub-table corresponding to the target data exists includes:
determining the name of the corresponding sub-table of the partition according to the content of the target data and a preset rule;
and inquiring whether the partition sub-table exists in a database through the name of the partition sub-table.
Further, according to the content of the target data, determining the name of the corresponding partition sublist by a preset rule, including:
determining a time range interval of the target data according to a preset time interval according to the timestamp of the target data;
determining a suffix name of the partition sublist and start time and end time of the partition range according to the time range interval;
and generating the name of the sub-table of the partition corresponding to the target data according to the name of the main table of the partition, the suffix name of the sub-table of the partition and the starting time and the ending time of the range of the partition.
Further, querying whether the partition sub-table exists in a database through the name of the partition sub-table includes:
and performing association query in the pg _ entries table and the pg _ class table of the database according to the name of the sub-table of the partition, and judging whether the sub-table of the partition exists according to a query result.
Further, inserting the target data in the created or queried partition sub-table includes:
inquiring data table fields according to the partition main table, and filtering partition keys in the data table fields to generate field templates;
inserting the target data into a corresponding field of the partition sub-table based on a field template;
updating the target data in the corresponding partition sub-table, including:
inquiring data table fields according to the partition main table, and filtering partition keys in the data table fields to generate field templates;
updating the target data in corresponding fields of the partition sub-table based on a field template.
Further, the method further comprises:
extracting the partition key definition of the partition main table through a pg _ get _ partkeydef built-in function, and filtering and reserving the partition key in the partition key definition through the strpos function and the substr function.
The embodiment of the application also provides a data writing device for realizing automatic partitioning, wherein the device comprises a memory for storing computer program instructions and a processor for executing the computer program instructions, and when the computer program instructions are executed by the processor, the device is triggered to execute the data writing method for realizing automatic partitioning.
Still other embodiments of the present application provide a computer-readable medium having stored thereon computer program instructions executable by a processor to implement the data writing method for implementing auto-partitioning.
In the data writing scheme for realizing automatic partitioning, a declarative partition main table, a data insertion rule, a data updating rule, an insertion function and an updating function can be created in advance, the insertion function is bound with the data insertion rule, the updating function is bound with the data updating rule, when a data insertion statement is detected, the insertion function is called based on the data insertion rule, the insertion function is used for replacing the data insertion statement to complete the insertion of target data, when a data updating statement is detected, the updating function is called based on the data updating rule, and the updating function is used for replacing the data updating statement to complete the updating of the target data. Thus, while maintaining the characteristic of the database statement type partition table, the flow of data writing can be changed by using the rule, the execution sequence limitation of the partition table trigger is avoided, and the data writing can be smoothly completed even under the condition that the partition sublist is not created in advance.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a schematic diagram of a conventional data insertion/update flow of a declarative partition table of a PostgreSQL database;
fig. 2 is a schematic step diagram of a configuration process of a database when a data writing method provided in an embodiment of the present application is implemented;
the same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As used herein, the singular forms "a", "an", "the" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
In a typical configuration of the present application, the terminal, the devices serving the network each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media and may implement information storage by any method or technology. The information may be computer program instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
For the declarative partition table of the PostgreSQL database, all inserted rows will be routed into the partition based on the values of the partition keys. Each partition has a subset of data defined by its partition boundaries. The partitioning methods currently supported are ranges, lists, and hashes. I.e. the partition table insertion/update operation, has a built-in routing rule. By definition, in conjunction with the insertion/update operation on a table and the trigger flow, a piece of data, when inserted into a partition table, will go through the steps shown in fig. 1:
step 1, triggering a preposed inserting trigger of a partition main table.
And 2, automatically routing to the partition sub-table meeting the range.
And 3, triggering a preposed inserting trigger of the partition sub-table meeting the range.
Step 4, inserting data
And 5, triggering a postposition inserting trigger of the partition sub-table meeting the range.
And 6, triggering a post-insertion trigger of the partition main table.
The main table of the partition table can only trigger the statement level trigger, the sub-table of the partition table can only trigger the issue level trigger, and the database can automatically perform routing rule processing on the insertion and updating actions.
The whole inserting process is in one transaction, and the whole partition table information is collected during the beginning period of inserting; the declarative partition table has a built-in routing behavior that occurs after a pre-insertion trigger at the statement level. It will direct the actual insert (insert)/update (update) action to the partition sub-table for insert and update based on the partition column. At this time, the partition main table does not trigger its row-level trigger, but triggers the row-level triggers in the partition sub-table, and does not trigger the statement-level triggers of the partition sub-table. Therefore, in this processing procedure, if the inserted or updated data needs to be written into the partition sub-table that has not been created yet, a new partition cannot be dynamically created at the statement level, and the automatic routing stage is reached, and no subsequent action is performed, thereby causing the insertion or update to fail.
Therefore, the embodiment of the application provides a data writing method for realizing automatic partitioning, which can create an declarative partition main table, a data insertion rule, a data updating rule, an insertion function and an updating function in advance, bind the insertion function with the data insertion rule, bind the updating function with the data updating rule, call the insertion function based on the data insertion rule when a data insertion statement is detected, complete the insertion of target data by executing the insertion function to replace the data insertion statement, and call the updating function based on the data updating rule when a data updating statement is detected, and complete the updating of the target data by executing the updating function to replace the data updating statement. Thus, while maintaining the characteristic of the database statement type partition table, the flow of data writing can be changed by using the rule, the execution sequence limitation of the partition table trigger is avoided, and the data writing can be smoothly completed even under the condition that the partition sublist is not created in advance.
In an actual scenario, the execution subject of the method may be a user equipment, a network device, or a device formed by integrating the user equipment and the network device through a network, and may also be a program running in the above device. The user equipment comprises but is not limited to various terminal equipment such as a computer, a mobile phone and a tablet computer; including but not limited to implementations such as a network host, a single network server, multiple sets of network servers, or a cloud-computing-based collection of computers. Here, the Cloud is made up of a large number of hosts or web servers based on Cloud Computing (Cloud Computing), which is a type of distributed Computing, one virtual computer consisting of a collection of loosely coupled computers.
Taking a PostgreSQL database as an example, an declarative partition master table is created, so that the relevant characteristics of PostgreSQL declarative partitions are used in the scheme of the application, and automatic indexing, intelligent partition connection and the like are performed on the partitions. And the data insertion rule and the data updating rule utilize a rule system of PostgreSQL, after the insertion function is bound with the data insertion rule and the updating function is bound with the data updating rule, the execution action of the related function can replace the conventional insertion and updating action in a PostgreSQL database through a query rewriting technology, and automatic table partitioning is realized during data writing.
Upon detection of a data insertion statement (insert), an insertion function may be called based on the data insertion rule, and insertion of target data is completed by executing the insertion function in place of the data insertion statement. The specific process can comprise the following steps:
first, whether a partition sub-table corresponding to target data exists is judged. Wherein, the target data is the data to be inserted at this time. The specific way of the judgment may be to determine the name of the corresponding partition sub-table according to the content of the target data by using a preset rule, then query whether the partition sub-table exists in the database according to the name of the partition sub-table, if the partition sub-table exists, it indicates that the partition sub-table exists, otherwise, if the partition sub-table does not exist, it indicates that the partition sub-table does not exist.
Where the boundary range of a partition is time-dependent, the name of the partition sub-table may be named in relation to a timestamp in the contents of the target data. Therefore, when the name of the corresponding partition sub-table is determined according to the content of the target data and the preset rule, the time range interval of the target data is determined according to the time stamp of the target data and the preset time interval, then the suffix name of the partition sub-table and the start time and the end time of the partition range are determined according to the time range interval, and then the name of the partition sub-table corresponding to the target data is generated according to the name of the partition main table, the suffix name of the partition sub-table and the start time and the end time of the partition range.
The preset time interval may be year, month, week, day, etc., so that the corresponding specific time range interval may be determined according to the timestamp of the target data. Taking week as an example, if the timestamp of the target data is 26/1/2022, the corresponding time range interval can be determined to be 4/2022, and thus the suffix name of the partition sub-table can be determined to be 202204. If the time range included in the 4 th week of 2022 is from 22/1/20200: 00 in 2022 to 28/23: 59 in 2022, the start time of the partition range is determined to be 00:00 in 22/1/2022 and the end time is determined to be 23:59 in 28/1/2022, which are respectively recorded as 202201220000 and 202201282359. After determining the suffix name and the start time and the end time of the partition range, further combining the names of the partition main tables, and thus generating the names of the partition sub tables corresponding to the target data according to the names of the partition main tables, the suffix name of the partition sub table and the start time and the end time of the partition range. For example, if the name of the partition main table is table, the name of the partition sub table into which the target data with the timestamp of 26/1/2022 should be inserted is table.202204(202201220000 and 202201282359).
In an actual scenario, a database partition range type may be created in advance, and the type may include specific data structures such as a suffix name of a partition sub-table, a start time and an end time of a partition range, so that when the name of the partition sub-table is determined, the name of the partition sub-table may be generated in a normalized manner by using the data structure of the database partition range type, so that it may be more convenient and accurate to query whether the partition sub-table exists or not in a subsequent process.
When querying whether the partition sub-table exists in the database through the name of the partition sub-table, performing association query in the pg _ entries table and the pg _ class table of the database according to the name of the partition sub-table, and judging whether the partition sub-table exists according to a query result.
And if so, inserting the target data into the inquired partition sub-table.
And if the target data does not exist, creating a partition sub-table, and inserting the target data into the created partition sub-table. The creation of the partition sub-table can be realized by calling a pre-configured corresponding function, for example, in the embodiment of the present application, a new partition main function is called, and an "IF NOT EXISTS" condition can be added in the process of creating the partition sub-table, so that the repeated creation of the partition sub-table is avoided.
And when a data update statement (update) is detected, an update function may be called based on the data update rule, and the update of the target data may be completed by executing the update function instead of the data update statement. When the updated field is not the partition key, that is, the value of the partition key of the target data is the same as the original value in the partition sub-table, at this time, it is not necessary to execute a delete command and create a new partition sub-table, and it is only necessary to directly update the relevant field in the partition sub-table corresponding to the target data. When the updated field is a partition key, i.e. the value of the partition key of the target data is different from the original value in the partition sub-table, a new partition sub-table may need to be created. Therefore, whether the corresponding partition sub table exists or not needs to be searched, if the corresponding partition sub table does not exist, the partition sub table needs to be created, old data in the original partition sub table is deleted, a new record is inserted into the new partition sub table, and partition movement is achieved.
The specific process can comprise the following steps: first, it is determined whether the updated field in the target data is a partition key. And if the updated field is not the partition key, updating the target data in the corresponding partition sub-table. And if the updated field is a partition key, continuously judging whether a partition sublist corresponding to the target data exists. When judging whether the partition sub-table corresponding to the target data exists, the same manner as that of data insertion can be adopted, namely, the name of the partition sub-table corresponding to the target data is determined according to the content of the target data by a preset rule, and then whether the partition sub-table exists is inquired in a database through the name of the partition sub-table.
If the sub-table of the partition corresponding to the target data exists, it indicates that a new sub-table of the partition does not need to be created, at this time, the target data can be deleted from the original sub-table of the partition, and the target data is inserted into the inquired sub-table of the partition;
if the partition sub-table corresponding to the target data does not exist, it indicates that the update of the target data can be completed only after a new partition sub-table needs to be created, and at this time, the partition sub-table may be created after the target data is deleted from the original partition sub-table, and the target data is inserted into the created partition sub-table.
When the target data is inserted into the created or queried partition sublist, a data table field can be queried according to a partition main table, partition keys in the data table field are filtered, a field template is generated, and then the target data is inserted into a corresponding field of the partition sublist based on the field template. Therefore, the target data with the correct format can be ensured to be inserted, and errors in the insertion process are avoided.
Similarly, when the target data is updated in the corresponding partition sublist, a similar manner may also be adopted, that is, a data table field is queried according to the partition main table, a partition key in the data table field is filtered, a field template is generated, and then the target data is updated in the corresponding field of the partition sublist based on the field template, thereby avoiding an error generated in the updating process.
Before filtering the partition key in the data table field, the partition key definition of the partition main table can be extracted through a pg _ get _ partkeydef built-in function, and the partition key in the partition key definition is filtered and reserved through a strpos function and a substr function, so that the partition key is accurately and quickly acquired.
In an actual scene, the database can be configured correspondingly in advance, so that the automatic creation of the partition sub-table can be realized in the data writing process on the premise of keeping the characteristic of the declarative partition. Fig. 2 is a schematic step diagram of a configuration process, and the following steps may be adopted to complete corresponding configuration:
step one, creating a database partition range type. This type contains the suffix name of the database partition sub-table and the start time and end time of the partition range for subsequent use in generating the partition sub-table name.
And step two, creating a table partition function. The table partitioning function is composed of a series of functions, and the implementation step of the table partitioning function comprises the following sub-steps:
and a first substep, a partition table name convention function.
The partition sub-table name convention function is used for generating names of sub-tables according to rules. The generation rules may be partitioned by day, by week, by month, by year, etc. The function transmits a time stamp of data to be inserted, calculates a time range interval to which the transmitted time stamp belongs according to day, week, month and year intervals, and generates a sub-table suffix name, the start time and the end time of a partition range according to the time range interval.
Substep two, obtain the name function of partition of the partition table
The function performs association query through a pg _ entries table in a system database (pg _ catalog) and a pg _ class table in the system database (pg _ catalog), and judges whether a partition sub-table exists. If no partition sub-table exists, returning a result NULL.
A third substep of obtaining a partition key function of the partition table
Extracting the partition key definition of the partition table through a pg _ get _ partkeydef built-in function, and filtering and reserving a partition column in the partition key definition through a strpos function and a substr function, wherein the partition column is the partition key.
Step four, acquiring a partition table updating column template function
The function queries the fields on the main table through the transmitted main table of the partition table, and filters the partition keys by using the function in the third substep to generate an updated field template.
Step five, creating a new partition main function
This function is used to create a partition sub-table. The function input is the name of the main table of the partition table, and an 'IF NOT EXISTS' condition needs to be added in the process of creating the sub-table, so that repeated creation is avoided.
Substep six, creating an insertion function
Intercepting the inserted target data, calling the substep, inquiring whether a partition sublist corresponding to the timestamp of the inserted data exists, and if not, inserting the data into the created partition sublist.
Seventhly, an update function is created
Intercepting updated target data, calling the substep, and directly updating fields in the partition sublist without executing a deletion command or creating a new and nonexistent partition sublist for the same update of the partition keys. For the data with updated partition keys, whether a new partition sub-table exists is searched, if the new partition sub-table does not exist, creation is needed, old data on the original partition sub-table is deleted, and new data is inserted into the new partition sub-table, so that partition movement is realized.
Step three, a main table of the statement type partition table is created, and any sub table including a default table is not created;
and step four, creating a data insertion rule, binding the insertion function created in the substep six to the data insertion rule, and replacing default insertion processing of the database.
And step four, creating a data updating rule, binding the updating function created in the substep seven to the data updating rule, and replacing the default updating processing of the database.
When the scheme is applied to PostgreSQL data, the creation of a new partition sub-table can be automatically created according to inserted or updated target data as required instead of being created manually, and a continuous partition sub-table does not need to be automatically created until new data is inserted, so that whether the new partition sub-table needs to be created or not is determined. Meanwhile, when data is inserted or updated into the sub-table of the partition, the query condition is not required to be changed, and the data can still be inserted or updated into the main table of the partition. In addition, the characteristics of the PostgreSQL database declarative partition table are retained, and the functions of partition automatic indexing, intelligent partition connection, intelligent partition aggregation, cross-partition movement, partition cutting enhancement and the like can be realized. Automatic routing can also be achieved, and changes to the partition logic can take effect immediately without relying on the database to upgrade itself.
Based on the same inventive concept, the embodiment of the present application further provides a data writing device for implementing automatic partitioning, the corresponding method of the device is the data writing method for implementing automatic partitioning in the foregoing embodiment, and the principle of solving the problem is similar to that of the method. The data migration device for heterogeneous databases provided by the embodiments of the present application includes a memory for storing computer program instructions and a processor for executing the computer program instructions, where the computer program instructions, when executed by the processor, trigger the device to implement the methods and/or technical solutions of the foregoing embodiments of the present application.
In particular, the methods and/or embodiments in the embodiments of the present application may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. The computer program, when executed by a processing unit, performs the above-described functions defined in the method of the present application.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + + or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart or block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer-readable medium carries one or more computer program instructions that are executable by a processor to implement the methods and/or aspects of the embodiments of the present application as described above.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In some embodiments, the software programs of the present application may be executed by a processor to implement the above steps or functions. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order. The numerical sequence of the sequence numbers corresponding to the steps does not indicate any specific execution sequence, and the steps can be executed in any sequence combination on the premise of conforming to the execution logic.

Claims (10)

1. A data writing method for realizing automatic partition is characterized by comprising the following steps:
creating a declarative type partition main table, a data insertion rule, a data updating rule, an insertion function and an updating function in advance, binding the insertion function with the data insertion rule, and binding the updating function with the data updating rule;
when a data insertion statement is detected, calling an insertion function based on the data insertion rule, and completing the insertion of target data by executing the insertion function to replace the data insertion statement;
and when a data updating statement is detected, calling an updating function based on the data updating rule, and replacing the data updating statement by executing the updating function to complete the updating of the target data.
2. The method of claim 1, wherein performing the insertion of the target data by executing the insertion function in place of the data insertion statement comprises:
judging whether a sub-table of a partition corresponding to the target data exists or not;
if yes, inserting the target data into the inquired partition sub-table;
and if the target data does not exist, creating a partition sub-table, and inserting the target data into the created partition sub-table.
3. The method of claim 1, wherein upon detecting a data update statement, invoking an update function based on the data update rule, and completing the update of the target data by executing the update function in place of the data update statement comprises:
judging whether the updated field in the target data is a partition key;
if the updated field is not a partition key, updating the target data in the corresponding partition sub-table;
if the updated field is a partition key, judging whether a partition sublist corresponding to the target data exists or not;
if yes, deleting the target data in an original partition sub-table, and inserting the target data into the inquired partition sub-table;
and if the target data does not exist, deleting the target data in the original partition sub-table, creating a partition sub-table, and inserting the target data into the created partition sub-table.
4. The method according to claim 2 or 3, wherein determining whether the partition sub-table corresponding to the target data exists comprises:
determining the name of the corresponding sub-partition table according to the content of the target data and a preset rule;
and inquiring whether the partition sub-table exists in a database through the name of the partition sub-table.
5. The method of claim 4, wherein determining the name of the partition sub-table corresponding to the target data according to the content of the target data by using a preset rule comprises:
determining a time range interval of the target data according to a preset time interval according to the timestamp of the target data;
determining a suffix name of the partition sub-table and the starting time and the ending time of the partition range according to the time range interval;
and generating the name of the sub-table of the partition corresponding to the target data according to the name of the main table of the partition, the suffix name of the sub-table of the partition and the starting time and the ending time of the range of the partition.
6. The method of claim 4, wherein querying a database for the existence of the partition sub-table by the name of the partition sub-table comprises:
and performing association query in the pg _ entries table and the pg _ class table of the database according to the name of the sub-table of the partition, and judging whether the sub-table of the partition exists according to a query result.
7. The method of claim 2 or 3, wherein inserting the target data in the created or queried partition sub-table comprises:
inquiring data table fields according to the partition main table, and filtering partition keys in the data table fields to generate field templates;
inserting the target data into a corresponding field of the partition sub-table based on a field template;
updating the target data in the corresponding partition sub-table, including:
inquiring data table fields according to the partition main table, and filtering partition keys in the data table fields to generate field templates;
updating the target data in corresponding fields of the partition sub-table based on a field template.
8. The method of claim 6, further comprising:
extracting the partition key definition of the partition main table through a pg _ get _ partkeydef built-in function, and filtering and reserving the partition key in the partition key definition through the strpos function and the substr function.
9. A data writing apparatus implementing automatic partitioning, wherein the apparatus comprises a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform the method of any one of claims 1 to 8.
10. A computer readable medium having stored thereon computer program instructions executable by a processor to implement the method of any one of claims 1 to 8.
CN202210736747.8A 2022-06-27 2022-06-27 Data writing method, device and computer readable medium for realizing automatic partition Pending CN115129717A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210736747.8A CN115129717A (en) 2022-06-27 2022-06-27 Data writing method, device and computer readable medium for realizing automatic partition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210736747.8A CN115129717A (en) 2022-06-27 2022-06-27 Data writing method, device and computer readable medium for realizing automatic partition

Publications (1)

Publication Number Publication Date
CN115129717A true CN115129717A (en) 2022-09-30

Family

ID=83380663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210736747.8A Pending CN115129717A (en) 2022-06-27 2022-06-27 Data writing method, device and computer readable medium for realizing automatic partition

Country Status (1)

Country Link
CN (1) CN115129717A (en)

Similar Documents

Publication Publication Date Title
JP7410181B2 (en) Hybrid indexing methods, systems, and programs
US8938430B2 (en) Intelligent data archiving
US9171027B2 (en) Managing a multi-version database
CN107766374B (en) Optimization method and system for storage and reading of massive small files
CN105205053A (en) Method and system for analyzing database incremental logs
US20210303537A1 (en) Log record identification using aggregated log indexes
US20150006485A1 (en) High Scalability Data Management Techniques for Representing, Editing, and Accessing Data
CN103377210A (en) Method for creating incremental navigation database and method for updating same
CN103217171A (en) Navigation equipment and method for editing map data by user
CN112487083A (en) Data verification method and equipment
US10423580B2 (en) Storage and compression of an aggregation file
US11175993B2 (en) Managing data storage system
CN112000971B (en) File permission recording method, system and related device
CN111176901B (en) HDFS deleted file recovery method, terminal device and storage medium
KR20120082176A (en) Data processing method of database management system and system thereof
CN115658391A (en) Backup recovery method of WAL mechanism based on QianBase MPP database
CN115129717A (en) Data writing method, device and computer readable medium for realizing automatic partition
US11556519B2 (en) Ensuring integrity of records in a not only structured query language database
CN113032408A (en) Data processing method, system and equipment
US20190384825A1 (en) Method and device for data protection and computer readable storage medium
US10521314B2 (en) Cross-referenced irregular field storage in databases
KR101086392B1 (en) An efficient recovery technique for large objects in write ahead logging
CN118035229A (en) Database sub-table method and device, electronic equipment and storage medium
CN111125262B (en) Method and device for processing field information, storage medium and processor
CN116701545A (en) Audit data analysis method, audit data analysis device, storage medium and audit data processor

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination