CN116010335A

CN116010335A - Data processing method and device applied to database and computer storage medium

Info

Publication number: CN116010335A
Application number: CN202211423390.4A
Authority: CN
Inventors: 幸福; 徐锐波; 卢文伟; 刘方
Original assignee: Beijing Yunsizhixue Technology Co ltd
Current assignee: Beijing Yunsizhixue Technology Co ltd
Priority date: 2022-11-15
Filing date: 2022-11-15
Publication date: 2023-04-25

Abstract

The invention discloses a data processing method applied to a database, comprising: while a data compression transaction/snapshot transaction is being executed against the database, the data storage engine switches the working database to a temporary database, which accepts the data update operations that arrive during the execution of the data compression transaction/snapshot transaction; and after the execution of the data compression transaction and/or the snapshot transaction is completed, the working database is switched to a compressed database/original database, wherein the compressed database is created by the data compression transaction and the original database is the target database of the snapshot transaction. To address the data security problems in data compression/snapshots, the invention introduces a temporary database that separates newly written data from existing data, making the operation process safer and more reliable; for the more complex atomic snapshot operation faced by multiple storage engine instances, snapshots are generated faster, lock conflicts are less likely to occur, and data consistency across the storage engines is ensured.

Description

Data processing method and device applied to database and computer storage medium

Technical Field

The invention belongs to the technical field of database storage, and particularly relates to a data processing method and device applied to a database and a computer storage medium.

Background

When a database is built on a B+ tree, its capacity grows continuously as data is written, but does not shrink even after a large number of deletion operations. Data compression was introduced for this reason. Data compression is a data reconstruction process: a new B+ tree database is generated, and all data is scanned and written into the new database, so that only useful data is written and useless data is discarded.

The database is continuously updated while in operation. Consider a scenario in which, while data compression is in progress, an update operation for a certain key is still written to the uncompressed B+ tree file. During compression, however, the original value corresponding to that key may already have been written into the new database. Reading the uncompressed B+ tree file and the new B+ tree file then yields two different values; that is, the data becomes inconsistent during the data compression process. This is the first technical problem to be solved by the present invention.

Turning to snapshots: a database cannot rely on a single instance for reading and writing, so backing up the database is part of the database system. A snapshot is one scheme for backing up a database, and allows a working backup database to be generated quickly.

A snapshot is a full backup process used to create a backup database. The full backup process has two steps. The first step generates a full snapshot based on a certain checkpoint; the full snapshot contains all data before the checkpoint. The second step applies the incremental data written after the checkpoint, which is the part the backup database still lacks after the full snapshot has been applied and which must be replenished to obtain a complete backup database. When snapshot backup is used, it is generally necessary to know the point up to which the backup extends: the data before this point is copied in full and applied, and the data after this point is synchronized by communicating with the database master node. This point is called a checkpoint.

When generating a snapshot, the B+ tree database file needs to be replicated. There are two ways to replicate a file: copying and hard linking. They differ as follows: copying produces two files, and writes to each file are independent; hard linking creates a new file node rather than a new file, so both names refer to the same file and writes to the source file are visible through the new file. Copying has a problem, however: assuming a disk write speed of 500MB/s, copying a 50GB file takes 100 seconds, so generating the snapshot takes at least 100 seconds, and a snapshot generation time of this length is unacceptable to the storage system because data updates are blocked in the meantime. Therefore the file can only be replicated by hard link. A hard link is a reference to a file, so the same file can have several file names, which may be located in different directories. When the file is modified through any one name, the modification is visible through all names hard-linked to it. Replicating a file by hard link only creates a new file node and does not copy the data, so the time it takes is almost negligible.
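As a minimal illustration only, the difference between the two replication modes can be sketched in Python with the standard `shutil.copyfile` and `os.link` calls. The file names and the `demo_copy_vs_hard_link` helper are hypothetical and are not taken from the invention; the sketch assumes a file system that supports hard links.

```python
import os
import shutil
import tempfile

def demo_copy_vs_hard_link():
    """Return what a copy and a hard link each contain after the source is modified."""
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "db.bplus")     # hypothetical database file
    copied = os.path.join(workdir, "db.copy")
    linked = os.path.join(workdir, "db.snapshot")

    with open(source, "w") as f:
        f.write("k=v1;")

    shutil.copyfile(source, copied)  # duplicates the data; cost grows with file size
    os.link(source, linked)          # adds a second name for the same file; no data copied

    # A write to the source file made after both replicas were created.
    with open(source, "a") as f:
        f.write("k=v2;")

    with open(copied) as f:
        copy_content = f.read()
    with open(linked) as f:
        link_content = f.read()
    same_file = os.path.samefile(source, linked)
    shutil.rmtree(workdir)
    return copy_content, link_content, same_file

# Copy-time estimate from the text: 50 GB at 500 MB/s, taking 1 GB = 1000 MB.
copy_seconds = 50 * 1000 / 500
```

The copy keeps the content as of the moment of copying, whereas the hard link reflects the later write, which is the behavior the following paragraph identifies as a problem for snapshots.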

If hard linking is used for replication, writes to the original database data file are also reflected in the new replica file. Transferring the snapshot to the backup database takes some time. If data is still written to the original database file during this period, those writes are visible in the file being transferred, because it was replicated by hard link even though it is nominally a copy. The replica file then contains updates made after the checkpoint rather than only the data as of the checkpoint, which a snapshot does not allow, so the replica file is in fact an erroneous file. This is the second technical problem to be solved by the present invention. In view of this, the present invention is proposed.

Disclosure of Invention

To address the problems set forth above, the invention provides a data processing method and device applied to a database and a computer storage medium, which solve the data security problem under data compression, generate snapshots within seconds, ensure checkpoint-based data consistency of the transferred snapshot, and ensure the consistency of snapshot data across multiple data storage engine instances. Specifically, the following technical scheme is adopted:

A data processing method applied to a database, comprising:

the data storage engine is used for switching the working database into a temporary database in the process of executing the data compression transaction/snapshot transaction aiming at the database, and is used for adapting to the data updating operation during the execution of the data compression transaction/snapshot transaction;

and switching the working database into a compressed database/original database after the execution of the data compression transaction and/or the snapshot transaction is completed, wherein the compressed database is created by the data compression transaction, and the original database is a target database of the snapshot transaction.

As an optional implementation manner of the invention, in the data processing method applied to the database, the single storage engine of the database is internally provided with the swap lock swapLock, and the storage engine performs the swap lock swapLock locking operation before executing the switching operation of the working database.

As an optional implementation manner of the present invention, in the data processing method applied to a database, a task lock is provided in a single storage engine of the database, and the execution flow of the data compression transaction includes:

the data storage engine performs task lock operation to switch the working database into a temporary database;

executing a data compression operation on the current database, and taking the compressed database created by the data compression as the current database after the data compression operation is completed;

and switching the working database to be the current database, and unlocking the task lock.

As an optional embodiment of the present invention, in the data processing method applied to a database of the present invention, the switching the working database to the temporary database includes:

the data storage engine performs a swap lock swapLock locking operation, switches a working database into a temporary database, and unlocks the swap lock swapLock after the switching is completed;

the switching the working database to the current database comprises the following steps:

and the data storage engine performs a swap lock swapLock locking operation, switches the working database into the current database, and unlocks the swap lock swapLock after the switching is completed.
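The data compression flow and the two lock-protected switches described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: Python dictionaries stand in for B+ tree files, a value of `None` marks a deleted key, and the `Engine` class, its method names and the `during` hook are hypothetical.

```python
import threading

class Engine:
    """Toy storage engine; dicts stand in for B+ tree database files (assumption)."""

    def __init__(self, data):
        self.task_lock = threading.Lock()   # taskLock: one compression/snapshot at a time
        self.swap_lock = threading.Lock()   # swapLock: guards the working-database switch
        self.original = dict(data)          # a value of None marks a deleted key (assumption)
        self.db_buffer = {}                 # temporary database dbBuffer
        self.working = self.original

    def _switch_working(self, target):
        with self.swap_lock:                # lock swapLock, switch, unlock
            self.working = target

    def put(self, key, value):
        with self.swap_lock:                # updates go to whichever database is working
            self.working[key] = value

    def compress(self, during=None):
        with self.task_lock:                              # task lock held for the whole transaction
            self._switch_working(self.db_buffer)          # updates now land in dbBuffer
            if during is not None:
                during()                                  # hypothetical hook simulating a concurrent update
            # The original database is only read here; live entries go to the new database.
            compressed = {k: v for k, v in self.original.items() if v is not None}
            self.original = compressed                    # compressed database becomes the current database
            self._switch_working(self.original)
```

For example, `e = Engine({"a": 1, "b": None}); e.compress(during=lambda: e.put("a", 2))` leaves the compressed database as `{"a": 1}` and the update in `e.db_buffer`, from which it would later be written back.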

As an optional implementation manner of the invention, in the data processing method applied to the database, a single data storage engine of the database provides a try-lock operation tryLock, and the execution flow of the snapshot transaction comprises the following steps:

the data storage engine calls tryLock to attempt to lock the task lock;

judging whether the temporary database contains data; if so, releasing the lock acquired through tryLock and returning failure, and if not, performing the task lock locking operation;

calling tryLock to attempt to lock the swap lock swapLock, and switching the working database to the temporary database after the swap lock swapLock is successfully locked;

and executing the snapshot operation on the current database, and unlocking the task lock after the snapshot is completed.
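The snapshot flow of a single storage engine can be sketched as below. The sketch is illustrative only: the `state` dictionary layout, the `take_snapshot` function and the `make_snapshot` callback are hypothetical, and Python's `Lock.acquire(blocking=False)` stands in for tryLock. Switching the working database back to the original database after the snapshot follows the main scheme described earlier.

```python
import threading

def take_snapshot(task_lock, swap_lock, state, make_snapshot):
    """state holds 'original', 'db_buffer' and 'working' (hypothetical layout)."""
    if not task_lock.acquire(blocking=False):      # tryLock on the task lock
        return False
    try:
        if state["db_buffer"]:                     # dbBuffer still holds data: give up
            return False
        if not swap_lock.acquire(blocking=False):  # tryLock on swapLock
            return False
        try:
            state["working"] = state["db_buffer"]  # updates go to dbBuffer from here on
        finally:
            swap_lock.release()
        make_snapshot(state["original"])           # e.g. hard-link the database file
        with swap_lock:
            state["working"] = state["original"]   # switch back once the snapshot is done
        return True
    finally:
        task_lock.release()                        # task lock released on every exit path
```

Every failure path returns immediately instead of waiting, so a snapshot request that cannot proceed does not hold up ordinary reads and writes.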

As an optional implementation manner of the invention, in the data processing method applied to the database, calling tryLock to attempt to lock the swap lock swapLock comprises the following steps:

calling tryLock and acquiring the current state of the swap lock swapLock;

if the swap lock swapLock is currently locked, tryLock fails and returns an error, and if the swap lock swapLock is currently unlocked, the swap lock swapLock is locked;

and after a tryLock attempt fails, performing the attempt again at regular intervals until the target lock is successfully locked or the snapshot operation is cancelled.
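The retry behavior described above can be sketched with Python's standard `threading` module, where a non-blocking `Lock.acquire(blocking=False)` plays the role of tryLock. The function name, the `cancelled` event and the retry interval are assumptions made for illustration.

```python
import threading

def try_lock_with_retry(lock, cancelled, interval=0.01):
    """Retry a non-blocking acquire until it succeeds or `cancelled` is set."""
    while not cancelled.is_set():
        if lock.acquire(blocking=False):   # tryLock: returns immediately either way
            return True
        cancelled.wait(interval)           # pause for the interval; wakes early on cancel
    return False
```

A caller cancels a pending snapshot by setting the event, at which point the function returns `False` without holding the lock.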

As an optional embodiment of the present invention, in the data processing method applied to a database, when a single database includes a plurality of data storage engine instances, a process of generating a snapshot of the database based on all the data storage engine instances includes:

each data storage engine instance respectively and simultaneously executes a snapshot preparation flow;

when all the data storage engine instances successfully execute the snapshot preparation flow, generating a snapshot at a common checkpoint of all the data storage engine instances, and when any one of the data storage engine instances fails to execute the snapshot preparation flow, returning a snapshot failure;

the snapshot preparation flow comprises the following steps: the data storage engine instance calls tryLock to lock the task lock and the swap lock swapLock; if both the task lock and the swap lock swapLock are successfully locked, the snapshot preparation flow succeeds, and otherwise the snapshot preparation flow fails.

As an optional implementation manner of the invention, in the data processing method applied to the database, when the execution of the snapshot preparation flow by a single data storage engine instance fails, a preparation failure return instruction is reported;

The data processing method comprises the following steps:

each data storage engine instance checks, step by step and in sequence, the result of executing the snapshot preparation flow;

when all the data storage engine instances successfully execute the snapshot preparation flow, generating a snapshot at a common checkpoint of all the data storage engine instances;

when a certain data storage engine instance fails to execute the snapshot preparation flow, it reports a preparation failure return instruction; after receiving the preparation failure return instruction, the preceding data storage engine instance unlocks its task lock and swap lock swapLock, changes the state of its snapshot preparation flow to failed, and reports the preparation failure return instruction to the instance preceding it; these steps are executed in sequence until every data storage engine instance whose snapshot preparation flow had succeeded has unlocked its task lock and swap lock swapLock.
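The preparation and rollback sequence across several storage engine instances can be sketched as follows. This is a simplified sketch, not the claimed implementation: the instances are prepared one after another in a single thread rather than simultaneously, and the `EngineInstance` class, the `snapshot_all` function and the `make_snapshot` callback are hypothetical names.

```python
import threading

class EngineInstance:
    def __init__(self, name):
        self.name = name
        self.task_lock = threading.Lock()
        self.swap_lock = threading.Lock()
        self.prepared = False

    def prepare(self):
        """Snapshot preparation: tryLock both locks, or fail holding neither."""
        if not self.task_lock.acquire(blocking=False):
            return False
        if not self.swap_lock.acquire(blocking=False):
            self.task_lock.release()
            return False
        self.prepared = True
        return True

    def unlock(self):
        """Release both locks; used for rollback and after a finished snapshot."""
        if self.prepared:
            self.swap_lock.release()
            self.task_lock.release()
            self.prepared = False

def snapshot_all(engines, make_snapshot):
    done = []
    for engine in engines:
        if engine.prepare():
            done.append(engine)
            continue
        for earlier in reversed(done):   # failure is passed back one instance at a time
            earlier.unlock()
        return False
    make_snapshot(engines)               # all prepared: snapshot at the common checkpoint
    for engine in done:
        engine.unlock()
    return True
```

Either every instance is locked when the snapshot is taken or none remains locked after a failure, which is the all-or-nothing property the rollback is meant to provide.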

As an optional embodiment of the present invention, the data processing method applied to a database of the present invention includes:

after detecting that the request data in the memory cache area has reached a storage threshold, the data storage engine executes a data update operation that writes the request data in the memory cache area to the current working database;

the data storage engine performs a swap lock swapLock locking operation to obtain the current working database;

and writing the request data in the memory cache area into the current working database in batches, and unlocking the swap lock swapLock after the writing is completed.
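The batch update flow can be sketched as below, assuming that the storage threshold is expressed as a number of buffered entries and that a dictionary stands in for the working database. The `BufferedWriter` class and its attribute names are hypothetical.

```python
import threading

class BufferedWriter:
    def __init__(self, working_db, threshold=3):
        self.swap_lock = threading.Lock()
        self.working_db = working_db   # original database or dbBuffer, whichever is current
        self.threshold = threshold     # hypothetical storage threshold, counted in entries
        self.mem_buffer = []           # memory cache area holding pending requests

    def write(self, key, value):
        self.mem_buffer.append((key, value))
        if len(self.mem_buffer) >= self.threshold:
            self.flush()

    def flush(self):
        with self.swap_lock:                    # working database cannot be switched mid-flush
            target = self.working_db            # obtain the current working database
            for key, value in self.mem_buffer:  # batch write, in arrival order
                target[key] = value
            self.mem_buffer.clear()
```

Because the flush holds the swap lock, a whole batch lands in a single database even if a compression or snapshot transaction is waiting to switch the working database.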

As an optional implementation manner of the invention, in the data processing method applied to the database, after the execution of the data compression transaction and/or the snapshot transaction is completed, the request data written by the data updating operation in the temporary database is written back to the current working database.

The invention also provides a data processing device applied to the database, which comprises:

the database switching module is used for switching the working database into a temporary database in the process of executing the data compression transaction/snapshot transaction on the database by the data storage engine and is used for adapting to the data updating operation during the execution of the data compression transaction/snapshot transaction; and when the execution of the data compression transaction and/or the snapshot transaction is completed, the database switching module switches the working database into a compression database/original database, wherein the compression database is created by the data compression transaction, and the original database is a target database of the snapshot transaction.

The invention also provides an electronic device comprising a processor and a memory for storing a computer executable program, which when executed by the processor performs the data processing method applied to a database.

The invention also provides a computer readable storage medium storing a computer executable program, which when executed, implements the data processing method applied to the database.

Compared with the prior art, the invention has the beneficial effects that:

In the data processing method applied to the database disclosed by the invention, while the data storage engine executes a data compression transaction on the database, a temporary database dbBuffer is introduced to accept the data update operations that arrive during the execution of the data compression transaction/snapshot transaction. Thus, while data compression is in progress, an update operation for a given key is written into the temporary database file, and new update operations are applied only to the temporary database. At the same time, a database file for compression is created, and the original database is kept in a read-only state. During data compression, data is read sequentially from the original database and written into the compressed database; once the compression operation is completed, the compressed database takes effect and the original database can be deleted, so the data compression process is not affected by update operations. In addition, read operations during compression read the temporary database first, because the values in the temporary database are always newer than those in the original database.

While the data storage engine executes a snapshot transaction on the database, the snapshot is replicated by hard link, so that the snapshot is generated within seconds. During snapshot generation, the temporary database dbBuffer likewise accepts the data update operations that arrive during the execution of the data compression transaction/snapshot transaction. While the snapshot is being generated, updates are written into the temporary database file and the original database is not updated, so the transferred replica file always matches the original database file as of the checkpoint, and updates made after the checkpoint cannot appear in the replica file, which ensures that the snapshot file is a correct file.

Therefore, the data processing method applied to the database solves the data security problem of snapshot generation/transfer under large data volumes, ensures that snapshot generation completes within seconds, and ensures that data compression is a safe operation in which new update operations cannot cause data anomalies. To address the data security problem common to data compression and snapshots, the method introduces a new temporary database dbBuffer to separate newly written data from existing data, making both the data compression and snapshot operations safer and more reliable.

For the more complex atomic snapshot operation faced by multiple storage engine instances, the data processing method applied to the database uses hard links to accelerate the atomic process and uses locks together with the temporary database dbBuffer, so that snapshots are generated faster, lock conflicts are less likely to occur, and data consistency across the storage engines is ensured; in addition, when the snapshot preparation flow of a storage engine instance fails, the necessary rollback is performed, which ensures that the atomic operation executes safely.

Description of the drawings:

FIG. 1 is a flow chart of a data processing method applied to a database according to an embodiment of the present invention;

FIG. 2 shows a flow and a comparison of updating operation, data compression and snapshot performed in a data processing method applied to a database according to an embodiment of the present invention;

FIG. 3 is a flow chart of a snapshot preparation flow of a single data storage engine in a data processing method applied to a database according to an embodiment of the present invention;

FIG. 4 is a flow chart of a preparation rollback procedure of a single data storage engine in a data processing method applied to a database according to an embodiment of the present invention;

FIG. 5 is a snapshot flow diagram of an example of a multiple data storage engine in a data processing method applied to a database according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the invention.

Thus, the following detailed description of the embodiments of the invention is not intended to limit the scope of the invention, as claimed, but is merely representative of some embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be noted that, under the condition of no conflict, the embodiments of the present invention and the features and technical solutions in the embodiments may be combined with each other.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.

In the description of the present invention, it should be noted that, the terms "upper", "lower", and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or an azimuth or a positional relationship conventionally put in use of the inventive product, or an azimuth or a positional relationship conventionally understood by those skilled in the art, such terms are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or element to be referred must have a specific azimuth, be constructed and operated in a specific azimuth, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like, are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.

Example 1

Referring to fig. 1, the data processing method applied to a database of the present embodiment includes:

In the process of executing data compression transaction/snapshot transaction on the database, the data storage engine switches the working database into a temporary database dbBuffer for accepting data updating operation during the execution of the data compression transaction/snapshot transaction;

In the data processing method applied to the database of this embodiment, while the data storage engine executes a data compression transaction on the database, a temporary database dbBuffer is introduced to accept the data update operations that arrive during the execution of the data compression transaction/snapshot transaction. Thus, while data compression is in progress, an update operation for a given key is written into the temporary database file, and new update operations are applied only to the temporary database. At the same time, a database file for compression is created, and the original database is kept in a read-only state. During data compression, data is read sequentially from the original database and written into the compressed database; once the compression operation is completed, the compressed database takes effect and the original database can be deleted, so the data compression process is not affected by update operations. In addition, read operations during compression read the temporary database first, because the values in the temporary database are always newer than those in the original database.
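The read priority during compression can be illustrated with a short sketch in which dictionaries stand in for the two databases; the `read` function and the `_MISSING` sentinel are hypothetical.

```python
_MISSING = object()   # sentinel, so that a stored None is not mistaken for "absent"

def read(key, db_buffer, original):
    """Look in the temporary database first; fall back to the original database."""
    value = db_buffer.get(key, _MISSING)
    if value is not _MISSING:
        return value              # the dbBuffer entry is newer than the original one
    return original.get(key)      # returns None when neither database has the key
```

For example, `read("k", {"k": "new"}, {"k": "old"})` returns `"new"`, because the update written during compression takes precedence over the value in the read-only original database.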

In the data processing method applied to the database of this embodiment, while the data storage engine executes a snapshot transaction on the database, the snapshot is replicated by hard link, so that the snapshot is generated within seconds. During snapshot generation, the temporary database dbBuffer accepts the data update operations that arrive during the execution of the data compression transaction/snapshot transaction. While the snapshot is being generated, updates are written into the temporary database file and the original database is not updated, so the transferred replica file always matches the original database file as of the checkpoint, and updates made after the checkpoint cannot appear in the replica file, which ensures that the snapshot file is a correct file.

Therefore, the data processing method applied to the database solves the data security problem of snapshot generation/transfer under large data volumes, ensures that snapshot generation completes within seconds, and ensures that data compression is a safe operation in which new update operations cannot cause data anomalies. To address the data security problem common to data compression and snapshots, the method of this embodiment establishes a temporary database dbBuffer to separate newly written data from existing data, making both the data compression and snapshot operations safer and more reliable.

The data processing method applied to the database of this embodiment needs to switch the working database when executing a data compression transaction and/or a snapshot transaction. To ensure the stability and reliability of the data storage engine during database switching, as an optional implementation manner of this embodiment, a single storage engine of the database is provided with a swap lock swapLock, and the storage engine performs the swap lock swapLock locking operation before executing the switching operation of the working database. By introducing the swap lock swapLock, the data storage engine of this embodiment makes the database switching process safe and reliable.

Specifically, when the storage engine applies the swap lock swapLock to the current working database, the swap lock swapLock locks the database switching operation, and no other operation related to database switching is executed. The swap lock swapLock specifically locks the current working database, so that operations such as writing newly updated data and reading existing data are not performed while the database is being switched.

As an optional implementation manner of the present embodiment, in the data processing method applied to the database of the present embodiment, a task lock is provided in a single storage engine of the database, as shown in fig. 2, an execution flow of the data compression transaction in the present embodiment includes:

The data storage engine performs task lock operation to switch the working database into a temporary database dbBuffer;

The single data storage engine of the embodiment adopts the task lock taskLock, the exchange lock swapLock and the temporary database dbBuffer to cooperate, so that the problem of data abnormality caused by data updating operation in the data compression process is solved.

Specifically, in the data compression process, after the task lock taskLock is added to the current working database by the storage engine, the data compression flow is locked, other data processing flows related to the data compression flow are not allowed to be executed, and specifically, the current working database is locked, and other operations except the data compression operation are not allowed to be executed.

Further, in the data processing method applied to a database according to the present embodiment, in a data compression process, switching a working database to a temporary database includes: the data storage engine performs a swap lock swapLock locking operation, switches a working database into a temporary database, and unlocks the swap lock swapLock after the switching is completed;

Switching the working database to the current database comprises: the data storage engine performs a swap lock swapLock locking operation, switches the working database to the current database, and unlocks the swap lock swapLock after the switching is completed.

As an optional implementation manner of the present embodiment, in the data processing method applied to a database of the present embodiment, a single data storage engine of the database has an attempt to acquire a lock tryLock, and as shown in fig. 2, an execution flow of a snapshot transaction in the present embodiment includes:

judging whether the temporary database dbBuffer contains data; if so, releasing the lock acquired through tryLock and returning failure, and if not, performing the task lock locking operation (because the data in the temporary database dbBuffer is temporary data, it must be written into a non-temporary working database before the data is complete);

The data processing method applied to the database in this embodiment generates snapshots by hard link, which makes snapshot generation much faster than copying, and uses the temporary database dbBuffer to buffer and accept write operations, so that the original database file is not modified during the entire snapshot generation and transfer process and the data remains complete and safe. The method also relies on tryLock. The difference between tryLock and lock is that tryLock locks immediately if the lock is free and returns an error if another task already holds it. tryLock therefore reports at once whether the lock is available instead of waiting to acquire it, so snapshot generation obtains a result sooner and reads and writes on the database as a whole are affected as little as possible.

Since the swap lock swapLock locks the entire update operation, which typically lasts on the order of seconds, generating a snapshot could become extremely time-consuming in extreme cases; lock therefore cannot be used in place of tryLock here.

Further, in the data processing method applied to the database of this embodiment, the attempting to acquire the lock tryLock to lock the swap lock swapLock includes:

In the data processing method applied to the database of the embodiment, the data updating operation process is a batch refreshing process, the data storage engine writes the request into the memory buffer area first, and the data storage engine refreshes the memory buffer area into the B+ tree in batches after the memory buffer area is fully written. Therefore, the data processing method applied to the database of the present embodiment includes:

after the request data in the memory buffer area is monitored to reach the storage threshold value, the data storage engine executes a data updating operation of updating the request data in the memory buffer area to a current working database (possibly an original database or a temporary database dbBuffer);

referring to fig. 2, the data storage engine performs a swap lock swapLock locking operation to obtain a current working database;

In the data processing method applied to the database in this embodiment, in the snapshot/data compression flow, a temporary dbBuffer is generated, and after the execution of the data compression transaction and/or the snapshot transaction is completed, the request data written by the data update operation in the temporary database is written back to the current working database. The process is completed asynchronously, and before the write-back is completed, the read-write operation needs to check the data of the temporary dbBuffer.
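The asynchronous write-back can be sketched with a background thread. This is an assumption-laden sketch: the embodiment does not state which lock protects the write-back, so holding the swap lock for its duration is a choice made here for illustration, and the function names are hypothetical.

```python
import threading

def write_back(db_buffer, working_db, swap_lock):
    """Move every dbBuffer entry into the working database, then empty dbBuffer."""
    with swap_lock:                      # assumption: write-back runs under the swap lock
        working_db.update(db_buffer)     # dbBuffer values overwrite older values
        db_buffer.clear()

def start_write_back(db_buffer, working_db, swap_lock):
    worker = threading.Thread(target=write_back,
                              args=(db_buffer, working_db, swap_lock))
    worker.start()                       # asynchronous: the caller does not wait
    return worker
```

Until the worker finishes, reads still need to consult the temporary dbBuffer first, as the paragraph above states.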

Example II

If a single database contains multiple storage engine instances, data compression on the different instances does not conflict. Generating a snapshot across multiple storage engine instances, however, requires coordinating every engine instance and locking all of them so that the snapshot is generated from a common checkpoint. Thus, in the data processing method applied to a database of the present embodiment, when a single database contains a plurality of data storage engine instances, the process of generating a snapshot of the database based on all the data storage engine instances includes:

the snapshot preparation flow comprises the following steps: the data storage engine instance calls tryLock to lock the task lock taskLock and the swap lock swapLock; if both are successfully locked, the snapshot preparation flow succeeds, otherwise it fails. The snapshot preparation flow is shown in fig. 3.

For the more complex atomic snapshot operation that multiple storage engine instances require, the data processing method applied to the database uses hard links to speed up the atomic process and uses the locks together with the temporary database dbBuffer, so that the snapshot is generated more quickly, lock conflicts are less likely to occur, and data consistency across the storage engines is ensured. In addition, storage engine instances are rolled back as required when the snapshot preparation flow fails, which ensures that the atomic operation executes safely.

Specifically, in the data processing method applied to the database of the embodiment, when the execution of the snapshot preparation flow by a single data storage engine instance fails, a preparation failure return instruction is reported;

the data processing method comprises the following steps:

Specifically, when a data storage engine instance receives a preparation failure return instruction, it executes the preparation rollback flow PrepareRollback, as shown in fig. 4, unlocking the task lock taskLock and the swap lock swapLock in sequence.

The snapshot flow for multiple data storage engine instances is shown in fig. 5, taking two data storage engine instances, db1 and db2, as an example. The core point is that after db2 fails its snapshot preparation flow, PrepareRollback of the preceding data storage engine instance db1 is called to release the locks held by db1; otherwise the locks of db1 would never be released and a deadlock would result. The same applies in turn when there are more data storage engine instances, which ensures that the atomic operation executes safely.
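The prepare-then-rollback coordination across instances can be sketched as below. This is an illustration under assumptions rather than the flow of figs. 3 to 5: the names EngineInstance, prepare, prepare_rollback and prepare_all are hypothetical, and releasing the task lock inside prepare when the swap lock cannot be taken is an added assumption so that a failed instance holds nothing.

```python
import threading


class EngineInstance:
    """Hypothetical storage engine instance with a task lock and a swap lock."""

    def __init__(self, name):
        self.name = name
        self.task_lock = threading.Lock()
        self.swap_lock = threading.Lock()

    def prepare(self) -> bool:
        # Snapshot preparation: both locks must be taken without waiting.
        if not self.task_lock.acquire(blocking=False):
            return False
        if not self.swap_lock.acquire(blocking=False):
            self.task_lock.release()  # assumption: do not keep a half-acquired pair
            return False
        return True

    def prepare_rollback(self):
        # PrepareRollback: unlock the task lock and the swap lock in sequence.
        self.task_lock.release()
        self.swap_lock.release()


def prepare_all(instances) -> bool:
    """Prepare every instance; on the first failure, roll back the ones already prepared."""
    prepared = []
    for inst in instances:
        if inst.prepare():
            prepared.append(inst)
        else:
            for done in reversed(prepared):
                done.prepare_rollback()
            return False
    return True


db1, db2 = EngineInstance("db1"), EngineInstance("db2")
db2.swap_lock.acquire()                # simulate db2 being busy with an update
first_try = prepare_all([db1, db2])    # db2 fails, so db1 is rolled back
db1_released = not db1.task_lock.locked() and not db1.swap_lock.locked()
db2.swap_lock.release()
second_try = prepare_all([db1, db2])   # both instances are now locked for the snapshot
```

Without the rollback loop, db1 would stay locked after the first attempt and the second attempt could never succeed, which corresponds to the deadlock described above.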

Example III

The embodiment also provides a data processing device applied to a database, including:

In the data processing device applied to the database of this embodiment, while the data storage engine executes a data compression transaction for the database, the database switching module switches the working database to the temporary database dbBuffer, which accepts data update operations during execution of the data compression transaction/snapshot transaction. Thus, while data compression is in progress, an update to any key is written to the temporary database file, and new update operations are applied only to the temporary database. At the same time, a database file for the compressed data is created, and the original database is kept read-only. During compression, data is read sequentially from the original database and written to the compressed database; once compression is complete, the compressed database takes effect and the original database can be deleted, so the compression process is not affected by update operations. In addition, read operations during compression read the temporary database first, because values in the temporary database are always newer than those in the original database.

In the data processing device applied to the database of this embodiment, while the data storage engine executes a snapshot transaction for the database, the snapshot is taken by hard-link replication, so that a snapshot is generated within seconds. During snapshot generation, the database switching module switches the working database to the temporary database dbBuffer, which accepts data update operations during execution of the data compression transaction/snapshot transaction. While the snapshot is being generated, data updates are written to the temporary database file and the original database is not updated. The transmitted copy therefore always matches the original database file as of the checkpoint, and updates made after the checkpoint cannot appear in the copy, which ensures that the snapshot file is correct.
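A hard-link snapshot of this kind can be sketched with Python's os.link, which creates a second directory entry for the same file data without copying it. The file names and the helper snapshot_by_hard_link are hypothetical, and the example assumes a file system that supports hard links.

```python
import os
import tempfile


def snapshot_by_hard_link(db_path: str, snapshot_path: str) -> None:
    """Create the snapshot as a hard link: no data is copied, so the cost does not grow with file size."""
    os.link(db_path, snapshot_path)


with tempfile.TemporaryDirectory() as tmp:
    db_file = os.path.join(tmp, "original.db")      # hypothetical original database file
    buffer_file = os.path.join(tmp, "dbBuffer.db")  # hypothetical temporary database file
    snap_file = os.path.join(tmp, "snapshot.db")

    with open(db_file, "wb") as f:
        f.write(b"checkpoint-data")

    snapshot_by_hard_link(db_file, snap_file)

    # Updates arriving after the checkpoint go to dbBuffer, never to the original file.
    with open(buffer_file, "ab") as f:
        f.write(b"new-update")

    same_file = os.path.samefile(db_file, snap_file)
    with open(snap_file, "rb") as f:
        snapshot_bytes = f.read()
```

Because a hard link shares its data with the original file, the snapshot stays at the checkpoint only as long as the original is not modified in place, which is why the updates are redirected to the temporary database.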

Therefore, the data processing device applied to the database solves the data security problem of generating and transmitting snapshots of large data volumes, allows snapshot generation to complete within seconds, and makes data compression a safe operation in which new update operations cannot cause data anomalies. To address the data security problem in data compression and snapshots, the data processing method applied to the database in this embodiment establishes a temporary database dbBuffer to separate newly written data from existing data, so that data compression and snapshot operations are safer and more reliable.

The data processing device applied to the database of this embodiment must switch the working database when executing a data compression transaction and/or a snapshot transaction. To keep the data storage engine stable and reliable during that switch, as an optional implementation of this embodiment, an exchange lock module is provided inside each single storage engine of the database, and the exchange lock module locks the swap lock swapLock before the database switching operation. The data storage engine of this embodiment performs the database switch with the help of the swap lock swapLock, which ensures that the switching process is safe and reliable.

Specifically, once the storage engine has locked the swap lock swapLock on the current working database, that database no longer receives data update operations, and newly updated data is written into the temporary database dbBuffer.

As an optional implementation manner of the present embodiment, in the data processing apparatus of the present embodiment, the single storage engine of the database has a task lock module, and the execution flow of the data compression transaction includes:

the task lock module performs task lock locking operation, and the database switching module switches the working database into a temporary database dbBuffer;

the database switching module switches the working database into the current database, and the task lock module unlocks the task lock.

Specifically, during the data compression process, after the task lock taskLock is added to the current working database by the storage engine, the current working database is locked, and operations other than the data compression operation are no longer allowed to be performed.

Further, in the data processing device applied to a database according to the present embodiment, during data compression, the switching of the working database to the temporary database by the database switching module includes: the exchange lock module locks the swap lock swapLock, the database switching module switches the working database to the temporary database, and after the switch is complete the exchange lock module unlocks the swap lock swapLock;

the switching of the working database to the current database by the database switching module includes: the exchange lock module locks the swap lock swapLock, the database switching module switches the working database to the current database, and after the switch is complete the exchange lock module unlocks the swap lock swapLock.
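The compression flow and the two lock-protected switches can be sketched as follows. This is an illustration under assumptions: dictionaries stand in for database files, the class CompactingEngine and the while_compacting hook are hypothetical, and dropping entries whose value is None is only a placeholder for whatever the real compression step does.

```python
import threading


class CompactingEngine:
    """Hypothetical engine: the task lock guards the whole compression, the swap lock guards each switch."""

    def __init__(self, original):
        self.task_lock = threading.Lock()
        self.swap_lock = threading.Lock()
        self.original = dict(original)
        self.db_buffer = {}            # temporary database dbBuffer
        self.working = self.original   # current working database

    def _switch(self, target):
        # Lock swapLock, switch the working database, then unlock.
        with self.swap_lock:
            self.working = target

    def put(self, key, value):
        with self.swap_lock:
            self.working[key] = value

    def compact(self, while_compacting=lambda: None):
        with self.task_lock:                    # task lock locking operation
            self._switch(self.db_buffer)        # working database -> temporary database
            while_compacting()                  # hypothetical hook: lets the example write mid-compression
            # Placeholder compression: read the read-only original, write a compacted copy.
            compacted = {k: v for k, v in self.original.items() if v is not None}
            self.original = compacted
            self._switch(self.original)         # working database -> compressed database


engine = CompactingEngine({"a": 1, "b": None})
engine.compact(while_compacting=lambda: engine.put("c", 3))
```

In this run the write issued during compression lands in db_buffer rather than in the database being compacted, and the working database points at the compacted data once compact returns.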

As an optional implementation of the present embodiment, the data processing device applied to the database of the present embodiment has a tryLock (attempt-to-acquire-lock) module inside each single data storage engine of the database, and the execution flow of the snapshot transaction includes:

the tryLock module calls tryLock to attempt to lock the task lock taskLock;

the tryLock module calls tryLock to attempt to lock the swap lock swapLock, and after the swap lock swapLock is successfully locked, the working database is switched to the temporary database;

The data processing device applied to the database in this embodiment generates snapshots by hard linking, which is much faster than copying the files. At the same time, the temporary database dbBuffer buffers and receives write operations, so the original database file is not modified at any point during snapshot generation and transmission, and the data remains complete and safe. In addition, the locks are taken by calling tryLock. The difference between tryLock and lock is that tryLock locks the lock immediately if it is not held and returns an error if another task already holds it, whereas lock waits until the lock can be acquired. tryLock therefore reports at once whether the lock is available, so snapshot generation obtains a result more quickly and reading and writing of the database as a whole are affected as little as possible.

Further, in the data processing device applied to a database of the present embodiment, the calling of tryLock by the tryLock (attempt-to-acquire-lock) module to lock the swap lock swapLock includes:

the tryLock module calls tryLock to obtain the current state of the swap lock swapLock;

and after a tryLock attempt fails, the tryLock module retries the lock attempt at regular intervals until the lock is successfully acquired or the snapshot operation is cancelled.
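The retry behaviour can be sketched as a polling loop. This is a minimal illustration under assumptions: the function try_lock_with_retry, the cancellation Event and the 10 ms interval are hypothetical and are not values taken from this embodiment.

```python
import threading


def try_lock_with_retry(lock: threading.Lock, cancelled: threading.Event, interval: float = 0.01) -> bool:
    """Poll the lock without blocking on it; stop on success or when the snapshot is cancelled."""
    while not cancelled.is_set():
        if lock.acquire(blocking=False):
            return True
        # Wait one interval, waking early if the snapshot operation is cancelled.
        cancelled.wait(interval)
    return False


swap_lock = threading.Lock()
cancel = threading.Event()

# Simulate an update operation that holds swapLock briefly and then releases it.
swap_lock.acquire()
threading.Timer(0.03, swap_lock.release).start()
acquired = try_lock_with_retry(swap_lock, cancel)

# Once the snapshot operation is cancelled, no further attempt is made.
cancel.set()
after_cancel = try_lock_with_retry(swap_lock, cancel)
swap_lock.release()
```

Each individual attempt still returns immediately, so the caller can observe progress or cancel between attempts instead of being parked inside a blocking lock call.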

In the data processing device applied to the database in this embodiment, the data update operation is a batch flush process: the data storage engine first writes requests into the memory buffer and, after the memory buffer is full, flushes them into the B+ tree in batches.

Therefore, after detecting that the request data in the memory buffer has reached the storage threshold, the data processing device applied to the database of the present embodiment performs a data update operation that writes the request data in the memory buffer to the current working database (which may be the original database or the temporary database dbBuffer);

the exchange lock module locks the swap lock swapLock and obtains the current working database;

and the request data in the memory buffer is written into the current working database in batches, and the exchange lock module unlocks the swap lock swapLock after the writing is completed.
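The three steps above can be sketched as a threshold-triggered flush. This is an illustration under assumptions: the class BufferedWriter is hypothetical, a dictionary stands in for the B+ tree of the working database, and the threshold is counted in requests rather than bytes.

```python
import threading


class BufferedWriter:
    """Hypothetical model of the batch flush: buffer requests, then flush under swapLock."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.mem_buffer = []            # memory buffer of pending (key, value) requests
        self.swap_lock = threading.Lock()
        self.working = {}               # current working database (original or dbBuffer)

    def put(self, key, value):
        self.mem_buffer.append((key, value))
        if len(self.mem_buffer) >= self.threshold:
            self.flush()

    def flush(self):
        # Holding swapLock for the whole batch keeps the working database from being switched mid-flush.
        with self.swap_lock:
            target = self.working       # obtain the current working database
            for key, value in self.mem_buffer:
                target[key] = value
            self.mem_buffer.clear()


writer = BufferedWriter(threshold=2)
writer.put("a", 1)
pending_after_first = list(writer.mem_buffer)  # still buffered, nothing flushed yet
writer.put("b", 2)                             # reaches the threshold and triggers the flush
```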

In the data processing device applied to the database in this embodiment, a temporary database dbBuffer is generated during the snapshot/data compression flow. After the data compression transaction and/or the snapshot transaction has finished executing, the request data written to the temporary database by data update operations is written back to the current working database. This write-back is performed asynchronously, and until it completes, read and write operations must also check the data in the temporary database dbBuffer.

If a single database contains multiple storage engine instances, data compression on the different instances does not conflict. Generating a snapshot across multiple storage engine instances, however, requires coordinating every engine instance and locking all of them so that the snapshot is generated from a common checkpoint. Thus, in the data processing device applied to a database of the present embodiment, when a single database contains a plurality of data storage engine instances, the process of generating a snapshot of the database based on all the data storage engine instances includes:

For the more complex atomic snapshot operation that multiple storage engine instances require, the invention uses hard links to speed up the atomic process and uses the locks together with the temporary database dbBuffer, so that the snapshot is generated more quickly, lock conflicts are less likely to occur, and data consistency across the storage engines is ensured. In addition, storage engine instances are rolled back as required when the snapshot preparation flow fails, which ensures that the atomic operation executes safely.

Specifically, when a single data storage engine instance fails to execute the snapshot preparation flow, the data processing device applied to the database of this embodiment reports a preparation failure return instruction;

the data processing method comprises the following steps:

Example IV

The present embodiment also provides a computer-readable storage medium storing a computer-executable program which, when executed, implements the data processing method applied to a database as described above.

The computer readable storage medium of this embodiment may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable storage medium may also be any readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

The present embodiment also provides an electronic device including a processor and a memory for storing a computer executable program, which when executed by the processor performs the data processing method applied to a database.

The electronic device takes the form of a general-purpose computing device. There may be one processor or several processors working cooperatively. The invention does not exclude distributed processing, i.e. the processors may be distributed among different physical devices. The electronic device of the present invention is not limited to a single entity and may be the sum of a plurality of physical devices.

The memory stores a computer executable program, typically machine readable code. The computer readable program may be executable by the processor to enable an electronic device to perform the method, or at least some of the steps of the method, of the present invention.

The memory includes volatile memory, such as random access memory (RAM) and/or cache memory, and may also include non-volatile memory, such as read-only memory (ROM).

It should be understood that elements or components not shown in the above examples may also be included in the electronic device of the present invention. For example, some electronic devices further include a display unit such as a display screen, and some electronic devices further include a man-machine interaction element such as a button, a keyboard, and the like. The electronic device may be considered as covered by the invention as long as the electronic device is capable of executing a computer readable program in a memory for carrying out the method or at least part of the steps of the method.

From the above description of embodiments, those skilled in the art will readily appreciate that the present invention may be implemented by hardware capable of executing a specific computer program, such as the system of the present invention and the electronic processing units, servers, clients, handsets, control units, processors, etc. included in the system. The invention may also be implemented by computer software executing the method of the invention, e.g. by control software executed by a microprocessor, an electronic control unit, a client, a server, etc. It should be noted, however, that the computer software for performing the method of the present invention is not limited to being executed by one particular hardware entity, but may also be executed in a distributed manner by unspecified hardware. The software product may be stored on a computer readable storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.), or may be stored in a distributed manner over a network, as long as it enables the electronic device to perform the method according to the invention.

The above embodiments are intended only to illustrate the present invention and not to limit the technical solutions it describes. Although the present invention has been described in detail in this specification with reference to the above embodiments, it is not limited to those specific embodiments, and modifications or equivalent substitutions may be made to it; all technical solutions and modifications thereof that do not depart from the spirit and scope of the invention are intended to be included in the scope of the appended claims.

Claims

1. A data processing method applied to a database, comprising:

2. The data processing method applied to a database according to claim 1, wherein a single storage engine of the database has a swap lock swapLock inside, and the storage engine performs the swap lock swapLock locking operation before performing the switchover operation of the working database.

3. The method for processing data applied to a database according to claim 2, wherein a task lock is provided in a single storage engine of the database, and the execution flow of the data compression transaction includes:

the data storage engine performs task lock locking operation and switches the working database into a temporary database;

and switching the working database to be the current database, and unlocking the task lock taskLock.

4. A data processing method applied to a database according to claim 3, wherein said switching the working database to the temporary database comprises:

5. The method for processing data applied to a database according to claim 2, wherein a single data storage engine of the database has an attempt to acquire a lock tryLock therein, and the execution flow of the snapshot transaction includes:

the data storage engine calls tryLock to attempt to lock the task lock taskLock;

and executing the snapshot operation on the current database, and unlocking the task lock taskLock after the snapshot is completed.

6. The method of claim 5, wherein calling tryLock to attempt to lock the swap lock swapLock comprises:

7. The data processing method applied to a database according to claim 5, wherein when a single database contains a plurality of data storage engine instances, the process of generating a snapshot of the database based on all data storage engine instances comprises:

the snapshot preparation flow comprises the following steps: and the data storage engine instance calls an attempt to acquire a lock tryLock to lock a task lock and a swap lock swapLock, if the task lock and the swap lock swapLock are successfully locked, the snapshot preparation flow is successful, and otherwise, the snapshot preparation flow fails.

8. The method for processing data applied to a database according to claim 7, wherein when the execution of the snapshot preparation process by the single data storage engine instance fails, a preparation failure return instruction is reported;

the data processing method comprises the following steps:

when a data storage engine instance fails to execute the snapshot preparation flow, it reports a preparation failure return instruction; after receiving the preparation failure return instruction, the preceding data storage engine instance unlocks its task lock taskLock and swap lock swapLock, changes the state of its snapshot preparation flow to failed, and reports the preparation failure return instruction to the data storage engine instance preceding it; this is executed in turn until every data storage engine instance whose snapshot preparation flow had succeeded has unlocked its task lock taskLock and swap lock swapLock.

9. The data processing method applied to a database according to claim 2, comprising:

10. The method according to claim 1, wherein after the data compression transaction and/or the snapshot transaction are executed, the request data written by the data update operation in the temporary database is written back to the current working database.

11. A data processing apparatus for use with a database, comprising:

12. An electronic device comprising a processor and a memory for storing a computer executable program which, when executed by the processor, performs a data processing method as claimed in any one of claims 1 to 10 applied to a database.

13. A computer-readable storage medium, in which a computer-executable program is stored, which, when executed, implements the data processing method applied to a database according to any one of claims 1 to 10.