WO2021184996A1 - Data storage method and apparatus for database - Google Patents


Info

Publication number
WO2021184996A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
data
database
log
sorting
Prior art date
Application number
PCT/CN2021/075501
Other languages
French (fr)
Chinese (zh)
Inventor
熊刚
许友松
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021184996A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Definitions

  • This application relates to the field of communication technology, and in particular to a data storage method and device for a database.
  • the files involved in a database, such as the log that records data changes in the database and the sorted files generated during sorting, are usually placed on the same disk, for example all on the local disk or all on the cloud disk.
  • RDMA remote direct memory access
  • the present application provides a data storage method and device for a database, so as to provide an efficient and low-maintenance database data storage method.
  • the embodiments of the present application provide a data storage method for a database, which is executed by a data processing device.
  • the data processing device can record a first log when a data change in the database is detected.
  • the first log is preferentially stored in the first memory.
  • when the data processing device determines that the storage space in the first memory is insufficient, for example less than the first threshold, it can switch to the second memory to continue recording the first log. The first memory and the second memory are deployed in different devices.
  • the log of the database is no longer limited to the local storage.
  • when the storage space in the first storage is insufficient, the device can switch to another storage to continue storing the log, which makes the data storage method of the database more efficient. Because the multiple storages can be deployed in different devices, large local disks are no longer needed, which can effectively reduce the disk maintenance cost of the database.
  • the sorting file generated by the database during the sorting process may also be stored in a similar manner.
  • when the data processing device receives a database sorting request, it can sort the data in the database according to the received request; the data processing device can also generate a sorting file in the process of sorting the data.
  • the sorting file may be stored in the first memory, and when the data processing apparatus determines that the storage space of the first memory is less than the second threshold, it may switch to the second memory and continue to save the sorting file there.
  • the above method is only an example of receiving a database sorting request.
  • the data processing device will also sort the data in the database to generate a sorting file.
  • the sorting file can also be stored in the above-mentioned manner.
  • the sorting files generated during the data sorting process in the database can be stored in multiple memories, which improves the data storage efficiency of the database, and the multiple memories are distributed and deployed, reducing the maintenance cost of the local disk.
  • extended memory and memory can also be set.
  • the extended memory and memory can be used to store data with a higher read and write frequency in the database.
  • the extended memory and memory can be deployed in the first memory. They may also be deployed in another memory independent of the first memory.
  • the data processing device may, based on the read and write frequency of the data in the database, store data whose read and write frequency is greater than the third threshold in the extended memory. When the data processing device needs to read data held in the extended memory, the data can be read from the extended memory into the internal memory, and the data processing device can then read the data from the internal memory.
  • the setting of the extended memory can effectively expand the storage space of the memory, so that more data can be stored in the extended memory, and ensure that these data can be efficiently read and written.
  • the data processing device can also eliminate the data with low read and write frequency in the extended memory in time.
  • the data processing device first detects the read and write frequency of the data in the extended memory; if the extended memory stores data whose read and write frequency is lower than the third threshold, that data can be eliminated from the extended memory.
  • the data processing device can eliminate the cold data in the extended memory in time, and effectively utilize the storage space in the extended memory.
  • the data processing device may also back up the first log stored in the first storage and the second storage. After the first log is backed up, the first log may be cleared; afterwards, if there is a data change in the database, the data processing device may continue to record the second log of the database in the first memory according to the data change.
  • when there is free storage space in the first storage (for example, after the first log is cleared), the log can again be stored in the first storage, which effectively utilizes the storage space of the first storage.
  • the data processing device may delete the sorting file after sorting the data in the database according to the sorting file.
  • the data processing device deletes the sorted files in time, which can ensure that there is free storage space in the first memory and the second memory so as to store other valid data.
  • the data processing apparatus may also record the correspondence between the data page and the data page identifier in the extended memory, for example by saving the correspondence in the non-volatile memory. After the device where the first memory is located is restarted, the data processing device may reorganize the extended memory in the first memory according to the correspondence.
  • the data processing device can quickly reorganize the extended memory after the device is restarted.
  • the embodiments of the present application also provide a data processing device, and the beneficial effects can be referred to the description of the first aspect and will not be repeated here.
  • the device has the function of realizing the behavior in the method example of the first aspect described above.
  • the function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the structure of the device includes a processing unit, a determining unit, and a switching unit. These units can perform the corresponding functions in the above-mentioned method example of the first aspect. For details, please refer to the detailed description in the method example; they are not repeated here.
  • an embodiment of the present application also provides a computing device.
  • the computing device includes a processor and a memory, and may also include a communication interface.
  • the processor executes the program instructions in the memory to execute the method provided in the above-mentioned first aspect or any possible implementation of the first aspect.
  • the memory is coupled with the processor and stores program instructions and data necessary to perform data synchronization.
  • the communication interface is used to communicate with other devices (such as client devices).
  • the present application provides a computing device system, which includes at least one computing device.
  • Each computing device includes a memory and a processor.
  • the processor of at least one computing device may be used to access the code in the memory to execute the first aspect or the method provided in any possible implementation manner of the first aspect.
  • the present application provides a non-transitory readable storage medium.
  • when the program stored in the non-transitory readable storage medium is executed by a computing device, the computing device executes the foregoing first aspect or any possible implementation of the first aspect.
  • the storage medium stores the program.
  • the storage medium includes, but is not limited to, volatile memory, such as random access memory, and non-volatile memory, such as flash memory, hard disk drive (HDD), and solid state drive (SSD).
  • the present application provides a computing device program product.
  • the computing device program product includes computer instructions. When the computer instructions are executed by a computing device, the computing device can execute the method provided in the foregoing first aspect or any possible implementation of the first aspect.
  • the computer program product may be a software installation package. When the method provided in the foregoing first aspect or any possible implementation of the first aspect needs to be used, the computer program product may be downloaded and executed on a computing device.
  • Figure 1 is a schematic diagram of the architecture of a system provided by this application.
  • FIG. 2 is a schematic diagram of another system structure provided by this application.
  • FIG. 3 is a schematic diagram of another system structure provided by this application.
  • FIG. 4 is a schematic diagram of a data storage method for a database provided by this application.
  • FIG. 5 is a schematic structural diagram of a data processing device provided by this application.
  • FIG. 6 is a schematic diagram of a computing device provided by an embodiment of this application.
  • FIG. 7 is a schematic diagram of a computing device in a computing device system provided by an embodiment of the application.
  • FIG. 1 is a schematic diagram of a system structure provided by an embodiment of this application.
  • the system includes a data processing apparatus 100, a first memory 200 and a second memory 300.
  • the first memory 200 may include a volatile memory, such as a random access memory (RAM).
  • the first memory may also include a non-volatile memory (NVM), such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
  • the first memory may also include a combination of the above types.
  • the first memory 200 is similar to the second memory 300.
  • the first storage 200 and the second storage 300 are deployed in different data centers and located in different devices.
  • for example, the first storage 200 is a local disk and the second storage 300 is a cloud disk.
  • the first memory 200 and the second memory 300 are used to store data in the database and related files of the database, such as logs (such as physical logs and logical logs), sort files, and so on.
  • the first storage 200 may also include an extended memory and a memory, and the extended memory and the memory are used to store data with a relatively high read and write frequency in the database.
  • the data processing device 100 may be a hardware device, such as a server, a terminal computing device, etc., or a software device, specifically a set of software systems running on a hardware computing device.
  • the embodiment of the present application does not limit the location where the data processing device 100 is deployed. Exemplarily, as shown in FIG.
  • the data processing apparatus 100 may run in a cloud computing device system (including at least one cloud computing device, such as a server; in the embodiment of the present application, the cloud computing device system is a data center), in an edge computing device system (including at least one edge computing device, such as a server or a desktop computer; in the embodiment of this application, the edge computing device system is a data center), or on various terminal computing devices, for example notebook computers and personal desktop computers.
  • the data processing device 100 can also be a device composed of multiple parts logically.
  • the data processing device 100 can include a processing unit, a determining unit, and a switching unit.
  • each component of the data processing device 100 can be deployed in a different system or server.
  • the parts of the device can run separately in the three environments of the cloud computing device system, the edge computing device system, and the terminal computing device, or in any two of these three environments.
  • the cloud computing equipment system, the edge computing equipment system and the terminal computing equipment are connected by a communication path, which can communicate and transmit data with each other.
  • the data storage method for the database provided by the embodiment of the present application is executed by the combined parts of the data processing apparatus 100 running in three environments (or any two of the three environments).
  • the first memory 200 can be deployed in the same system or in the same hardware device as the data processing device 100, so that the data processing device 100 can read data from the first memory 200 more efficiently.
  • this application does not limit the deployment location of the second memory 300; it is only necessary to ensure that the first storage 200 and the second storage 300 are deployed in different data centers.
  • the data processing device 100 can generate a log according to the change status of the data in the database.
  • when the free storage space in the first storage 200 is greater than the first threshold, the generated log is preferentially stored in the first storage 200. As the log is generated and stored, the free storage space in the first storage 200 gradually decreases.
  • when the free storage space in the first storage 200 falls below the first threshold, the data processing device 100 switches to the second storage 300 to continue storing the log. It can be seen from the above that the logs can be stored in at least two different memories, which can effectively improve the utilization of the different memories and ensure that the logs of the database can be completely saved.
  • the following is a data storage method for a database provided by an embodiment of the present application with reference to FIG. 4, and the method includes:
  • Step 401: The data processing device 100 detects the data change of the database.
  • Data changes in the database include but are not limited to: insert, delete, and update of data. Among them, data insertion refers to adding new data to the database, data deletion refers to deleting data in the database, and data updating refers to changing one data in the database to another data.
  • Step 402: The data processing device 100 generates a first log according to the data change of the database.
  • the log is used to record the data changes that occur in the database, such as data insertion, deletion, and update.
  • the data processing device 100 generates a corresponding log after detecting the data change. In this embodiment of the application, to distinguish it from logs subsequently generated by the data processing device 100, the generated log is called the first log.
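  • Steps 401 and 402 can be illustrated with a minimal sketch in which every insert, update, or delete on a toy table appends one record to a log. The names `LogRecord` and `LoggedTable` and the record fields are hypothetical and are not taken from the application, which does not specify a log format.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """One entry of the first log; the field names are illustrative assumptions."""
    op: str                 # "insert", "delete", or "update"
    key: str
    old: Optional[Any]      # value before the change (None for insert)
    new: Optional[Any]      # value after the change (None for delete)


class LoggedTable:
    """A toy key-value table that emits a log record for every data change."""

    def __init__(self) -> None:
        self.rows: Dict[str, Any] = {}
        self.log: List[LogRecord] = []

    def insert(self, key: str, value: Any) -> None:
        if key in self.rows:
            raise KeyError(f"{key!r} already exists")
        self.rows[key] = value
        self.log.append(LogRecord("insert", key, None, value))

    def update(self, key: str, value: Any) -> None:
        old = self.rows[key]          # raises KeyError if the row is missing
        self.rows[key] = value
        self.log.append(LogRecord("update", key, old, value))

    def delete(self, key: str) -> None:
        old = self.rows.pop(key)      # raises KeyError if the row is missing
        self.log.append(LogRecord("delete", key, old, None))


table = LoggedTable()
table.insert("a", 1)
table.update("a", 2)
table.delete("a")
ops = [record.op for record in table.log]
```

  • Keeping both the old and the new value in each record is one possible choice; the application only requires that the log record the data change.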
  • Step 403: When the free storage space in the first memory 200 is not less than the first threshold, the data processing apparatus 100 stores the first log in the first memory 200.
  • Step 404: When the data processing device 100 determines that the free storage space in the first storage 200 is less than the first threshold, it switches to the second storage 300 to continue storing the first log.
  • the storage order of the logs is the first storage 200 -> the second storage 300.
  • the data processing device 100 preferentially saves the first log in the first memory 200, and when the free storage space in the first memory 200 is insufficient (for example, less than the first threshold), it switches to the second memory 300 to continue storing the first log.
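  • The switching rule of steps 403 and 404 can be sketched as follows. This is only an illustration under stated assumptions: `Tier` and `TieredLog` are hypothetical names, in-memory lists stand in for the two memories, capacity is counted in bytes, and the first threshold is assumed to be at least as large as the largest record so that a record admitted to the first memory always fits.

```python
from typing import List


class Tier:
    """A stand-in for one memory; capacity is counted in bytes."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self.records: List[bytes] = []

    @property
    def free(self) -> int:
        return self.capacity - sum(len(r) for r in self.records)


class TieredLog:
    """Writes to `first` while its free space is not less than the threshold
    (step 403), then switches to `second` and stays there (step 404)."""

    def __init__(self, first: Tier, second: Tier, first_threshold: int) -> None:
        self.first = first
        self.second = second
        self.first_threshold = first_threshold
        self.active = first

    def append(self, record: bytes) -> str:
        # Switch once the free space of the first memory drops below the threshold.
        if self.active is self.first and self.first.free < self.first_threshold:
            self.active = self.second
        self.active.records.append(record)
        return self.active.name


first = Tier("first", capacity=10)
second = Tier("second", capacity=100)
log = TieredLog(first, second, first_threshold=4)
placements = [log.append(b"abcd") for _ in range(4)]
```

  • With a 10-byte first memory and a threshold of 4, the first two 4-byte records land in the first memory and the remaining two in the second, so the complete log is the concatenation of the two memories in that order.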
  • a fixed file is configured for storing logs.
  • the file can be called a log file.
  • the size of the log file is fixed, usually a preset value.
  • when the size of the first log generated by the data processing device 100 reaches the preset value, the database cannot continue to make data changes. At this time, the first log needs to be backed up. After the backup is completed, the first log can be cleared, which means that the first log stored in the first storage 200 and the second storage 300 is deleted.
  • the data processing device 100 can continue to generate a log.
  • the log generated by the data processing device 100 is referred to herein as the second log, which is similar to the first log.
  • the data processing device 100 preferentially saves the second log in the first memory 200, and when the free storage space of the first memory 200 is insufficient, it switches to the second memory 300 to continue storing the second log.
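  • The fixed-size log file and the back-up-then-clear cycle described above can be sketched as follows. `FixedSizeLog` is a hypothetical name, the preset size is counted in records rather than bytes for simplicity, and a Python list stands in for the backup target.

```python
from typing import List


class FixedSizeLog:
    """A log file with a preset maximum size, counted in records for simplicity."""

    def __init__(self, max_records: int) -> None:
        self.max_records = max_records
        self.records: List[str] = []
        self.backups: List[List[str]] = []   # stand-in for a backup target

    def is_full(self) -> bool:
        return len(self.records) >= self.max_records

    def append(self, record: str) -> None:
        if self.is_full():
            # Data changes cannot continue until the log is backed up.
            raise RuntimeError("log is full; back it up before recording")
        self.records.append(record)

    def backup_and_clear(self) -> None:
        self.backups.append(list(self.records))  # back up first ...
        self.records.clear()                     # ... then clear the log


log = FixedSizeLog(max_records=2)
log.append("insert a")
log.append("update a")
blocked = False
try:
    log.append("delete a")
except RuntimeError:
    blocked = True
log.backup_and_clear()
log.append("delete a")        # first record of the second log
```

  • After `backup_and_clear`, the next record starts what the text calls the second log.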
  • the storage method of the log in the first storage 200 and the second storage 300 is introduced.
  • the data processing device 100 may also use a similar storage method for other data in the database (such as sorted files).
  • the data processing device 100 may sort the data in the database according to the received request to feed back the sorted data.
  • a sorting file is generated, and the sorting file records the sequence of the sorted data.
  • the sort file can also record the storage location of the data and the index number of the data.
  • the sorted file can be stored in the first storage 200 first. As the sorted file is stored in the first storage 200, the free storage space in the first storage 200 becomes less and less. When the free storage space in the first storage 200 is insufficient (for example, less than the second threshold), it is switched to the second storage 300 to continue storing the sorted files.
  • after the data in the database has been sorted according to the sorting file, the sorting file can be deleted.
  • the storage order of the sorted files is the first memory 200 -> the second memory 300; that is, sorted files are stored first in the first memory 200 and then in the second memory 300.
  • when the data processing device 100 receives a data query request, in order to answer the query more quickly, it can also sort the data in the database to generate a sorting file.
  • the storage method of the sorting file is similar to the foregoing method, and will not be repeated here.
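  • As a rough illustration of the sorting-file handling, the sketch below builds a sort file whose entries keep the index number of each row, places the entries in the first memory until its free space falls below the second threshold, and then continues in the second memory. The function names, the entry layout, and the use of entry counts as the unit of capacity are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, str]   # (index number of the row, sort key)


def build_sort_file(rows: Dict[int, str]) -> List[Entry]:
    """Return entries ordered by sort key; each entry keeps the row's index number."""
    return sorted(rows.items(), key=lambda entry: entry[1])


def place_entries(
    entries: List[Entry], first_capacity: int, second_threshold: int
) -> Tuple[List[Entry], List[Entry]]:
    """Fill the first memory while its free slots are not less than the
    threshold, then continue in the second memory."""
    first: List[Entry] = []
    second: List[Entry] = []
    for entry in entries:
        free = first_capacity - len(first)
        if free >= second_threshold:
            first.append(entry)
        else:
            second.append(entry)
    return first, second


rows = {1: "pear", 2: "apple", 3: "fig", 4: "kiwi"}
entries = build_sort_file(rows)
first, second = place_entries(entries, first_capacity=3, second_threshold=2)
sorted_ids = [idx for idx, _ in first + second]

# Once the sorted result has been produced, the sort file can be deleted.
first.clear()
second.clear()
```

  • Reading the first memory and then the second memory reproduces the sorted order, which is why the storage order "first memory 200 -> second memory 300" matters.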
  • the data processing device 100 can store logs and sorted files in the first memory 200 and the second memory 300, and this data storage method is more flexible for data storage and effectively expands the data storage space.
  • the first storage 200 may be a local disk of the database, and the second storage 300 may be deployed in a cloud computing device system, that is, the second storage 300 is a cloud disk. Related files or data of the database can thus be extended to cloud storage, and the files involved in the database can be distributed across local disks and cloud disks.
  • This storage method combines the scalability of cloud disks and can also effectively save maintenance costs.
  • the embodiment of the present application only takes two memories included in the system as an example. In some scenarios, a larger number of memories may be included, in which case the location of data storage is more flexible and the utilization rate of the memories can also be effectively improved.
  • the embodiment of the present application can also expand the memory (buffer pool) in the device where the database is deployed, for example, an extended memory (extend buffer pool) is set in the first memory.
  • the read and write frequency of data in the database usually differs from one piece of data to another.
  • the memory can hold the data with a higher read and write frequency in the database.
  • because the data in the memory is read and written relatively frequently, the data processing device 100 can obtain data from the memory first, which can effectively improve the read and write efficiency of the data and further improve the processing efficiency of the data processing device 100.
  • the memory space is usually limited, and it is not possible to store all data with a higher read and write frequency in the memory.
  • the data stored in the memory may also be updated, which eliminates some data with a higher read and write frequency, so that the data processing device 100 needs to load these data from the second memory 300. For this purpose, an extended memory can be added.
  • the extended memory can store data whose read/write frequency is greater than the third threshold.
  • to keep the extended memory in that state, the data processing device 100 can periodically update the data in the extended memory: it stores data with a read and write frequency greater than the third threshold in the extended memory, and can also eliminate data stored in the extended memory whose read and write frequency is lower than the third threshold.
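  • The periodic update of the extended memory can be sketched as below. `ExtendedPool` and its methods are hypothetical names; the sketch assumes that the read and write frequency is measured as an access count per update period, which the application does not specify.

```python
from collections import Counter
from typing import Dict


class ExtendedPool:
    """Holds pages whose access count in the last period exceeded a threshold."""

    def __init__(self, third_threshold: int) -> None:
        self.third_threshold = third_threshold
        self.pages: Dict[int, bytes] = {}
        self.access_counts: Counter = Counter()

    def record_access(self, page_id: int) -> None:
        self.access_counts[page_id] += 1

    def refresh(self, database: Dict[int, bytes]) -> None:
        """Periodic update: eliminate cold pages, admit hot ones, reset counters."""
        for page_id in list(self.pages):
            if self.access_counts[page_id] < self.third_threshold:
                del self.pages[page_id]
        for page_id, count in self.access_counts.items():
            if count > self.third_threshold:
                self.pages.setdefault(page_id, database[page_id])
        self.access_counts.clear()


database = {1: b"p1", 2: b"p2", 3: b"p3"}
pool = ExtendedPool(third_threshold=2)

for _ in range(3):
    pool.record_access(1)
pool.record_access(2)
pool.refresh(database)
after_first = sorted(pool.pages)

for _ in range(3):
    pool.record_access(2)
pool.refresh(database)
after_second = sorted(pool.pages)
```

  • A page whose count equals the threshold is neither admitted nor eliminated in this sketch, which mirrors the "greater than" and "lower than" wording of the text.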
  • when the data processing device 100 migrates data into the extended memory, it can first locate the data to be migrated.
  • the data processing device 100 may use the data in the internal memory as the data to be migrated in.
  • the data in the database is usually organized in fixed-size data units, such as data pages.
  • the size of data stored in each data page (page) is the same, and each data page is configured with an identity (ID).
  • the data processing device 100 can realize rapid positioning of the data page through the identification of each data page in the memory.
  • a hash map may be used to organize the data pages and their identifiers, and the data processing apparatus 100 may locate a data page in the hash table according to the identifier of the data page.
  • when the data processing device 100 eliminates data from the extended memory, it can first locate the data to be moved out. For example, the data processing apparatus 100 can use a page replacement algorithm, such as least recently used (LRU), to determine which data pages in the extended memory are used less; the data in those data pages is the data to be migrated out.
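  • A hash map keyed by page identifier combined with LRU replacement can be sketched with Python's `collections.OrderedDict`. `LruPagePool` is a hypothetical name and the capacity is counted in pages; this is one common way to realize LRU, not necessarily the one used in the application.

```python
from collections import OrderedDict
from typing import Optional


class LruPagePool:
    """Pages are looked up by ID in a hash map that also tracks recency."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pages: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, page_id: int) -> Optional[bytes]:
        page = self.pages.get(page_id)
        if page is not None:
            self.pages.move_to_end(page_id)   # mark as most recently used
        return page

    def put(self, page_id: int, page: bytes) -> Optional[int]:
        """Insert a page; return the ID of the page moved out, if any."""
        self.pages[page_id] = page
        self.pages.move_to_end(page_id)
        if len(self.pages) > self.capacity:
            evicted_id, _ = self.pages.popitem(last=False)  # least recently used
            return evicted_id
        return None


pool = LruPagePool(capacity=2)
pool.put(1, b"a")
pool.put(2, b"b")
pool.get(1)                  # page 1 becomes the most recently used
evicted = pool.put(3, b"c")  # page 2 is now the least recently used
```

  • Lookup, recency update, and selection of the page to move out are all constant-time operations on the ordered hash map.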
  • when the data processing device 100 needs to read data in the database, it can first obtain the data from the memory or the extended memory.
  • the extended memory further expands the amount of data that the memory can store, ensuring that more data with a higher read and write frequency can be stored in the first memory, which can improve the efficiency of data reading and writing.
  • the data processing apparatus 100 may also record the corresponding relationship between the data page in the extended memory and the identifier of the data page, and store the corresponding relationship in the non-volatile memory.
  • when the device where the first memory is located restarts, the previously saved correspondence can be used directly to reorganize the extended memory, so that the data pages stored in the extended memory can be located quickly.
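  • Persisting the correspondence and reloading it after a restart can be sketched as follows. The sketch assumes that the correspondence maps each page identifier to the offset of the page inside the extended memory and that it is saved as a JSON file; the file name, the format, and the function names are all hypothetical.

```python
import json
import os
import tempfile
from typing import Dict


def save_mapping(path: str, mapping: Dict[int, int]) -> None:
    """Persist page ID -> offset; JSON object keys must be strings."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({str(k): v for k, v in mapping.items()}, f)
        f.flush()
        os.fsync(f.fileno())
    # Replace in one step so a reader never sees a partially written file.
    os.replace(tmp, path)


def load_mapping(path: str) -> Dict[int, int]:
    """Rebuild the in-memory hash map from the saved correspondence."""
    with open(path, encoding="utf-8") as f:
        return {int(k): v for k, v in json.load(f).items()}


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "extended_pool.map")
    save_mapping(path, {7: 0, 42: 4096})
    restored = load_mapping(path)   # what a restarted process would do
```

  • With the map restored, pages that survived the restart in non-volatile extended memory can be located by identifier without rescanning the memory.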
  • an embodiment of the present application also provides a data processing device, which is configured to execute the method performed by the data processing device 100 in the method embodiment shown in FIG. 4.
  • the data processing device 500 includes a processing unit 501, a determining unit 502, and a switching unit 503, and the aforementioned modules may be software modules.
  • a connection is established between units through a communication path.
  • the processing unit 501 is configured to record the first log of the database in the first memory according to the data change of the database.
  • the processing unit 501 may execute steps 401 to 403 as shown in FIG. 4.
  • the determining unit 502 is configured to determine that the storage space of the first memory is less than a first threshold.
  • the determining unit 502 may execute the method of determining that the storage space of the first memory is less than the first threshold in step 404 shown in FIG. 4.
  • the switching unit 503 is configured to switch the processing unit 501 to recording the first log of the database in the second memory after the determining unit 502 determines that the storage space of the first memory is less than the first threshold, where the first memory and the second memory are deployed in different devices.
  • the switching unit 503 may execute the method of switching to the second memory to record the first log in step 404 as shown in FIG. 4.
  • the device further includes a sorting unit 504.
  • the sorting unit 504 can sort the data in the database according to the received database sorting request; the processing unit 501 saves the sorting file generated during the sorting process in the first memory;
  • when the storage space of the first memory is less than the second threshold, the switching unit 503 may cause the processing unit 501 to switch to the second memory to save the sorting file.
  • the first memory includes extended memory and memory.
  • the processing unit 501 can, based on the read and write frequency of the data in the database, store data whose read and write frequency is greater than the third threshold in the extended memory, and read the data in the extended memory into the memory.
  • the processing unit 501 may also eliminate data whose read/write frequency is lower than the third threshold in the extended memory.
  • the processing unit 501 may also clear the first log after the first log is backed up, and then, according to the data change of the database, record the second log of the database in the first memory.
  • the sorting file may be deleted after the sorting unit 504 finishes sorting the data in the database according to the sorting file.
  • the processing unit 501 may also record the corresponding relationship between the data page and the data page identifier in the extended memory, and reorganize the extended memory according to the corresponding relationship after the device where the first memory is located is restarted.
  • the division of modules in the embodiments of this application is illustrative, and it is only a logical function division. In actual implementation, there may be other division methods.
  • the functional modules in the various embodiments of this application can be integrated into one processor, can each exist alone physically, or two or more modules can be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.
  • if the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of the present application essentially, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to enable a terminal device (which may be a personal computer, a mobile phone, or a network device, etc.) or a processor to execute all or part of the steps of the method in each embodiment of the present application.
  • the aforementioned storage media include: USB flash drive, mobile hard disk, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.
  • the computing device 600 is shown in FIG. 6.
  • the computing device 600 includes a bus 601, a processor 602, a communication interface 603, and a memory 604.
  • the processor 602, the memory 604, and the communication interface 603 communicate through a bus 601.
  • the processor 602 may be a central processing unit (CPU).
  • the memory 604 may include a volatile memory, such as a random access memory (RAM).
  • the memory 604 may also include a non-volatile memory, such as read-only memory (ROM), flash memory, HDD or SSD.
  • Executable code is stored in the memory, and the processor 602 executes the executable code to execute the aforementioned data storage method (the method shown in FIG. 4).
  • the memory 604 may also include an operating system and other software modules required for running processes.
  • the operating system can be LINUX™, UNIX™, WINDOWS™, etc.
  • the memory 604 stores the modules in the aforementioned data processing apparatus 500.
  • the memory 604 may also include other software modules required for running processes such as an operating system.
  • The present application also provides a computing device system.
  • The computing device system includes at least one computing device 700 as shown in FIG. 7.
  • The computing device 700 includes a bus 701, a processor 702, a communication interface 703, and a memory 704.
  • The processor 702, the memory 704, and the communication interface 703 communicate through the bus 701.
  • The computing devices 700 in the computing device system communicate with each other through a communication path.
  • The processor 702 may be a CPU.
  • The memory 704 may include volatile memory, such as random access memory.
  • The memory 704 may also include non-volatile memory, such as read-only memory, flash memory, an HDD, or an SSD.
  • Executable code is stored in the memory 704, and the processor 702 executes the executable code to perform any part or all of the aforementioned data storage method.
  • The memory 704 may also include an operating system and other software modules required for running processes.
  • The operating system may be Linux™, UNIX™, Windows™, or the like.
  • Any one or more modules of the aforementioned data processing apparatus 500 are stored in the memory 704.
  • The computing devices 700 in the computing device system establish communication with each other through a communication network, and each computing device runs any one or more units of the data processing apparatus 500. The computing devices 700 jointly perform the aforementioned data storage operation.
  • The computer program product for data storage includes one or more computer program instructions.
  • When these computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave).
  • The computer-readable storage medium stores the computer program instructions for data storage.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, an SSD).

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a data storage method and apparatus for a database, which provide the database with an efficient data storage approach that has low maintenance costs. The method comprises: when a data change in the database is detected (401), a data processing apparatus records a first log (402); the first log is preferentially stored in a first memory (403); and when the data processing apparatus determines that the storage space in the first memory is insufficient, for example less than a first threshold, it switches to a second memory to continue recording the first log (404), wherein the first memory and the second memory are deployed in different devices. The database log is thus no longer confined to a local memory: when the storage space in the first memory is insufficient, the log can continue to be stored in another memory, which makes data storage for the database more efficient. Because multiple memories can be deployed in different devices, no large local disk is required, which effectively reduces the disk maintenance costs of the database.

Description

Data storage method and apparatus for a database
Technical field
This application relates to the field of communication technology, and in particular to a data storage method and apparatus for a database.
Background
Currently, the files involved in a database, such as the logs that record data changes in the database and the sort files produced by external sorting, are usually placed on the same disk: either all on a local disk or all on a cloud disk.
When the files involved in the database service are placed on a local disk, a large amount of local disk space is required, which increases maintenance costs.
When the files involved in the database service are placed on a cloud disk, reading data from the cloud disk relies on remote direct memory access (RDMA) technology, and data read/write efficiency is poor.
In summary, an efficient data storage approach with low maintenance costs is urgently needed for databases.
Summary of the invention
This application provides a data storage method and apparatus for a database, so as to provide the database with an efficient data storage approach that has low maintenance costs.
In a first aspect, an embodiment of this application provides a data storage method for a database. The method is performed by a data processing apparatus. In the method, the data processing apparatus records a first log when it detects a data change in the database, and the first log is preferentially stored in a first memory. When the data processing apparatus determines that the storage space in the first memory is insufficient, for example less than a first threshold, it can switch to a second memory to continue recording the first log. The first memory and the second memory are deployed in different devices.
With the above method, the database log is no longer confined to local storage. When the storage space in the first memory is insufficient, the log can continue to be stored in another memory, which makes data storage for the database more efficient. Multiple memories can be deployed in different devices, so a large local disk is no longer needed, which effectively reduces the disk maintenance costs of the database.
In a possible implementation, in addition to the database log, the sort files that the database generates during sorting can be stored in a similar way. For example, when the data processing apparatus receives a database sorting request, it can sort the data in the database according to the request. While sorting the data, the data processing apparatus can also generate a sort file, which can be saved in the first memory. When the data processing apparatus determines that the storage space of the first memory is less than a second threshold, it can switch to the second memory to continue saving the sort file. The above description uses the receipt of a database sorting request only as an example. In other scenarios, for example when data in the database needs to be queried, the data processing apparatus also sorts the data in the database and generates a sort file, and that sort file can be stored in the same way.
With the above method, the sort files generated while sorting data in the database can be stored in multiple memories, which improves the data storage efficiency of the database. Because the memories are deployed in a distributed manner, the maintenance costs of the local disk are reduced.
In a possible implementation, an extended memory and a main memory can also be set up for the database. The extended memory and the main memory can store data in the database that is read and written frequently. They can be deployed in the first memory, or they can be other memories independent of the first memory. Based on the read/write frequency of the data in the database, the data processing apparatus can save data whose read/write frequency is greater than a third threshold in the extended memory. When the data processing apparatus needs to read data held in the extended memory, the data can be read from the extended memory into the main memory, and the data processing apparatus then reads it from the main memory.
With the above method, the extended memory effectively enlarges the storage space of the main memory, so that more data can be kept in the extended memory and read and written efficiently.
In a possible implementation, the data processing apparatus can also promptly evict infrequently accessed data from the extended memory. The data processing apparatus first checks the read/write frequency of the data in the extended memory; if the extended memory holds data whose read/write frequency is lower than the third threshold, that data can be evicted from the extended memory.
With the above method, the data processing apparatus can promptly evict cold data from the extended memory and make effective use of its storage space.
In a possible implementation, the data processing apparatus can also back up the first log stored in the first memory and the second memory, and clear the first log once the backup is complete. If data in the database changes afterwards, the data processing apparatus can continue by recording a second log of the database in the first memory according to those data changes.
With the above method, once free storage space is available in the first memory (for example, after the first log is cleared), logs can again be stored in the first memory, so that its storage space is used effectively.
In a possible implementation, after the data processing apparatus finishes sorting the data in the database according to the sort file, it can delete the sort file.
With the above method, the data processing apparatus deletes sort files promptly, which ensures that the first memory and the second memory have free storage space for other valid data.
In a possible implementation, the data processing apparatus can also record the correspondence between the data pages in the extended memory and their data page identifiers, for example by saving the correspondence in non-volatile memory. After the device in which the first memory is located restarts, the data processing apparatus can rebuild the extended memory in the first memory according to the correspondence.
With the above method, the data processing apparatus can quickly reorganize the extended memory after the device restarts.
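The restart behavior described above can be pictured with a minimal Python sketch. It assumes that the page contents themselves survive the restart and that only the index from page identifier to page location has to be persisted; the JSON file standing in for non-volatile memory, the function names `save_mapping` and `rebuild_extended_memory`, and the slot numbers are all hypothetical and are not taken from the application.

```python
import json
import os
import tempfile


def save_mapping(mapping, path):
    """Persist the page-identifier -> slot correspondence (the file stands in for NVM)."""
    with open(path, "w") as fh:
        json.dump(mapping, fh)


def rebuild_extended_memory(path, load_page):
    """After a restart, reload the correspondence and re-attach each page by its slot."""
    with open(path) as fh:
        mapping = json.load(fh)
    return {page_id: load_page(slot) for page_id, slot in mapping.items()}


# Page frames that are assumed to survive the restart (hypothetical contents).
slots = ["page A bytes", "page B bytes", "page C bytes"]
mapping = {"table1:page7": 0, "table1:page9": 2}

with tempfile.TemporaryDirectory() as tmp:
    map_path = os.path.join(tmp, "page_map.json")
    save_mapping(mapping, map_path)
    # The device restarts here; the in-memory index is lost, the file is not.
    rebuilt = rebuild_extended_memory(map_path, lambda slot: slots[slot])

print(rebuilt)
```

Persisting only the small identifier-to-slot table, rather than the pages, is what would make the rebuild fast in this sketch.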
In a second aspect, an embodiment of this application further provides a data processing apparatus. For its beneficial effects, see the description of the first aspect; they are not repeated here. The apparatus has the function of implementing the behavior in the method examples of the first aspect. The function can be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function. In a possible design, the structure of the apparatus includes a processing unit, a determining unit, and a switching unit, which can perform the corresponding functions in the method examples of the first aspect. For details, see the detailed description in the method examples; they are not repeated here.
In a third aspect, an embodiment of this application further provides a computing device. The computing device includes a processor and a memory, and may also include a communication interface. The processor executes the program instructions in the memory to perform the method provided in the first aspect or any possible implementation of the first aspect. The memory is coupled to the processor and stores the program instructions and data necessary for performing the method. The communication interface is used to communicate with other devices (such as client devices).
In a fourth aspect, this application provides a computing device system that includes at least one computing device. Each computing device includes a memory and a processor. The processor of the at least one computing device can be used to access the code in the memory to perform the method provided in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, this application provides a non-transitory readable storage medium in which a program is stored. When the program in the non-transitory readable storage medium is executed by a computing device, the computing device performs the method provided in the first aspect or any possible implementation of the first aspect. The storage medium includes, but is not limited to, volatile memory, such as random access memory, and non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
In a sixth aspect, this application provides a computer program product that includes computer instructions. When the instructions are executed by a computing device, the computing device can perform the method provided in the first aspect or any possible implementation of the first aspect. The computer program product may be a software installation package. When the method provided in the first aspect or any possible implementation of the first aspect needs to be used, the computer program product can be downloaded and executed on a computing device.
Description of the drawings
FIG. 1 is a schematic diagram of the architecture of a system provided by this application;
FIG. 2 is a schematic structural diagram of another system provided by this application;
FIG. 3 is a schematic structural diagram of another system provided by this application;
FIG. 4 is a schematic diagram of a data storage method for a database provided by this application;
FIG. 5 is a schematic structural diagram of a data processing apparatus provided by this application;
FIG. 6 is a schematic diagram of a computing device provided by an embodiment of this application;
FIG. 7 is a schematic diagram of a computing device in a computing device system provided by an embodiment of this application.
Detailed description
FIG. 1 is a schematic structural diagram of a system provided by an embodiment of this application. The system includes a data processing apparatus 100, a first memory 200, and a second memory 300.
The first memory 200 may include volatile memory, such as random access memory (RAM). The first memory 200 may also include non-volatile memory (NVM), such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). The first memory 200 may also include a combination of the above types. The second memory 300 is similar to the first memory 200; see the foregoing description for details, which are not repeated here. The first memory 200 and the second memory 300 are deployed in different data centers and located in different devices. For example, the first memory 200 is a local disk and the second memory 300 is a cloud disk.
The first memory 200 and the second memory 300 are used to store the data in the database as well as files related to the database, such as logs (for example, physical logs and logical logs) and sort files.
The first memory 200 may also include an extended memory and a main memory, which are used to store data in the database that is read and written frequently.
The data processing apparatus 100 may be a hardware apparatus, such as a server or a terminal computing device, or a software apparatus, specifically a software system running on a hardware computing device. The embodiments of this application do not limit where the data processing apparatus 100 is deployed. For example, as shown in FIG. 2, the data processing apparatus 100 may run in a cloud computing device system (which includes at least one cloud computing device, such as a server; in the embodiments of this application, the cloud computing device system is a data center), in an edge computing device system (which includes at least one edge computing device, such as a server or a desktop computer; in the embodiments of this application, the edge computing device system is a data center), or on various terminal computing devices, such as laptop computers and personal desktop computers.
Logically, the data processing apparatus 100 may also consist of multiple parts. For example, the data processing apparatus 100 may include a processing unit, a determining unit, and a switching unit, and its components may be deployed in different systems or servers. For example, as shown in FIG. 3, the parts of the apparatus may run separately in three environments, namely the cloud computing device system, the edge computing device system, and the terminal computing device, or in any two of these three environments. The cloud computing device system, the edge computing device system, and the terminal computing device are connected by communication paths and can communicate and transfer data with one another. The data storage method for a database provided by the embodiments of this application is performed cooperatively by the parts of the data processing apparatus 100 running in the three environments (or in any two of them).
The first memory 200 may be deployed in the same system as the data processing apparatus 100 or located in the same hardware apparatus, so that the data processing apparatus 100 can read data from the first memory 200 efficiently. This application does not limit where the second memory 300 is deployed; it is only necessary to ensure that the first memory 200 and the second memory 300 are deployed in different data centers.
In the embodiments of this application, the data processing apparatus 100 can generate a log according to the data changes in the database. While the free storage space in the first memory 200 is not less than the first threshold, the generated log is preferentially saved in the first memory 200. As the log is generated and stored, the free storage space in the first memory 200 gradually shrinks. Once the free storage space in the first memory 200 falls below the first threshold, the data processing apparatus 100 switches to the second memory 300 to continue storing the log. The log can therefore be stored across at least two different memories, which effectively improves the utilization of the different memories and ensures that the database log can be saved completely.
A data storage method for a database provided by an embodiment of this application is described below with reference to FIG. 4. The method includes:
Step 401: The data processing apparatus 100 detects a data change in the database. Data changes in the database include, but are not limited to, data insertion (insert), deletion (delete), and update (update). Data insertion means adding new data to the database, data deletion means removing data from the database, and data update means changing a piece of data in the database into another piece of data.
Step 402: The data processing apparatus 100 generates a first log according to the data change in the database.
In a database, the log records the data changes that occur in the database, such as data insertion, deletion, and update. When data in the database changes, the data processing apparatus 100 generates a corresponding log after detecting the change. To distinguish it from logs that the data processing apparatus 100 generates later, the log generated in step 402 is called the first log in the embodiments of this application.
Step 403: When the free storage space in the first memory 200 is not less than the first threshold, the data processing apparatus 100 stores the first log in the first memory 200.
Step 404: When the data processing apparatus 100 determines that the free storage space in the first memory 200 is less than the first threshold, it switches to the second memory 300 to continue storing the first log.
In the embodiments of this application, the storage order of the log is first memory 200 -> second memory 300. That is, the data processing apparatus 100 preferentially saves the first log in the first memory 200, and switches to the second memory 300 to continue storing the first log only when the free storage space in the first memory 200 is insufficient (for example, less than the first threshold).
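Steps 401 to 404 can be illustrated with a minimal Python sketch. The `Store` and `TieredLogWriter` classes, the byte capacities, and the threshold value are hypothetical and chosen only for illustration; the application does not prescribe an implementation, and a real system would measure free space on actual devices.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """A toy storage device with a fixed capacity in bytes (hypothetical model)."""
    name: str
    capacity: int
    records: list = field(default_factory=list)
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used

    def append(self, record: bytes) -> None:
        self.records.append(record)
        self.used += len(record)


class TieredLogWriter:
    """Prefers `first`; falls back to `second` once free space in `first` is too low."""

    def __init__(self, first: Store, second: Store, first_threshold: int):
        self.first = first
        self.second = second
        self.first_threshold = first_threshold

    def record(self, change: str) -> str:
        """Store one log record and return the name of the store that received it."""
        data = change.encode("utf-8")
        # Step 403: free space not less than the threshold -> first store.
        # Step 404: free space below the threshold -> second store.
        if self.first.free >= self.first_threshold:
            target = self.first
        else:
            target = self.second
        target.append(data)
        return target.name


local = Store("local", capacity=64)      # plays the role of the first memory 200
cloud = Store("cloud", capacity=1024)    # plays the role of the second memory 300
writer = TieredLogWriter(local, cloud, first_threshold=32)
placements = [writer.record(f"UPDATE row {i:02d}") for i in range(4)]
print(placements)
```

In this run each record is 13 bytes, so the first three records land in the local store and the fourth is redirected once the local free space (25 bytes) drops below the 32-byte threshold.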
In the database, a fixed file is configured for storing logs; this file can be called the log file. The size of the log file is fixed and is usually a preset value. As data in the database changes, once the size of the first log generated by the data processing apparatus 100 reaches the preset value, the database can no longer make data changes. At that point the first log needs to be backed up. After the backup is complete, the first log can be cleared, that is, the first log stored in the first memory 200 and the second memory 300 is deleted.
After the first log is cleared, if data in the database continues to change, the data processing apparatus 100 can continue generating logs. For ease of description, the log generated at this stage is called the second log. As with the first log, the data processing apparatus 100 preferentially saves the second log in the first memory 200, and switches to the second memory 300 to continue storing the second log when the free storage space in the first memory 200 is insufficient.
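The backup-and-clear cycle can be sketched as follows. This is an illustrative model only: the stores are plain Python lists, capacity is counted in records rather than bytes, and the helper names `record` and `backup_and_clear` are assumptions introduced for this example.

```python
def record(stores, capacity, threshold, entry):
    """Append to 'first' while its free slots are not below `threshold`, else to 'second'."""
    free = capacity - len(stores["first"])
    target = "first" if free >= threshold else "second"
    stores[target].append(entry)
    return target


def backup_and_clear(stores, backup):
    """Copy every record of the current log into `backup`, then empty both stores."""
    for records in stores.values():
        backup.extend(records)
    for records in stores.values():
        records.clear()


stores = {"first": [], "second": []}
backup = []

# First log: the first store holds 3 records and needs at least 1 free slot (assumed values).
first_log = [record(stores, 3, 1, f"log1-{i}") for i in range(5)]

# The log file is full: back it up and clear it from both stores.
backup_and_clear(stores, backup)

# The second log goes to the first store again because space was freed.
second_log = [record(stores, 3, 1, f"log2-{i}") for i in range(2)]
print(first_log, second_log, len(backup))
```

The point of the sketch is the last step: after clearing, new records return to the preferred store instead of staying on the fallback store.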
The foregoing describes how the log is stored in the first memory 200 and the second memory 300. The data processing apparatus 100 can store other database data (such as sort files) in a similar way.
When the data processing apparatus 100 receives a data sorting request, it can sort the data in the database according to the request and return the sorted data.
When the data processing apparatus 100 sorts the data in the database, it produces a sort file that records the order of the sorted data. The sort file can also record the storage location of the data, the index number of the data, and so on.
After the data processing apparatus 100 generates the sort file, it can preferentially store the sort file in the first memory 200. As the sort file is stored, the free storage space in the first memory 200 shrinks. When the free storage space in the first memory 200 is insufficient (for example, less than the second threshold), the data processing apparatus 100 switches to the second memory 300 to continue storing the sort file.
After the data processing apparatus 100 finishes sorting the data in the database and determines the sorting result according to the sort file, it can delete the sort file.
That is, the storage order of the sort file is first memory 200 -> second memory 300: the sort file is stored in the first memory 200 first and in the second memory 300 second.
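The sort-file handling can be illustrated with a small external merge sort in Python. This is a sketch under stated assumptions, not the method of the application: two temporary directories stand in for the first and second memories, `first_budget` is a hypothetical byte budget for the first directory, and the run-file format (one integer per line) is invented for the example.

```python
import heapq
import os
import tempfile


def external_sort(values, chunk_size, first_dir, second_dir, first_budget, second_threshold):
    """Sort `values` via sorted run files; returns (sorted_list, directory_of_each_run)."""
    run_paths, first_used = [], 0
    for start in range(0, len(values), chunk_size):
        run = sorted(values[start:start + chunk_size])
        payload = "\n".join(map(str, run)) + "\n"
        # Prefer the first directory until its remaining budget drops below the threshold.
        use_first = first_budget - first_used >= second_threshold
        target_dir = first_dir if use_first else second_dir
        path = os.path.join(target_dir, f"run_{len(run_paths)}.txt")
        with open(path, "w") as fh:
            fh.write(payload)
        if use_first:
            first_used += len(payload)
        run_paths.append(path)

    files = [open(p) for p in run_paths]
    try:
        # Merge the sorted runs lazily, one line at a time per run file.
        merged = list(heapq.merge(*[(int(line) for line in fh) for fh in files]))
    finally:
        for fh in files:
            fh.close()
        # Sorting is complete, so the sort files are deleted.
        for p in run_paths:
            os.remove(p)
    return merged, [os.path.dirname(p) for p in run_paths]


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    data = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0]
    result, locations = external_sort(data, 4, first, second,
                                      first_budget=12, second_threshold=6)
    spilled = [loc == second for loc in locations]
    leftover = os.listdir(first) + os.listdir(second)

print(result, spilled, leftover)
```

With these assumed sizes, the first run fits in the first directory and the remaining two runs spill to the second; no run files remain after the merge.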
The above description uses a data sorting request as an example of the request received by the data processing apparatus 100. When the data processing apparatus 100 receives a data query request, it can also sort the data in the database and generate a sort file so that the data can be found more quickly. The sort file is stored in a way similar to that described above, which is not repeated here.
As can be seen from the above, the data processing apparatus 100 can store logs and sort files in the first memory 200 and the second memory 300. This approach makes data storage more flexible and effectively expands the storage space available for the data.
In a possible implementation, the first memory 200 can be the local disk of the database and the second memory 300 can be deployed in a cloud computing device system, that is, the second memory 300 is a cloud disk. Files or data related to the database can then be extended to cloud storage, and the files involved in the database can be distributed across the local disk and the cloud disk. This storage approach offers the scalability of a cloud disk and also effectively reduces maintenance costs.
It should be noted that the embodiments of this application use a system with two memories only as an example. In some scenarios the system can include more memories, which makes the location of data storage more flexible and can also effectively improve memory utilization.
In addition to the data storage methods described above, the embodiments of this application can also extend the memory (buffer pool) of the device on which the database is deployed, for example by setting up an extended memory (extended buffer pool) in the first memory.
To make the role of the extended memory clearer, the memory is described first. Data in the database is usually read and written at different frequencies. When the existing data processing device 100 manages the data in the database, the data with a higher read/write frequency can be placed in the memory. Because the data held in the memory is typically accessed frequently, the data processing device 100 can preferentially obtain data from the memory, which effectively improves data read/write speed and in turn the efficiency with which the data processing device 100 processes data.
However, memory space is usually limited, so not all frequently read and written data can be kept in the memory. The data stored in the memory may also be updated, which evicts some frequently read and written data and forces the data processing device 100 to load that data from the second memory 300. For this reason, an extended memory can be added.
The extended memory can store data whose read/write frequency is greater than a third threshold. To keep it that way, the data processing device 100 can periodically update the data in the extended memory: it stores data whose read/write frequency is greater than the third threshold in the extended memory, and it can also evict data stored in the extended memory whose read/write frequency is lower than the third threshold.
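The periodic update described above can be illustrated with a minimal Python sketch. It is not the implementation of this application: the class name `ExtendedBufferPool`, the `refresh` method, the `load_page` callback, and the use of a simple access counter as the "read/write frequency" are all assumptions made for illustration.

```python
from collections import Counter


class ExtendedBufferPool:
    """Toy extended memory keyed by page ID (hypothetical structure)."""

    def __init__(self, third_threshold):
        self.third_threshold = third_threshold
        self.pages = {}  # page_id -> page contents

    def refresh(self, access_counts, load_page):
        """One periodic update: admit hot pages, then evict cooled-down pages."""
        # Admit pages whose read/write frequency exceeds the third threshold.
        for page_id, count in access_counts.items():
            if count > self.third_threshold and page_id not in self.pages:
                self.pages[page_id] = load_page(page_id)
        # Evict pages whose read/write frequency fell below the third threshold.
        for page_id in list(self.pages):
            if access_counts.get(page_id, 0) < self.third_threshold:
                del self.pages[page_id]


# Example run with made-up access counts.
counts = Counter({"p1": 12, "p2": 3, "p3": 9})
pool = ExtendedBufferPool(third_threshold=5)
pool.refresh(counts, load_page=lambda pid: f"data-of-{pid}".encode())
hot_pages = sorted(pool.pages)

counts["p3"] = 1  # p3 cools down before the next update
pool.refresh(counts, load_page=lambda pid: b"")
after_cooling = sorted(pool.pages)
```

In this sketch a page whose frequency equals the threshold is neither admitted nor evicted, which mirrors the "greater than" and "lower than" wording of the description.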
The following describes how data is moved into the extended memory and how data in the extended memory is evicted:
(1) Moving data into the extended memory.
When moving data into the extended memory, the data processing device 100 can first locate the data to be moved in. The data processing device 100 can use the data in the memory as the data to be moved in.
Data stored in a memory is usually organized in fixed-size data units, such as data pages. Every data page stores the same amount of data and is assigned an identity (ID). The data processing device 100 can quickly locate a data page in the memory by its identifier.
In a possible implementation, a hash table (hash map) can be used in the first memory 200 and the second memory 300 to organize data pages and their identifiers, and the data processing device 100 can use the hash table to locate a data page by its identifier.
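As a rough illustration of locating fixed-size data pages through a hash table, the following Python sketch maps page identifiers to byte offsets in a flat buffer. The names `PageStore`, `put`, `get`, and the 16-byte `PAGE_SIZE` are hypothetical and chosen only to keep the example small.

```python
PAGE_SIZE = 16  # bytes per data page; deliberately tiny for the example


class PageStore:
    """Fixed-size pages in one buffer, indexed by a hash map (a Python dict)."""

    def __init__(self):
        self.buffer = bytearray()
        self.index = {}  # page_id -> byte offset of the page in the buffer

    def put(self, page_id, payload):
        if len(payload) > PAGE_SIZE:
            raise ValueError("payload does not fit in one data page")
        offset = len(self.buffer)
        # Pad so that every page occupies exactly PAGE_SIZE bytes.
        self.buffer += payload.ljust(PAGE_SIZE, b"\x00")
        self.index[page_id] = offset

    def get(self, page_id):
        # Average O(1) lookup of the offset, then a fixed-size slice.
        offset = self.index[page_id]
        return bytes(self.buffer[offset:offset + PAGE_SIZE])


store = PageStore()
store.put(7, b"alpha")
store.put(9, b"beta")
page_9 = store.get(9).rstrip(b"\x00")
```

Because every page has the same size, the hash table only needs to hold one offset per identifier.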
(2) Moving data out of the extended memory.
When evicting data from the extended memory, the data processing device 100 can first locate the data to be moved out. For example, the data processing device 100 can use a page replacement algorithm, such as least recently used (LRU), to determine a less frequently used data page in the extended memory; the data in that data page is the data to be moved out.
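LRU replacement, which the description names as one possible page replacement algorithm, can be sketched in Python with `collections.OrderedDict`. The class `LruExtendedPool` and its capacity-based trigger are assumptions for illustration; the description does not specify when eviction is triggered.

```python
from collections import OrderedDict


class LruExtendedPool:
    """Extended memory with least-recently-used eviction (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest access first, newest last

    def get(self, page_id):
        if page_id not in self.pages:
            return None
        self.pages.move_to_end(page_id)  # mark as most recently used
        return self.pages[page_id]

    def put(self, page_id, page):
        """Store a page and return the ID of the evicted page, if any."""
        evicted = None
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
        elif len(self.pages) >= self.capacity:
            # The first entry is the least recently used page.
            evicted, _ = self.pages.popitem(last=False)
        self.pages[page_id] = page
        return evicted


pool = LruExtendedPool(capacity=2)
pool.put("a", b"page-a")
pool.put("b", b"page-b")
pool.get("a")                      # "a" becomes the most recently used page
evicted = pool.put("c", b"page-c")  # "b" is now the least recently used page
```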
When the data processing device 100 needs to read data in the database, it can preferentially obtain the data from the memory or the extended memory. The extended memory increases the amount of data that can be held beyond what the memory alone can store, so more of the frequently read and written data can be kept in the first memory, which improves data read/write efficiency.
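The read path described above can be sketched as a tiered lookup. This is a simplified Python illustration: the three dictionaries stand in for the memory, the extended memory, and the second memory, and the function name `read_page` is hypothetical. Copying a page from the extended memory into the memory on a hit is one reading of the statement that data in the extended memory is read into the memory.

```python
def read_page(page_id, memory, extended_memory, second_memory):
    """Return (page, source), trying the fastest tier first."""
    if page_id in memory:
        return memory[page_id], "memory"
    if page_id in extended_memory:
        # Read the page from the extended memory into the memory.
        memory[page_id] = extended_memory[page_id]
        return memory[page_id], "extended memory"
    # Fall back to the second memory (for example, a cloud disk).
    return second_memory[page_id], "second memory"


memory = {1: b"page-1"}
extended_memory = {2: b"page-2"}
second_memory = {1: b"page-1", 2: b"page-2", 3: b"page-3"}

sources = [
    read_page(pid, memory, extended_memory, second_memory)[1]
    for pid in (1, 2, 3, 2)
]
```

The second read of page 2 is served from the memory because the first read copied it there.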
In a possible implementation, the data processing device 100 can also record the correspondence between the data pages in the extended memory and their identifiers and save that correspondence in non-volatile memory. When the first device restarts, the previously saved correspondence can be loaded directly to reorganize the extended memory, so that the data pages stored in the extended memory can be located quickly.
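One way to picture saving and restoring the correspondence is the Python sketch below, which writes a page-ID-to-offset map to a JSON file and reloads it. The JSON format, the file name, and the helper names `save_page_map` and `load_page_map` are assumptions; the description only requires that the correspondence be kept in non-volatile memory.

```python
import json
import os
import tempfile


def save_page_map(page_map, path):
    """Persist the page-ID -> offset map; write to a temp file, then rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(page_map, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes reach the storage device
    os.replace(tmp_path, path)  # swap in the new file in one step


def load_page_map(path):
    """Rebuild the in-memory map after a restart."""
    with open(path, encoding="utf-8") as f:
        # JSON object keys are strings, so convert them back to integers.
        return {int(page_id): offset for page_id, offset in json.load(f).items()}


with tempfile.TemporaryDirectory() as directory:
    map_path = os.path.join(directory, "ext_pool_map.json")
    save_page_map({7: 0, 9: 16}, map_path)
    restored = load_page_map(map_path)  # simulates the lookup after a restart
```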
Based on the same inventive concept as the method embodiment, an embodiment of this application further provides a data processing device configured to perform the method performed by the data processing device 100 in the method embodiment shown in FIG. 4. As shown in FIG. 5, the data processing device 500 includes a processing unit 501, a determining unit 502, and a switching unit 503; these units may be software modules. Specifically, in the data processing device 500, the units are connected to one another through a communication path.
The processing unit 501 is configured to record a first log of the database in the first memory according to data changes of the database. The processing unit 501 can perform steps 401 to 403 shown in FIG. 4.
The determining unit 502 is configured to determine that the storage space of the first memory is less than a first threshold. The determining unit 502 can perform the part of step 404 shown in FIG. 4 that determines that the storage space of the first memory is less than the first threshold.
The switching unit 503 is configured to, after the determining unit 502 determines that the storage space of the first memory is less than the first threshold, cause the processing unit 501 to switch to the second memory to record the first log of the database, where the first memory and the second memory are deployed in different devices. The switching unit 503 can perform the part of step 404 shown in FIG. 4 that switches to the second memory to record the first log.
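The cooperation of the three units can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the device itself: `Storage` models a memory as a byte budget, `LogWriter.record` folds the determining and switching steps into one method, and "storage space" is interpreted as the remaining free space of the first memory.

```python
class Storage:
    """Stand-in for a memory; the capacity and record list are hypothetical."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.records = []

    @property
    def free_space(self):
        return self.capacity - sum(len(r) for r in self.records)


class LogWriter:
    def __init__(self, first, second, first_threshold):
        self.first = first
        self.second = second
        self.first_threshold = first_threshold
        self.active = first  # logging starts in the first memory

    def record(self, entry):
        # Determining step: is the first memory's free space below the threshold?
        if self.active is self.first and self.first.free_space < self.first_threshold:
            # Switching step: continue the log in the second memory.
            self.active = self.second
        # Processing step: record the log entry in the active memory.
        self.active.records.append(entry)
        return self.active.name


local = Storage("first memory", capacity=32)
cloud = Storage("second memory", capacity=1024)
writer = LogWriter(local, cloud, first_threshold=16)
targets = [writer.record(b"x" * 10) for _ in range(3)]
```

With a 32-byte first memory and a 16-byte threshold, the third 10-byte entry finds only 12 bytes free and is therefore written to the second memory.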
In a possible implementation, the device further includes a sorting unit 504.
After receiving a database sorting request, the sorting unit 504 can sort the data in the database according to the received request; the processing unit 501 saves the sorting file generated during sorting in the first memory.
After the determining unit 502 determines that the storage space of the first memory is less than a second threshold, the switching unit 503 can cause the processing unit 501 to switch to the second memory to save the sorting file.
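The handling of sorting files can be pictured with a small external-sort sketch in Python. The description does not say how the sorting files are produced or merged, so the sorted runs, the merge with `heapq.merge`, and the names `external_sort`, `run_size`, and `first_capacity` are all assumptions; only the rule of switching to the second memory once the free space of the first memory drops below the second threshold comes from the text.

```python
import heapq


def external_sort(values, run_size, first_capacity, second_threshold):
    """Sort values via sorted runs kept in a first and a second memory."""
    first_runs, second_runs = [], []
    used = 0  # items currently held in the first memory
    for start in range(0, len(values), run_size):
        run = sorted(values[start:start + run_size])  # one sorting file
        if first_capacity - used < second_threshold:
            second_runs.append(run)  # first memory is nearly full
        else:
            first_runs.append(run)
            used += len(run)
    merged = list(heapq.merge(*first_runs, *second_runs))
    placement = (len(first_runs), len(second_runs))
    # Sorting is complete, so the sorting files can be deleted.
    first_runs.clear()
    second_runs.clear()
    return merged, placement


merged, placement = external_sort(
    [9, 4, 7, 1, 8, 2, 6, 3, 5],
    run_size=3,
    first_capacity=6,
    second_threshold=3,
)
```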
In a possible implementation, the first memory includes an extended memory and a memory.
The processing unit 501 can, based on the read/write frequency of the data in the database, save the data whose read/write frequency is greater than a third threshold in the extended memory, and read the data in the extended memory into the memory.
In a possible implementation, the processing unit 501 can also evict data in the extended memory whose read/write frequency is lower than the third threshold.
In a possible implementation, after switching to the second memory to record the first log of the database, the processing unit 501 can also clear the first log once the first log has been backed up, and then record a second log of the database in the first memory according to data changes of the database.
In a possible implementation, after saving a second sorting file generated during sorting in the second memory, the processing unit 501 can delete the sorting file once the sorting unit 504 has finished sorting the data in the database according to the sorting file.
In a possible implementation, the processing unit 501 can also record the correspondence between data pages and data identifiers in the extended memory and, after the device where the first memory is located restarts, reorganize the extended memory according to the correspondence.
The division into modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processor, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module can be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and is sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a terminal device (which may be a personal computer, a mobile phone, a network device, or the like) or a processor to perform all or some of the steps of the methods in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
FIG. 6 shows a computing device 600. The computing device 600 includes a bus 601, a processor 602, a communication interface 603, and a memory 604. The processor 602, the memory 604, and the communication interface 603 communicate with one another through the bus 601.
The processor 602 may be a central processing unit (CPU). The memory 604 may include a volatile memory, such as a random access memory (RAM). The memory 604 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, an HDD, or an SSD. The memory stores executable code, and the processor 602 executes the executable code to perform the aforementioned data storage method (the method shown in FIG. 4). The memory 604 may also include an operating system and other software modules required by running processes. The operating system may be Linux™, UNIX™, Windows™, or the like.
Specifically, the memory 604 stores the modules of the aforementioned data processing device 500. In addition to these modules, the memory 604 may also include an operating system and other software modules required by running processes. The operating system may be Linux™, UNIX™, Windows™, or the like.
This application further provides a computing device system that includes at least one computing device 700 as shown in FIG. 7. The computing device 700 includes a bus 701, a processor 702, a communication interface 703, and a memory 704. The processor 702, the memory 704, and the communication interface 703 communicate with one another through the bus 701. The computing devices 700 in the computing device system communicate with one another through a communication path.
The processor 702 may be a CPU. The memory 704 may include a volatile memory, such as a random access memory. The memory 704 may also include a non-volatile memory, such as a read-only memory, a flash memory, an HDD, or an SSD. The memory 704 stores executable code, and the processor 702 executes the executable code to perform any part or all of the aforementioned data storage method. The memory may also include an operating system and other software modules required by running processes. The operating system may be Linux™, UNIX™, Windows™, or the like.
Specifically, the memory 704 stores any one or more of the modules of the aforementioned data processing device 500. In addition to these modules, the memory 704 may also include an operating system and other software modules required by running processes. The operating system may be Linux™, UNIX™, Windows™, or the like.
The computing devices 700 in the computing device system establish communication with one another through a communication network, and each computing device runs any one or more of the units of the data processing device 500. The computing devices 700 jointly perform the aforementioned data storage operations.
The descriptions of the processes corresponding to the above figures each have their own emphasis. For parts not described in detail in one process, refer to the related descriptions of the other processes.
The above embodiments can be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product for data storage includes one or more computer program instructions for data storage. When these computer program instructions are loaded and executed on a computer, the data storage processes or functions described in the embodiments of the present invention are produced in whole or in part.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium includes a readable storage medium that stores the computer program instructions for data storage. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, an SSD).

Claims (16)

  1. A data storage method for a database, characterized in that the method comprises:
    recording a first log of the database in a first memory according to data changes of the database; and
    after determining that the storage space of the first memory is less than a first threshold, switching to a second memory to record the first log of the database, wherein the first memory and the second memory are deployed in different devices.
  2. The method according to claim 1, characterized in that the method further comprises:
    sorting the data in the database according to a received database sorting request;
    saving the sorting file generated during sorting in the first memory; and
    after determining that the storage space of the first memory is less than a second threshold, switching to the second memory to continue saving the sorting file.
  3. The method according to claim 1 or 2, characterized in that the first memory comprises an extended memory and a memory, and the method further comprises:
    based on the read/write frequency of the data in the database, saving the data in the database whose read/write frequency is greater than a third threshold in the extended memory; and
    reading the data in the extended memory into the memory.
  4. The method according to claim 3, characterized in that the method further comprises:
    evicting data in the extended memory whose read/write frequency is lower than the third threshold.
  5. The method according to any one of claims 1 to 4, characterized in that, after the switching to the second memory to record the first log of the database, the method further comprises:
    clearing the first log after backup of the first log is completed; and
    recording a second log of the database in the first memory according to data changes of the database.
  6. The method according to any one of claims 1 to 5, characterized in that, after a second sorting file generated during sorting is saved in the second memory, the method further comprises:
    deleting the sorting file after sorting of the data in the database according to the sorting file is completed.
  7. The method according to claim 3 or 4, characterized in that the method further comprises:
    recording the correspondence between data pages and data identifiers in the extended memory; and
    after the device where the first memory is located restarts, reorganizing the extended memory according to the correspondence.
  8. A data processing device, characterized in that the device comprises a processing unit, a determining unit, and a switching unit, wherein:
    the processing unit is configured to record a first log of a database in a first memory according to data changes of the database;
    the determining unit is configured to determine that the storage space of the first memory is less than a first threshold; and
    the switching unit is configured to, after the determining unit determines that the storage space of the first memory is less than the first threshold, cause the processing unit to switch to a second memory to record the first log of the database, wherein the first memory and the second memory are deployed in different devices.
  9. The device according to claim 8, characterized in that the device further comprises a sorting unit, wherein:
    the sorting unit is configured to sort the data in the database according to a received database sorting request;
    the processing unit is configured to save the sorting file generated during sorting in the first memory;
    the determining unit is further configured to determine that the storage space of the first memory is less than a second threshold; and
    the switching unit is further configured to, after the determining unit determines that the storage space of the first memory is less than the second threshold, cause the processing unit to switch to the second memory to save the sorting file.
  10. The device according to claim 8 or 9, characterized in that the first memory comprises an extended memory and a memory; and
    the processing unit is further configured to, based on the read/write frequency of the data in the database, save the data in the database whose read/write frequency is greater than a third threshold in the extended memory, and read the data in the extended memory into the memory.
  11. The device according to claim 10, characterized in that
    the processing unit is further configured to evict data in the extended memory whose read/write frequency is lower than the third threshold.
  12. The device according to any one of claims 8 to 11, characterized in that, after switching to the second memory to record the first log of the database, the processing unit is further configured to:
    clear the first log after backup of the first log is completed; and
    record a second log of the database in the first memory according to data changes of the database.
  13. The device according to any one of claims 9 to 12, characterized in that, after saving a second sorting file generated during sorting in the second memory, the processing unit is further configured to:
    delete the sorting file after sorting of the data in the database according to the sorting file is completed.
  14. The device according to claim 10 or 11, characterized in that the processing unit is further configured to record the correspondence between data pages and data identifiers in the extended memory; and
    after the device where the first memory is located restarts, reorganize the extended memory according to the correspondence.
  15. A computing device, characterized in that the computing device comprises a processor and a memory, wherein:
    the memory is configured to store computer program instructions; and
    the processor invokes the computer program instructions in the memory to perform the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer to perform the method according to any one of claims 1 to 7.
PCT/CN2021/075501 2020-03-20 2021-02-05 Data storage method and apparatus for database WO2021184996A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010203414.XA CN113495883A (en) 2020-03-20 2020-03-20 Data storage method and device for database
CN202010203414.X 2020-03-20

Publications (1)

Publication Number Publication Date
WO2021184996A1 true WO2021184996A1 (en) 2021-09-23

Family

ID=77769685

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/075501 WO2021184996A1 (en) 2020-03-20 2021-02-05 Data storage method and apparatus for database

Country Status (2)

Country Link
CN (1) CN113495883A (en)
WO (1) WO2021184996A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115391098A (en) * 2022-08-25 2022-11-25 北京有竹居网络技术有限公司 Big data analysis method and device, edge node and cloud server
CN115658328A (en) * 2022-12-07 2023-01-31 摩尔线程智能科技(北京)有限责任公司 Device and method for managing storage space, computing equipment and chip
CN116744168A (en) * 2022-09-01 2023-09-12 荣耀终端有限公司 Log storage method and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080148270A1 (en) * 2006-12-15 2008-06-19 International Business Machines Corporation Method and implementation for storage provisioning planning
CN103838521A (en) * 2014-02-28 2014-06-04 华为技术有限公司 Data processing method and data processing device
CN104539669A (en) * 2014-12-17 2015-04-22 中国电子科技集团公司第十五研究所 Data synchronization method based on mobile terminal
US9465899B2 (en) * 2013-03-15 2016-10-11 Freescale Semiconductor, Inc. Method for provisioning decoupling capacitance in an integrated circuit
CN110413684A (en) * 2018-04-25 2019-11-05 武汉海康存储技术有限公司 A kind of database synchronization method, apparatus and system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115391098A (en) * 2022-08-25 2022-11-25 北京有竹居网络技术有限公司 Big data analysis method and device, edge node and cloud server
CN116744168A (en) * 2022-09-01 2023-09-12 荣耀终端有限公司 Log storage method and related device
CN116744168B (en) * 2022-09-01 2024-05-14 荣耀终端有限公司 Log storage method and related device
CN115658328A (en) * 2022-12-07 2023-01-31 摩尔线程智能科技(北京)有限责任公司 Device and method for managing storage space, computing equipment and chip
CN115658328B (en) * 2022-12-07 2023-10-03 摩尔线程智能科技(北京)有限责任公司 Device and method for managing storage space, computing device and chip

Also Published As

Publication number Publication date
CN113495883A (en) 2021-10-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21770494

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21770494

Country of ref document: EP

Kind code of ref document: A1