WO2024066597A1 - Data storage method and apparatus - Google Patents

Data storage method and apparatus

Info

Publication number
WO2024066597A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
index engine
data table
index
amplification
Prior art date
Application number
PCT/CN2023/104709
Other languages
French (fr)
Chinese (zh)
Inventor
王顺卓
Original Assignee
华为云计算技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为云计算技术有限公司
Publication of WO2024066597A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Definitions

  • the present application relates to the field of data processing technology, and in particular to a data storage method and device.
  • a database is a repository that stores a large amount of data.
  • the data in the database needs to be organized according to a certain structure, that is, an index engine is used to store the data.
  • the commonly used index engines for databases include the log-structured merge tree (LSM tree) and the B+ tree.
  • Different access operations to the database bring different load characteristics to the database.
  • Different index engines adapt to different load characteristics. For example, the log-structured merge tree structure is better suited to loads with a high write frequency, and the B+ tree structure is better suited to loads with a high read frequency.
  • the embodiments of the present application provide a data storage method and device, which allow a user to configure the index engine of a data table, or adjust the index engine of the data table according to changes in the load characteristics of the data table.
  • a data storage method is provided, which is applied to a control device in a storage system, wherein the storage system also includes a storage device, which stores a user's data table; the method includes: providing a configuration interface, the configuration interface being used for the user to configure the load characteristics of the data table as a read-intensive load or a write-intensive load; when the configuration interface indicates that the user configures the load characteristics of the data table as a read-intensive load, instructing the storage device to store the data in the data table according to a first index engine; the first index engine matches the read-intensive load; when the configuration interface indicates that the user configures the load characteristics of the data table as a write-intensive load, instructing the storage device to store the data in the data table according to a second index engine; the second index engine matches the write-intensive load.
  • users can configure the index engine of their data table.
  • users can adjust the index engine of the data table according to changes in the business served by the data table, so that the index engine matches the load characteristics caused by the changed business, thereby improving the access performance of the data table.
  • the first index engine includes at least a B+ tree structure
  • the second index engine includes at least a log-structured merge tree (LSM tree) structure
  • the B+ tree structure belongs to a read-friendly index engine
  • the first index engine includes the B+ tree structure, which can improve the matching degree between the first index engine and read-intensive loads.
  • the log-structured merge tree structure belongs to a write-friendly index engine
  • the second index engine includes the log-structured merge tree structure, which can improve the matching degree between the second index engine and write-intensive loads.
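The configuration-driven choice described above can be sketched as a simple lookup. This is a minimal illustration, not the applicant's implementation: the names `LoadCharacteristic`, `IndexEngine`, `ENGINE_FOR_LOAD`, and `select_index_engine` are hypothetical, and the sketch assumes that the first index engine is a pure B+ tree and the second a pure LSM tree, whereas the text only says that each engine "includes at least" those structures.

```python
from enum import Enum


class LoadCharacteristic(Enum):
    READ_INTENSIVE = "read-intensive"
    WRITE_INTENSIVE = "write-intensive"


class IndexEngine(Enum):
    B_PLUS_TREE = "B+ tree"   # read-friendly; stands in for the first index engine
    LSM_TREE = "LSM tree"     # write-friendly; stands in for the second index engine


# Hypothetical lookup table: which engine matches which configured load.
ENGINE_FOR_LOAD = {
    LoadCharacteristic.READ_INTENSIVE: IndexEngine.B_PLUS_TREE,
    LoadCharacteristic.WRITE_INTENSIVE: IndexEngine.LSM_TREE,
}


def select_index_engine(load: LoadCharacteristic) -> IndexEngine:
    """Return the index engine that matches the user-configured load."""
    return ENGINE_FOR_LOAD[load]
```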
  • before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine; instructing the storage device to store the data in the data table according to the first index engine includes: instructing the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, the structure of the third index engine being between the structure of the first index engine and the structure of the second index engine; and then instructing the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
  • the third index engine can be called a hybrid index engine.
  • the index engine of the data table can be switched from a write-friendly index engine to a hybrid index engine, and then from the hybrid index engine to a read-friendly index engine, thereby realizing a gradual switching of the index engine and avoiding the index engine change overhead caused by switching between index engines with large structural differences.
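The gradual switch can be pictured as walking an ordered list of engines one hop at a time. The following is a sketch under stated assumptions: the engine names and the `migration_steps` helper are hypothetical, a single hybrid engine sits between the two extremes, and the reverse direction (read-friendly to write-friendly) is an assumption that mirrors the direction the text describes.

```python
# Engines ordered from write-friendly to read-friendly (illustrative names).
ENGINE_ORDER = ["lsm_tree", "hybrid", "b_plus_tree"]


def migration_steps(current: str, target: str) -> list[tuple[str, str]]:
    """List the (from, to) hops needed to move between engines one step at a time."""
    i = ENGINE_ORDER.index(current)
    j = ENGINE_ORDER.index(target)
    step = 1 if j >= i else -1
    # Visit every engine between current and target, inclusive.
    names = [ENGINE_ORDER[k] for k in range(i, j + step, step)]
    # Pair each engine with its successor to obtain the individual hops.
    return list(zip(names, names[1:]))
```

Each returned pair would correspond to one migration instruction sent to the storage device, so no single migration spans two structurally distant engines.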
  • the data table has a local secondary index
  • the configuration interface is also used for the user to configure the load characteristics of the local secondary index as a read-intensive load or a write-intensive load; when the configuration interface indicates that the user configures the load characteristics of the local secondary index as a read-intensive load, the storage device is instructed to store the data under the local secondary index according to the first index engine; when the configuration interface indicates that the user configures the load characteristics of the local secondary index as a write-intensive load, the storage device is instructed to store the data under the local secondary index according to the second index engine.
  • users can configure the index engine of the local secondary index in their data table.
  • users can adjust the index engine of the local secondary index according to changes in the business served by the local secondary index, so that the index engine matches the load characteristics caused by the changed business, thereby improving the access performance of the local secondary index.
  • the data in the data table is stored in the form of key-value pairs.
  • the method can be applied to a key-value database, and can improve the access performance of the key-value database.
  • a data storage method is provided, which is applied to a control device in a storage system, wherein the storage system also includes a storage device, which stores a data table; the method includes: when the index engine of the data table is a fourth index engine, the control device monitors the operation amplification of the data table, the operation amplification including read amplification of reading data from the data table or write amplification of writing data to the data table; when the operation amplification includes read amplification and the read amplification is greater than a first threshold value, instructing the storage device to store the data in the data table according to a fifth index engine; wherein the matching degree between the fifth index engine and the read-intensive load is greater than the matching degree between the fourth index engine and the read-intensive load; when the operation amplification includes write amplification and the write amplification is greater than a second threshold value, instructing the storage device to store the data in the data table according to a sixth index engine; wherein the matching degree between the sixth index engine and the write-intensive load is greater than the matching degree between the fourth index engine and the write-intensive load.
  • this method monitors the read amplification or write amplification of the data table and, based on it, determines whether the current index engine matches the load characteristics of the data table and in which direction the index engine should change. The index engine can then be adjusted toward matching the load characteristics of the data table, thereby improving the access performance of the data table.
  • operation amplification includes both read amplification and write amplification; when operation amplification includes read amplification and the read amplification is greater than a first threshold, instructing the storage device to store the data in the data table according to the fifth index engine includes: when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instructing the storage device to store the data in the data table according to the fifth index engine.
  • the index engine of the data table can be adjusted towards a read-friendly index engine, thereby balancing read amplification and write amplification and improving the overall access performance of the data table.
  • operation amplification includes both read amplification and write amplification; when operation amplification includes write amplification and the write amplification is greater than a second threshold, instructing the storage device to store the data in the data table according to the sixth index engine includes: when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instructing the storage device to store the data in the data table according to the sixth index engine.
  • the index engine of the data table can be adjusted towards a write-friendly index engine, thereby balancing read amplification and write amplification and improving the overall access performance of the data table.
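The threshold logic of the monitoring-based method can be sketched as follows. The `Thresholds` fields and the `decide_adjustment` function are hypothetical names; the text does not give concrete threshold values or say how a tie is resolved, so checking the read-side condition first is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    read_high: float   # first threshold: read amplification above this is too high
    write_high: float  # second threshold: write amplification above this is too high
    write_low: float   # third threshold: write amplification must be below this
    read_low: float    # fourth threshold: read amplification must be below this


def decide_adjustment(read_amp: float, write_amp: float, t: Thresholds) -> Optional[str]:
    """Return the direction in which the index engine should move, if any."""
    if read_amp > t.read_high and write_amp < t.write_low:
        return "more_read_friendly"   # move toward the fifth index engine
    if write_amp > t.write_high and read_amp < t.read_low:
        return "more_write_friendly"  # move toward the sixth index engine
    return None  # both amplifications acceptable, or both high: keep the engine
```

Requiring the opposite amplification to be low before switching keeps the engine from being moved toward one side when the other side is already under pressure.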
  • the fifth index engine includes at least a B+ tree structure
  • the sixth index engine includes at least a log-structured merge tree (LSM tree) structure.
  • the B+ tree structure belongs to a read-friendly index engine
  • the fifth index engine includes the B+ tree structure, which can improve the matching degree between the fifth index engine and read-intensive loads.
  • the log-structured merge tree structure belongs to a write-friendly index engine
  • the sixth index engine includes the log-structured merge tree structure, which can improve the matching degree between the sixth index engine and write-intensive loads.
  • the data table has a local secondary index (LSI), and the operation amplification is the amplification generated by operating the data under the local secondary index; instructing the storage device to store the data in the data table according to the fifth index engine includes: instructing the storage device to store the data under the local secondary index according to the fifth index engine; or, instructing the storage device to store the data in the data table according to the sixth index engine includes: instructing the storage device to store the data under the local secondary index according to the sixth index engine.
  • the index engine of the local secondary index in the data table can be adjusted according to the operation amplification of the index engine of the local secondary index so that the index engine matches the load characteristics of the local secondary index, thereby improving the access performance of the local secondary index in the data table.
  • a data storage device is provided, which is configured in a control device in a storage system.
  • the storage system also includes a storage device, which stores a user's data table.
  • the data storage device includes: a providing module, which is used to provide a configuration interface, and the configuration interface is used for the user to configure the load characteristics of the data table as a read-intensive load or a write-intensive load; an indication module, which is used to instruct the storage device to store the data in the data table according to a first index engine when the configuration interface indicates that the user configures the load characteristics of the data table as a read-intensive load; the first index engine matches the read-intensive load; the indication module is also used to instruct the storage device to store the data in the data table according to a second index engine when the configuration interface indicates that the user configures the load characteristics of the data table as a write-intensive load; the second index engine matches the write-intensive load.
  • the first index engine includes at least a B+ tree structure
  • the second index engine includes at least a log-structured merge tree (LSM tree) structure
  • before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine; the indication module is also used to: instruct the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, the structure of the third index engine being between the structure of the first index engine and the structure of the second index engine; and then instruct the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
  • the data table has a local secondary index
  • the configuration interface is also used for the user to configure the load characteristics of the local secondary index as a read-intensive load or a write-intensive load
  • the indication module is also used to: when the configuration interface indicates that the user configures the load characteristics of the local secondary index as a read-intensive load, instruct the storage device to store the data under the local secondary index according to the first index engine; when the configuration interface indicates that the user configures the load characteristics of the local secondary index as a write-intensive load, instruct the storage device to store the data under the local secondary index according to the second index engine.
  • a data storage device is provided, which is configured in a control device of a storage system, wherein the storage system further comprises a storage device
  • the storage device stores a data table
  • the data storage device includes: a monitoring module, which is used to control the device to monitor the operation amplification of the data table when the index engine of the data table is a fourth index engine, the operation amplification including read amplification of reading data from the data table or write amplification of writing data to the data table; an indication module, which is used to instruct the storage device to store the data in the data table according to a fifth index engine when the operation amplification includes read amplification and the read amplification is greater than a first threshold value; wherein the matching degree between the fifth index engine and the read-intensive load is greater than the matching degree between the fourth index engine and the read-intensive load; the indication module is also used to instruct the storage device to store the data in the data table according to a sixth index engine when the operation amplification includes write amplification and the write amplification and the write
  • the operation amplification includes both read amplification and write amplification; the indication module is used to: when the read amplification is greater than a first threshold and the write amplification is less than a third threshold, instruct the storage device to store the data in the data table according to the fifth index engine.
  • the operation amplification includes both read amplification and write amplification; the indication module is used to: when the write amplification is greater than the second threshold and the read amplification is less than the fourth threshold, instruct the storage device to store the data in the data table according to the sixth index engine.
  • the fifth index engine includes at least a B+ tree structure
  • the sixth index engine includes at least a log-structured merge tree (LSM tree) structure.
  • the data table has a local secondary index (LSI), and the operation amplification is the amplification generated by operating the data under the local secondary index; the indication module is used to: instruct the storage device to store the data under the local secondary index according to the fifth index engine; or, instruct the storage device to store the data under the local secondary index according to the sixth index engine.
  • a computing device cluster comprising at least one computing device, each computing device comprising a processor and a memory; the processor of the at least one computing device is used to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method provided in the first aspect or the method provided in the second aspect.
  • a computer program product comprising instructions is provided; when the instructions are run by a computing device cluster, the computing device cluster executes the method provided in the first aspect or the method provided in the second aspect.
  • a computer-readable storage medium comprising computer program instructions is provided; when the computer program instructions are executed by a computing device cluster, the computing device cluster executes the method provided in the first aspect or the method provided in the second aspect.
  • the data storage method and device provided in the embodiments of the present application allow the user to configure the index engine of the data table, or adjust the index engine of the data table according to the load characteristics of the data table, so that the index engine of the data table matches the load characteristics of the data table, thereby improving the access performance of the data table.
  • FIG1A is a schematic diagram of the structure of a log structure merge tree
  • FIG1B is a schematic diagram of the structure of a B+ tree
  • FIG2 is a schematic diagram of the structure of a storage system provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of the structure of an initial load characteristic configuration submodule provided in an embodiment of the present application.
  • FIG4 is a flow chart of a data storage solution provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of an index engine provided in an embodiment of the present application.
  • FIG6 is a flow chart of a data storage solution provided in an embodiment of the present application.
  • FIG7 is a flow chart of a data storage solution provided in an embodiment of the present application.
  • FIG8 is a flow chart of a data storage method provided in an embodiment of the present application.
  • FIG9 is a flow chart of a data storage method provided in an embodiment of the present application.
  • FIG10 is a schematic diagram of the structure of a data storage device provided in an embodiment of the present application.
  • FIG11 is a schematic diagram of the structure of a data storage device provided in an embodiment of the present application.
  • FIG12 is a schematic diagram of the structure of a computing device provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of the present application.
  • FIG14 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of the present application.
  • FIG15 is a schematic diagram of the structure of a computing device provided in an embodiment of the present application.
  • FIG16 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of the present application.
  • FIG17 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of the present application.
  • a key-value store stores data in the form of key-value pairs, where the key is the unique identifier of the data.
  • the value is the data content, which can range from a simple value to a complex composite object.
  • Key-value databases store data in a simple form, provide distributed processing capabilities, and have advantages such as fast response.
  • key-value databases are non-relational databases (not only SQL, NoSQL) and can handle large-scale data storage. Therefore, key-value databases have been widely used in computer systems, especially in the field of cloud storage.
  • the key-value database can serve multiple users.
  • the user can be a tenant of cloud storage.
  • the user can query data in the database through an index engine.
  • the index engine is a data storage structure for efficiently querying data.
  • the index engine can also be called a data storage structure.
  • the data table uses a local primary index (LPI) to establish a key-to-value mapping, and uses an index engine to store the key-to-value mapping relationship, thereby storing the data in the data table.
  • the key is used to identify data and can be the name of the data.
  • Value represents data and can be defined according to business needs, such as a student's grades in a subject or multiple subjects.
  • Using a local primary index to establish a key-to-value mapping means storing keys and values in accordance with the local primary index data storage method.
  • a data table may be composed of one or more data partition instances.
  • a partition instance may be used to store data in a specified key range. The data in the specified key range may be referred to as data in the partition instance.
  • a partition instance may establish a mapping from key to value in the corresponding key range through a local primary index to store data in the partition instance.
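Routing a key to the partition instance that owns its key range can be sketched with a sorted list of range boundaries. The boundary values, the `partition_for`, `put`, and `get` helpers, and the use of a plain dictionary as each partition's local primary index are all assumptions made for illustration.

```python
import bisect

# Hypothetical layout: partition i holds keys k with
# LOWER_BOUNDS[i] <= k < LOWER_BOUNDS[i + 1]; the last partition is open-ended.
LOWER_BOUNDS = ["", "g", "n", "t"]

# One local primary index (key -> value) per partition instance.
partitions = [dict() for _ in LOWER_BOUNDS]


def partition_for(key: str) -> int:
    """Return the index of the partition instance whose key range contains key."""
    return bisect.bisect_right(LOWER_BOUNDS, key) - 1


def put(key: str, value) -> None:
    partitions[partition_for(key)][key] = value


def get(key: str):
    return partitions[partition_for(key)].get(key)
```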
  • the index engines used by the local primary indexes of the multiple data partition instances may be the same, that is, multiple data partition instances in the same data table may use the same index engine to store data. Therefore, the index engine of the local primary index of the data partition instance in the data table may be referred to as the index engine of the data table.
  • a value may include multiple pieces of information (for example, a value is an object of complex composite information).
  • a local secondary index (LSI) may be constructed to establish a mapping relationship between a certain piece of information among the multiple pieces of information and a key.
  • the piece of information may be referred to as a sub-value.
  • the local secondary index also uses an index engine to store the mapping relationship between the sub-value and the key.
  • the key is the student's student ID
  • the value includes the student's scores in multiple subjects such as Chinese, mathematics, English, physics, and chemistry.
  • the local primary index constructs a mapping from student ID to scores in multiple subjects.
  • the local secondary index constructs a mapping from Chinese scores to student IDs, that is, the sub-value is the Chinese score. Suppose the user needs to query the scores of each subject for students whose Chinese scores are greater than 90 points.
  • the query efficiency can be improved through the local secondary index.
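The student-score example can be sketched with two in-memory structures. This is only an illustration of the idea: the names `primary`, `secondary_chinese`, `put`, and `students_with_chinese_above` are hypothetical, student IDs are assumed to be integers, and a sorted Python list stands in for whatever index engine the secondary index actually uses.

```python
import bisect

# Local primary index: student ID (key) -> scores in several subjects (value).
primary = {}
# Local secondary index: sorted (Chinese score, student ID) pairs.
secondary_chinese = []


def put(student_id: int, scores: dict) -> None:
    """Insert or update a student and keep the secondary index in sync."""
    old = primary.get(student_id)
    if old is not None:
        secondary_chinese.remove((old["chinese"], student_id))
    primary[student_id] = scores
    bisect.insort(secondary_chinese, (scores["chinese"], student_id))


def students_with_chinese_above(threshold: float) -> dict:
    """Return {student ID: all scores} for students scoring above threshold."""
    # (threshold, inf) sorts after every entry whose score equals threshold.
    start = bisect.bisect_right(secondary_chinese, (threshold, float("inf")))
    return {sid: primary[sid] for _, sid in secondary_chinese[start:]}
```

Without the secondary index, the same query would have to scan every value in the primary index; with it, the matching student IDs are found by one binary search followed by a range read.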
  • one or more local secondary indexes can be constructed.
  • the corresponding sub-values in different local secondary indexes can be different or partially the same.
  • the local primary index may be referred to as the primary index
  • the local secondary index may be referred to as the secondary index.
  • when no special distinction is made between the local primary index and the local secondary index, they may be referred to as indexes.
  • Access operations to a database may include read operations and write operations, and accordingly, load characteristics may include read-intensive load, write-intensive load, and mixed load.
  • Mixed load is a load characteristic between read-intensive load and write-intensive load, which indicates that the frequency of read operations and write operations is not much different, or in other words, the read operations and write operations are relatively balanced.
  • the load characteristic is specifically a read-intensive load.
  • Threshold A1 is a preset value. Exemplarily, threshold A1 is greater than or equal to 4. In one example, threshold A1 is 9.
  • the load characteristic is a write-intensive load.
  • Threshold A2 is a preset value. Exemplarily, threshold A2 is greater than or equal to 4. In one example, threshold A2 is 9.
  • the load characteristic is a mixed load.
  • an access operation on an index, that is, an access operation performed in the index engine used by the index, brings load characteristics to the index.
  • Different access operations cause the index to carry different load characteristics.
  • the load characteristics brought by the access operation to the local primary index are called the primary index load characteristics
  • the load characteristics brought by the access operation to the local secondary index are called the secondary index load characteristics.
  • the load characteristics of a data table depend on the type of business it serves and the type of user behavior (such as user access to the data table and querying data in the data table using local secondary indexes).
  • during business peak hours (such as 8 pm to 10 pm), data tables are read and written frequently, and the load characteristic at this time is a mixed load.
  • during off-peak hours (such as midnight to 6 am), users dump and back up the data in the data table, so the data in the data table is read frequently; at this time, the load characteristic of the data table is a read-intensive load.
  • their load characteristics are usually write-intensive.
  • Local secondary indexes provide fast access to data in a data table. Different local secondary indexes may record different sub-values in a data table. Therefore, the data in the data table can be queried through different sub-values. Therefore, the load characteristics of different local secondary indexes in a data table depend on the user's query behavior. For example, for a data table containing two local secondary indexes, the local secondary indexes are recorded as LSI 1 and LSI 2 respectively. In a certain period of time, if the user only accesses the data table with the sub-value recorded in LSI 1 as the query condition, LSI 1 is a mixed load and LSI 2 is a write-intensive load. Conversely, LSI 1 is a write-intensive load and LSI 2 is a mixed load.
  • both the load characteristics of the primary index and the load characteristics of the secondary index usually change dynamically.
  • some index engines are suitable for read operations: they keep read amplification small and read latency low, giving a good read experience. This type of index engine can be called a read-friendly index engine.
  • some index engines are suitable for write operations: they keep write amplification small and write latency low, giving a good write experience. This type of index engine can be called a write-friendly index engine.
  • the write-friendly index engine writes data by appending, which gives fast writes, low write latency, and high write performance.
  • the append writing method delays the update of data in the data table, which results in multiple historical versions of key values being retained in the data table, increasing read amplification and causing high read latency.
  • the log-structured merge tree (LSM tree) structure is a typical write-friendly index engine.
  • the LSM tree stores data in the form of logs.
  • the LSM tree includes a data area and an index area (manifest).
  • the data area is the area in the LSM tree where data is stored.
  • the data area can be located on a hard disk to achieve persistent storage of data.
  • the data area includes multiple storage layers from top to bottom (C1 layer, C2 layer, C3 layer, C4 layer, C5 layer, and C6 layer as shown in FIG1A ).
  • the storage space of the upper layers in the multiple storage layers is smaller, and the storage space of the lower layers is larger.
  • when writing data, the data is first written to the top layer, that is, the C1 layer.
  • when the amount of data in the C1 layer reaches the preset value D1, the data of the C1 layer and of the next layer, that is, the C2 layer, is merged, and the merged data is transferred to the C2 layer.
  • the index area can be used to accelerate the location of which storage layer a certain key value is in.
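The write path into the C1 layer and the merge into the C2 layer can be sketched with a toy two-layer model. This is a deliberately simplified illustration, not a real LSM tree: the `TinyLSM` class is hypothetical, each layer is an in-memory dictionary rather than on-disk sorted files, D1 is counted in entries, and cascading merges into C3 and deeper layers are omitted.

```python
class TinyLSM:
    """Toy two-layer model of the C1 -> C2 merge described in the text."""

    def __init__(self, d1: int = 2):
        self.c1 = {}   # top layer: receives every write first
        self.c2 = {}   # next layer: receives merged data
        self.d1 = d1   # preset value D1, counted here in number of entries

    def put(self, key, value) -> None:
        self.c1[key] = value
        if len(self.c1) >= self.d1:
            self._merge_c1_into_c2()

    def _merge_c1_into_c2(self) -> None:
        # Newer versions in C1 overwrite older versions of the same key in C2.
        self.c2.update(self.c1)
        self.c1.clear()

    def get(self, key):
        """Return (value, number of layers probed), searching top layer first."""
        if key in self.c1:
            return self.c1[key], 1
        if key in self.c2:
            return self.c2[key], 2
        return None, 2
```

The second element returned by `get` hints at why reads cost more in this structure: a key that is not in the top layer forces a probe of every lower layer, which is the kind of lookup the index area is meant to speed up.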
  • the read-friendly index engine uses a timely update method to store data.
  • in a read-friendly index engine, only a single version of each key value is retained. Therefore, the read-friendly index engine is conducive to data reading and is more suitable for read operations.
  • the B+ tree structure is a typical read-friendly index engine.
  • the data area consists of leaf nodes, and key values are stored in the leaf nodes in an orderly manner.
  • the B+ tree structure has an index area, which is used to accelerate the location of which leaf node a certain key value is in.
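How an index area narrows a lookup to one ordered leaf node can be sketched as follows. This is a flattened approximation rather than a real B+ tree: the sample `leaves`, the one-level `index_area`, and the `find_leaf` and `lookup` helpers are assumptions, and node splitting, multi-level internal nodes, and leaf links are left out.

```python
import bisect

# Leaf nodes: each holds sorted (key, value) pairs; keys are ordered across leaves.
leaves = [
    [(1, "a"), (3, "b")],
    [(5, "c"), (7, "d")],
    [(9, "e"), (11, "f")],
]

# Index area, simplified to a single level: the smallest key of each leaf.
index_area = [leaf[0][0] for leaf in leaves]


def find_leaf(key: int) -> int:
    """Return the position of the only leaf that could contain key."""
    return max(bisect.bisect_right(index_area, key) - 1, 0)


def lookup(key: int):
    """Return the value stored for key, or None if the key is absent."""
    leaf = leaves[find_leaf(key)]
    keys = [k for k, _ in leaf]
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return leaf[pos][1]
    return None
```

Because each key lives in exactly one leaf, a read touches one leaf only, which is the property that makes this kind of structure read-friendly.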
  • the present application provides a data storage solution that can provide an index engine configuration interface so that users can configure the index engine of the data table at any time. Then, the data in the data table can be stored according to the data storage structure configured by the user.
  • the load characteristics of the data table are related to the business served by the data table. For example, a data table serving a data backup business usually has a write-intensive load. However, when that data table is used to restore data for another data table that has lost data, a large amount of data needs to be read from it, and its load characteristic changes to a read-intensive load.
  • the user can set or predict what kind of business the data table serves at what time. In this way, the user can configure an index engine that matches the load characteristics brought by the switched business when or before the data table switches the business it serves, so that the data table can provide better access performance and reduce read or write delays.
  • the present application embodiment provides a storage system 100 that can be used to implement a data storage solution.
  • the storage system 100 includes a control device 110 and a storage device 120 .
  • the storage device 120 is a device or equipment for persistently storing data.
  • the storage device 120 may be a hard disk, such as a solid state disk (SSD).
  • the storage device 120 may be other forms of devices or equipment with a persistent data storage function.
  • the embodiments of the present application do not specifically limit the specific implementation form of the storage device 120.
  • the storage device 120 may include multiple data tables such as a data table T1 and a data table T2.
  • the data table T1 and the data table T2 may belong to a database.
  • the database may specifically be a key-value database.
  • some of the multiple data tables belong to one user, and some of the data tables may belong to different users.
  • a data table, such as the data table T1 may include a local primary index.
  • the data table T1 may also include a local secondary index B1 and a local secondary index B2.
  • the control module 110 is a module or component with data processing capabilities.
  • the control module 110 may be a physical device, such as a server or a processor.
  • the control module 110 may be a virtual device, such as a virtual machine (VM) or a container.
  • the embodiment of the present application does not specifically limit the specific implementation form of the control module 110.
  • the control module 110 is used to control or adjust the index engine of the data table in the storage device 120.
  • the control module 110 may include a configuration module 111 and a processing module 112.
  • the configuration module 111 may be used for the user to configure the load characteristics of the data table.
  • the processing module 112 may configure an index engine that matches the load characteristics based on the load characteristics configured by the user, and instruct the storage device 120 to store data according to the index engine. More specifically, the configuration module 111 may provide a configuration interface to the user so that the user can input the load characteristics through the configuration interface.
  • the processing module 112 may obtain the load characteristics input by the user from the configuration module 111, and configure an index engine that matches the load characteristics, thereby instructing the storage device 120 to store the data in the user's data table according to the index engine.
  • the configuration module 111 may include an initial load characteristic configuration submodule 111A
  • the processing module 112 may include an index engine initialization submodule 112A.
  • the initial load characteristic configuration submodule 111A may provide an initial load characteristic configuration interface to the user when the user creates a data table in the database 121. The user may input the initial load characteristic through the configuration interface.
  • the index engine initialization submodule 112A may configure an index engine that matches the initial load characteristic based on the initial load characteristic. Afterwards, the index engine initialization submodule 112A may instruct the storage device 120 to use the configured index engine as the initialization index engine for the data table newly created by the user.
  • the initial load characteristic configuration submodule 111A may include a main index initial load characteristic configuration submodule 111A1.
  • the main index initial load characteristic configuration submodule 111A1 may provide a main index initial load characteristic configuration interface to the user when the user creates a data table in the database 121. The user may input the main index initial load characteristic through the configuration interface.
  • the index engine initialization submodule 112A may configure an index engine that matches the main index initial load characteristic based on the main index initial load characteristic. Afterwards, the index engine initialization submodule 112A may instruct the storage device 120 to use the configured index engine as the initial main index of the data table newly created by the user.
  • the initial load characteristic configuration submodule 111A may include a secondary index initial load characteristic configuration submodule 111A2.
  • the secondary index initial load characteristic configuration submodule 111A2 may provide a secondary index initial load characteristic configuration interface to the user when the user creates a secondary index in the data table. The user may input the secondary index initial load characteristic through the configuration interface.
  • the index engine initialization submodule 112A may configure an index engine that matches the secondary index initial load characteristic based on the secondary index initial load characteristic. Afterwards, the index engine initialization submodule 112A may instruct the storage device 120 to use the configured index engine as the initial secondary index of the data table newly created by the user.
  • the configuration module 111 may include a load characteristic adjustment submodule 111B
  • the processing module 112 may include an index engine adjustment submodule 112B.
  • the load characteristic adjustment submodule 111B may provide a load characteristic adjustment interface to the user after the data table is created. The user may input a new load characteristic through the adjustment interface.
  • the index engine adjustment submodule 112B may configure an index engine that matches the new load characteristic based on the new load characteristic, and instruct the storage device 120 to migrate the index engine of the user data table to the index engine that matches the new load characteristic, or instruct the storage device 120 to switch the index engine of the user data table to an index engine that matches the new load characteristic.
  • migration can be understood as a gradual change.
  • a first structure, a second structure and a third structure are set, wherein the first structure and the second structure are quite different, and the third structure is between the first structure and the second structure, that is, the difference between the first structure and the third structure, and the difference between the second structure and the third structure are both smaller than the difference between the first structure and the second structure.
  • the migration of the first structure to the second structure is specifically that the first structure is first switched to the third structure, and then the third structure is switched to the second structure. Thereby, the index engine change overhead caused by switching between structures with large differences can be avoided.
  • the control module 110 further includes a load characteristic sensing module 113.
  • the load characteristic sensing module 113 can sense the load characteristic of the data table. Specifically, the load characteristic sensing module 113 can sense the operation amplification of the operation on the data table. When the operation amplification is greater than a preset threshold, it can be determined that the current index engine of the data table does not match the current load characteristic of the data table, and the index engine needs to be adjusted. The type of the current load characteristic can be determined according to the specific type of the operation.
  • the operation may include a read operation and a write operation, and correspondingly, the operation amplification includes read amplification and write amplification.
  • the operation amplification refers to the ratio of the amount of data actually operated to the amount of data required to be operated
  • the read amplification refers to the ratio of the amount of data actually read to the amount of data required to be read
  • the write amplification refers to the ratio of the amount of data actually written to the amount of data required to be written.
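  • as an illustration of the ratios defined above, the following minimal sketch computes an amplification value from two byte counts. The function name and the sample figures are assumptions made for illustration only and do not come from the application.

```python
def amplification(actual_bytes: int, requested_bytes: int) -> float:
    """Ratio of the data actually operated on to the data the caller asked for."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    return actual_bytes / requested_bytes


# Hypothetical figures: a 4 KiB read that fetched 80 KiB from storage,
# and a 3 KiB write that caused 30 KiB to be written.
read_amp = amplification(80 * 1024, 4 * 1024)
write_amp = amplification(30 * 1024, 3 * 1024)
```

  • the same function serves for read amplification and write amplification, since both are defined as the same kind of ratio.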
  • the load characteristic perception module 113 can perceive the read amplification of the read operation on the data table. When the read amplification is greater than the preset threshold value A3, it can be determined that the structure of the current index engine of the data table does not match the current load characteristics, and the index engine of the data table needs to be adjusted in the direction of the read-friendly index engine. Alternatively, this determination may be made when the read amplification is greater than the preset threshold value A3 and the write amplification is less than the preset threshold value A4.
  • adjusting the index engine of the data table in the direction of the read-friendly index engine means making the structure of the adjusted index engine more suitable or more matching the read-intensive load than the structure of the index engine before the adjustment.
  • the threshold value A3 and the threshold value A4 may be preset based on experience or experiments. In one example, the threshold value A3 may be 20, and the threshold value A4 may be 10. In another example, the threshold value A3 may be 30, and the threshold value A4 may be 15. In yet another example, the threshold value A3 may be 40, and the threshold value A4 may be 20. The present embodiment of the application does not specifically limit the threshold value A3 and the threshold value A4.
  • the load characteristic perception module 113 can perceive the write amplification of the write operation on the data table.
  • when the write amplification is greater than the preset threshold value A5, it can be determined that the current index engine of the data table does not match the current load characteristics of the data table, and the index engine of the data table needs to be adjusted in the direction of a write-friendly index engine.
  • alternatively, this determination may be made when the write amplification is greater than the preset threshold value A5 and the read amplification is less than the preset threshold value A6.
  • adjusting the index engine of the data table in the direction of a write-friendly index engine means making the structure of the adjusted index engine more suitable or more matching to write-intensive loads than the structure of the index engine before the adjustment.
  • Threshold A5 and threshold A6 can be preset based on experience or experiments. In one example, threshold A5 can be 20, and threshold A6 can be 10. In another example, threshold A5 can be 30, and threshold A6 can be 15. In another example, threshold A5 can be 40, and threshold A6 can be 20. And so on. The embodiments of the present application do not specifically limit threshold A5 and threshold A6.
  • the above introduces the storage system 100.
  • the data storage solution provided by the embodiment of the present application is introduced by example in combination with the storage system 100.
  • the data table T1 can be set to correspond to the user 200, and the data table T1 is used to store the service data of the user 200.
  • the control module 110 can execute step 401 to provide a configuration interface to the user 200.
  • the configuration interface is used for the user to input the load characteristics of the data table.
  • the load characteristics input by the user through the configuration interface can be read-intensive load, write-intensive load, or mixed load.
  • the configuration interface is used for the user to configure whether the load characteristics of the data table T1 are read-intensive load, write-intensive load, or mixed load.
  • the control module 110 can confirm that the load characteristics configured by the user are the default characteristics.
  • the default characteristic can be a mixed load.
  • the user 200 may execute step 403 to input the load characteristic E1, wherein the load characteristic E1 may specifically be a read-intensive load, a write-intensive load, or a mixed load.
  • the storage system 100 may further include a client (not shown) located at the user 200 side.
  • the control device 110 may provide a configuration interface to the client.
  • the user may operate the client to input the load characteristic E1 to the configuration interface.
  • the above only illustrates the method for the user to input the load characteristic to the configuration interface, and does not constitute a limitation. Other methods supported by the prior art may also be used to implement the user inputting the load characteristic to the configuration interface, which will not be described one by one here.
  • the control device 110 may provide a configuration interface to the user 200.
  • the configuration interface at this time is used for the user 200 to configure the initialization load characteristic. That is, the load characteristic E1 is the initialization load characteristic.
  • the control device 110 may provide a configuration interface to the user 200.
  • the configuration interface at this time is used for the user 200 to adjust the load characteristics.
  • the load characteristic E1 is a load characteristic actively adjusted by the user. It is not difficult to understand that the user can instruct the data table T1 to serve different businesses in different time periods. The load characteristics caused by different businesses are different. In this way, the user can input the load characteristics caused by the changed business through the configuration interface according to the changes in the business served by the data table T1. That is, the user can actively adjust the load characteristics.
  • the configuration interface can be used by the user to configure the load characteristics of the primary index and/or the load characteristics of the secondary index. That is, the load characteristic E1 can be the load characteristics of the primary index, the load characteristics of the secondary index, or the load characteristics of the primary index and the secondary index at the same time.
  • the configuration interface may specifically be an application programming interface (API).
  • the configuration interface is specifically used for the user 200 to configure the initialization load characteristics of the primary index.
  • the function of the configuration interface can be InitTableStore(workloadType type).
  • the parameter workloadType type can take values from read, write, and default. When the parameter value of workloadType type is read, the load characteristic E1 is a read-intensive load. When the parameter value of workloadType type is write, the load characteristic E1 is a write-intensive load. When the parameter value of workloadType type is default, the load characteristic E1 is a mixed load.
  • when the configuration interface is specifically used for the user 200 to configure the initialization load characteristics of the secondary index, the function of the configuration interface may be InitIndexStore(workloadType type).
  • the parameter workloadType type takes values from read, write, and default. When the value is read, the load characteristic E1 is a read-intensive load. When the value is write, the load characteristic E1 is a write-intensive load. When the value is default, the load characteristic E1 is a mixed load.
  • the configuration interface is used by user 200 to adjust the load characteristics
  • the function of the configuration interface may be ChangeStore(workloadType oldType, workloadType newType).
  • the parameter workloadType oldType represents the load characteristics before adjustment
  • the parameter workloadType newType represents the load characteristics after adjustment.
  • the load characteristic E1 is the load characteristic after adjustment, that is, the parameter workloadType newType represents the load characteristic E1.
  • the parameters workloadType oldType and workloadType newType can both take values from read, write, and default. As described above, when the value is read, the load characteristic is a read-intensive load. When the value is write, the load characteristic is a write-intensive load. When the value is default, the load characteristic E1 is a mixed load.
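  • the three interface functions described above can be sketched as follows. This is only an illustrative sketch: the Python names, the in-memory dictionary, the check of the old type, and the choice to apply the change to the primary index are all assumptions, since the application only gives the function names and parameter values.

```python
from enum import Enum


class WorkloadType(Enum):
    READ = "read"        # read-intensive load
    WRITE = "write"      # write-intensive load
    DEFAULT = "default"  # mixed load


# Hypothetical in-memory record of the load characteristics the user configured.
configured = {}


def init_table_store(workload: WorkloadType = WorkloadType.DEFAULT) -> None:
    """Counterpart of InitTableStore: initial load characteristic of the primary index."""
    configured["primary"] = workload


def init_index_store(workload: WorkloadType = WorkloadType.DEFAULT) -> None:
    """Counterpart of InitIndexStore: initial load characteristic of a secondary index."""
    configured["secondary"] = workload


def change_store(old: WorkloadType, new: WorkloadType) -> None:
    """Counterpart of ChangeStore: replace the old load characteristic with the new one."""
    # Assumption: the change targets the primary index and old must match what is stored.
    if configured.get("primary") is not old:
        raise ValueError("old type does not match the configured load characteristic")
    configured["primary"] = new


init_table_store(WorkloadType.WRITE)   # e.g. a table serving a backup business
init_index_store()                     # no input: falls back to the mixed load
change_store(WorkloadType.WRITE, WorkloadType.READ)  # e.g. the table is now used for restore
```

  • the default argument mirrors the behaviour described earlier, in which a mixed load is assumed when the user inputs no load characteristic.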
  • the control device 110 may execute step 405 to determine an index engine E11 that matches the load characteristic E1.
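  • when the load characteristic E1 is a read-intensive load, the index engine E11 is a read-friendly index engine.
  • when the load characteristic E1 is a write-intensive load, the index engine E11 is a write-friendly index engine.
  • when the load characteristic E1 is a mixed load, the index engine E11 is a hybrid index engine.
  • the structure of the hybrid index engine is between the structure of the read-friendly index engine and the structure of the write-friendly index engine.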
  • an index engine may be constructed based on a B+ tree structure and a log-structured merge tree structure.
  • the read-friendly index engine includes at least a B+ tree structure
  • the write-friendly index engine includes at least a log-structured merge tree structure.
  • the hybrid index engine may include both a B+ tree structure and a log-structured merge tree structure, wherein the B+ tree structure is located at the lower layer of the log-structured merge tree structure.
  • as shown in FIG. 5, multiple index engines are shown in sequence from left to right, wherein the read performance of the multiple index engines is enhanced from left to right, and the write performance is enhanced from right to left.
  • the leftmost index engine can be used as a read-friendly index engine
  • the rightmost index engine can be used as a write-friendly index engine
  • the middle index engine can be used as a hybrid index engine.
  • the read-friendly index engine is specifically a B+ tree structure.
  • the write-friendly index engine is composed of a B+ tree structure and a log-structured merge tree structure, wherein the B+ tree structure is located at the lower layer of the log-structured merge tree structure.
  • the hybrid index engine is also composed of a B+ tree structure and a log-structured merge tree structure, and the B+ tree structure is located at the lower layer of the log-structured merge tree structure.
  • the log-structured merge tree structure in the hybrid index engine has fewer storage layers. That is to say, the log-structured merge tree structure in the write-friendly index engine has a storage layer with a larger number of layers, and the log-structured merge tree structure in the hybrid index engine has a storage layer with a smaller number of layers.
  • in the log-structured merge tree structure, the data is first written to the top layer, that is, the C1 layer.
  • when the amount of data in the C1 layer reaches the preset value D1, the data in the C1 layer and the data in the next layer of the C1 layer, that is, the C2 layer, are merged, and the merged data is transferred to the C2 layer.
  • when the amount of data in the C2 layer reaches the preset value D2, the data in the C2 layer and the data in the next layer of the C2 layer, that is, the C3 layer, are merged, and the merged data is transferred to the C3 layer, and so on.
  • the more storage layers there are in the log-structured merge tree structure, the more historical versions of the data there may be, which leads to read amplification and increases read latency. Therefore, in order to reduce read amplification and lower read latency, it is necessary to reduce the number of storage layers in the log-structured merge tree structure.
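  • the layer-by-layer merging described above can be sketched as follows. This is a simplified toy model, not the implementation of the application: each layer is a plain dictionary, the preset values D1 and D2 are taken to be hypothetical entry counts, and the number of layers probed by a read stands in for read amplification.

```python
def write(layers, limits, key, value):
    """Write into C1 (layers[0]); merge downward whenever a layer reaches its preset size."""
    layers[0][key] = value
    for i in range(len(layers) - 1):
        if len(layers[i]) < limits[i]:
            break
        # Merge layer i into layer i+1; the upper layer holds the newer versions, so it wins.
        layers[i + 1] = {**layers[i + 1], **layers[i]}
        layers[i] = {}


def read(layers, key):
    """Probe from the top layer down; return the value and the number of layers probed."""
    for probes, layer in enumerate(layers, start=1):
        if key in layer:
            return layer[key], probes
    return None, len(layers)


layers = [{}, {}, {}]  # C1, C2, C3
limits = [2, 4]        # hypothetical D1 and D2, counted in entries
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    write(layers, limits, k, v)
```

  • in this toy model a lookup that misses must probe every layer, which is one way to see why fewer storage layers favour reads while more storage layers let writes be absorbed before large merges occur.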
  • index engines with different read and write performances are constructed, that is, a read-friendly index engine, a write-friendly index engine, and a hybrid index engine are constructed.
  • the number of storage layers of the log-structured merge tree structure in the hybrid index engine can be reduced or increased, so that the hybrid index engine tends to be read-friendly or write-friendly.
  • the control device 110 may execute step 407 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E11. Specifically, as shown in FIG. 4 , the control device 110 may execute step 4071 to send instruction information to the storage device 120, wherein the instruction information may include the identifier of the index engine E11. The storage device 120 may execute step 4072 to store the data in the data table T1 according to the index engine E11 in response to the instruction information.
  • the control device 110 instructs the storage device 120 to store the data in the entire data table T1 according to the index engine E11, or to store the data in the data partition instance corresponding to the primary index.
  • the instruction information may include the identifier of the data table T1 or the identifier of the data partition instance while including the identifier of the index engine E11.
  • according to the identifier of the data table T1 or the identifier of the data partition instance, the storage device 120 stores the data in the entire data table T1, or the data in the data partition instance corresponding to the primary index, according to the index engine E11.
  • the control device 110 instructs the storage device 120 to store the data under the secondary index according to the index engine E11.
  • the instruction information may include the identifier of the secondary index while including the identifier of the index engine E11, so that the storage device 120 stores the data under the secondary index according to the index engine E11 according to the identifier of the secondary index.
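  • the instruction information described above can be pictured as a small message that carries the identifier of the index engine together with an identifier of the data it applies to. The following sketch is illustrative only; the class name, field names, and the precedence among the targets are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instruction:
    engine_id: str                             # identifier of the index engine, e.g. "E11"
    table_id: Optional[str] = None             # identifier of the data table
    partition_id: Optional[str] = None         # identifier of the data partition instance
    secondary_index_id: Optional[str] = None   # identifier of the secondary index


def scope(msg: Instruction) -> str:
    """Return which data the storage device should store according to the engine."""
    if msg.secondary_index_id is not None:
        return f"secondary index {msg.secondary_index_id}"
    if msg.partition_id is not None:
        return f"partition {msg.partition_id}"
    if msg.table_id is not None:
        return f"table {msg.table_id}"
    raise ValueError("instruction names no target")


whole_table = Instruction("E11", table_id="T1")
one_index = Instruction("E11", secondary_index_id="B1")
```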
  • the index engine of data table T1 is index engine E12.
  • in one example, index engine E12 is a read-friendly index engine and index engine E11 is a write-friendly index engine.
  • in another example, index engine E12 is a write-friendly index engine.
  • the control device 110 may instruct the storage device 120 to first store the data in data table T1 according to the hybrid index engine, and then store the data in data table T1 according to index engine E11.
  • the storage device 120 may first change the index engine of data table T1 from index engine E12 to a hybrid index engine, and then change the index engine of data table T1 from the hybrid index engine to index engine E11. In this way, the index engine for storing data is gradually migrated from index engine E12 to index engine E11, which can reduce the overhead of changing the index engine.
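  • the gradual migration through an intermediate structure can be sketched as a path over an ordered list of engine structures. The ordering list and the function name below are assumptions for illustration; the application only states that the hybrid index engine lies between the read-friendly and write-friendly index engines.

```python
# Assumed ordering of structures, with the hybrid engine between the two extremes.
ORDER = ["read-friendly", "hybrid", "write-friendly"]


def migration_path(current: str, target: str) -> list:
    """List the engines to switch to in turn, moving one adjacent structure at a time."""
    i, j = ORDER.index(current), ORDER.index(target)
    if i == j:
        return []
    step = 1 if j > i else -1
    return [ORDER[k] for k in range(i + step, j + step, step)]


path = migration_path("read-friendly", "write-friendly")
```

  • each element of the returned path corresponds to one switch between similar structures, so no single switch spans the full difference between the two extremes.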
  • the user can configure the index engine of his data table, so that when the business served by the data table changes, the user can adjust the index engine of the data table in a timely manner or at any time, so that the index engine matches the load characteristics caused by the changed business, thereby improving the access performance of the data table.
  • the embodiment of the present application further provides a data storage solution.
  • the solution is introduced by way of example.
  • the control device 110 may perform step 601 to monitor the operation amplification of the data table T1.
  • step 601 may be performed periodically.
  • the average value of the operation amplification monitored in an execution cycle may be used as the operation amplification of the execution cycle.
  • the execution cycle of step 601 may be preset. In one example, the execution cycle of step 601 may be 10 minutes. In another example, the execution cycle of step 601 may be 20 minutes. And so on.
  • the operation amplification may include read amplification of data read from the data table T1 or write amplification of data written to the data table T1.
  • the operation amplification may also include read amplification of data read from the data table T1 and/or write amplification of data written to the data table T1.
  • read amplification refers to read amplification of reading data from the data table T1
  • write amplification refers to write amplification of writing data to the data table T1.
  • the control device 110 may execute step 603 to determine whether the operation amplification is greater than the threshold y1.
  • the operation amplification includes read amplification
  • the threshold y1 includes a threshold A3.
  • operation amplification includes read amplification and write amplification
  • threshold y1 includes threshold A3 and threshold A4.
  • in step 603, it can be determined whether the read amplification is greater than threshold A3, and whether the write amplification is less than threshold A4. If the read amplification is greater than threshold A3, and the write amplification is less than threshold A4, it can be determined that the current index engine of data table T1 does not match the current load characteristics of data table T1, and it is necessary to adjust the index engine of data table T1 in the direction of a read-friendly index engine.
  • for the threshold values A3 and A4, refer to the description above, which will not be repeated here.
  • the operation amplification includes write amplification
  • the threshold y1 includes a threshold A5.
  • operation amplification includes write amplification and read amplification
  • threshold y1 includes threshold A5 and threshold A6.
  • in step 603, it can be determined whether the write amplification is greater than threshold A5, and whether the read amplification is less than threshold A6. If the write amplification is greater than threshold A5, and the read amplification is less than threshold A6, it can be determined that the current index engine of data table T1 does not match the current load characteristics of data table T1, and it is necessary to adjust the index engine of data table T1 in the direction of a write-friendly index engine.
  • for threshold A5 and threshold A6, refer to the description above, which will not be repeated here.
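  • the determination in step 603 can be sketched as the following function. The default threshold values are taken from one of the examples given earlier (A3 = 20, A4 = 10, A5 = 20, A6 = 10); the function name and the returned strings are assumptions made for illustration.

```python
from typing import Optional


def adjustment_direction(read_amp: float, write_amp: float,
                         a3: float = 20, a4: float = 10,
                         a5: float = 20, a6: float = 10) -> Optional[str]:
    """Return the direction in which the index engine should be adjusted, if any."""
    if read_amp > a3 and write_amp < a4:
        return "read-friendly"
    if write_amp > a5 and read_amp < a6:
        return "write-friendly"
    return None  # the current index engine is treated as matching the load


direction = adjustment_direction(read_amp=25, write_amp=5)
```

  • when both amplifications are high, neither condition holds and the sketch returns no direction; how such a case is handled is not specified here.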
  • control device 110 may execute step 605 to adjust the index engine of the data table T1 to the index engine E21 in the direction of reducing the operation amplification.
  • index engine E21 is an index engine that is more suitable or more conducive to read operations than the current index engine of data table T1, that is, the matching degree between index engine E21 and the read-intensive load is greater than the matching degree between the current index engine of data table T1 and the read-intensive load.
  • the read amplification of data table T1 when the data of data table T1 is stored according to index engine E21 is less than the current read amplification of data table T1.
  • the number of storage layers of the log structure merge tree in the index engine can be adjusted to make the index engine more read-friendly or write-friendly.
  • the fewer storage layers the log-structured merge tree has, the more read-friendly the index engine is.
  • an index engine with N fewer storage layers than the current index engine structure of data table T1 can be used as index engine E21.
  • the structure of index engine E21 has N fewer storage layers than the current index engine structure of data table T1.
  • the storage layer refers to the storage layer of the log structure merge tree, and N is an integer greater than or equal to 1.
  • the value of N can be preset. In one example, N is 1, 2, or 3, etc.
  • index engine E21 is an index engine that is more suitable or more conducive to write operations than the current index engine of data table T1, that is, the matching degree between index engine E21 and the write-intensive load is greater than the matching degree between the current index engine of data table T1 and the write-intensive load.
  • the write amplification of data table T1 when the data of data table T1 is stored according to index engine E21 is less than the current write amplification of data table T1.
  • the index engine can be made more read-friendly or write-friendly by adjusting the number of storage layers of the log-structured merge tree in the index engine.
  • the more storage layers the log-structured merge tree has, the more write-friendly the index engine is.
  • an index engine with M more storage layers than the structure of the current index engine of data table T1 can be used as index engine E21. That is, compared with the current index engine of data table T1, the structure of index engine E21 has M more storage layers than the structure of the current index engine of data table T1.
  • the storage layer refers to the storage layer of the log-structured merge tree, and M is an integer greater than or equal to 1.
  • the value of M can be preset. In one example, M is 1, 2, or 3, etc.
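  • the choice of index engine E21 by changing the number of storage layers can be sketched as follows. The function name is hypothetical, and treating a layer count of 0 as a pure B+ tree structure is an assumption of this sketch.

```python
def adjusted_layer_count(current: int, direction: str, n: int = 1, m: int = 1) -> int:
    """Return the number of log-structured merge tree storage layers for engine E21."""
    if direction == "read-friendly":
        # N fewer layers; 0 is taken here to mean a pure B+ tree (an assumption).
        return max(0, current - n)
    if direction == "write-friendly":
        # M more layers.
        return current + m
    raise ValueError(f"unknown direction: {direction}")


fewer = adjusted_layer_count(4, "read-friendly", n=2)
more = adjusted_layer_count(4, "write-friendly", m=1)
```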
  • the control device 110 may also execute step 607 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E21.
  • control device 110 may execute step 6071 to send indication information to the storage device 120, wherein the indication information may include the identifier of the index engine E21.
  • the storage device 120 may execute step 6072 to store the data in the data table T1 according to the index engine E21 in response to the indication information.
  • the control device 110 instructs the storage device 120 to store the data in the entire data table T1 according to the index engine E21, or to store the data in the data partition instance corresponding to the primary index.
  • the instruction information may include the identifier of the data table T1 or the identifier of the data partition instance while including the identifier of the index engine E21, so that, according to the identifier of the data table T1 or the identifier of the data partition instance, the storage device 120 may store the data in the entire data table T1, or the data in the data partition instance corresponding to the primary index, according to the index engine E21.
  • the control device 110 instructs the storage device 120 to store the data under the local secondary index according to the index engine E21.
  • the instruction information may include the identifier of the local secondary index as well as the identifier of the index engine E21, so that the storage device 120 stores the data under the local secondary index according to the index engine E21 according to the identifier of the local secondary index.
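  • after step 607, the control device 110 may execute step 601 and step 603 again.
  • if the operation amplification is not greater than the threshold y1, the adjustment of the index engine of the data table T1 may be stopped.
  • if the operation amplification is still greater than the threshold y1, steps 605 and 607 may be executed again.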
  • the index engine of data table T1 can be dynamically adjusted so that the index engine of data table T1 matches the load characteristics of data table T1 as much as possible, thereby reducing operation amplification.
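  • the repeated execution of steps 601 to 607 can be sketched as a feedback loop. The sketch below covers only the read-friendly direction and relies on assumptions: the measure callback stands in for monitoring, the linear toy model of read amplification is invented for the example, and the round limit is a safety guard that is not part of the application.

```python
def tune(layers, measure, threshold, n=1, max_rounds=10):
    """While the measured read amplification exceeds the threshold, remove n storage layers."""
    for _ in range(max_rounds):
        # Steps 601 and 603: monitor the amplification and compare it with the threshold.
        if measure(layers) <= threshold or layers == 0:
            break
        # Steps 605 and 607: move to an engine with n fewer storage layers.
        layers = max(0, layers - n)
    return layers


# Toy model (assumption): read amplification grows by 8 for each storage layer.
result = tune(layers=5, measure=lambda count: 8 * count, threshold=20)
```

  • starting from five layers, the loop removes one layer per round until the modelled read amplification no longer exceeds the threshold, and then stops adjusting.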
  • the data storage solution provided in the embodiment of the present application can sense the dynamic changes of the operation amplification of the data table, and dynamically adjust the index engine of the data table according to the dynamic changes of the operation amplification, so that the index engine matches the load characteristics of the data table, thereby improving the access performance of the data table.
  • the embodiment of the present application further provides a data storage solution.
  • the solution is introduced by way of example.
  • the control device 110 can execute step 701 to monitor the read and write operations of data table T1.
  • step 701 can be executed periodically.
  • the read and write operations include read operations and write operations.
  • the total number of read operations and the total number of write operations in an execution cycle can be monitored to obtain the monitoring result of the execution cycle. That is, the monitoring result includes the total number of read operations and the total number of write operations of data table T1 in the monitoring cycle.
  • the execution cycle can be preset. In one example, the execution cycle of step 701 can be 1 hour. In another example, the execution cycle of step 701 can be two hours. And so on.
  • the control device 110 may execute step 703 and obtain the load characteristic E3 according to the monitoring result.
  • when the ratio of the total number of read operations to the total number of write operations is greater than the threshold value A1, the read-intensive load is used as the load characteristic E3.
  • when the ratio of the total number of write operations to the total number of read operations is greater than the threshold value A2, the write-intensive load is used as the load characteristic E3.
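  • the derivation of the load characteristic E3 from the monitoring result can be sketched as follows. The values used for the thresholds A1 and A2 are hypothetical, since no values are given for them, and returning a mixed load when neither ratio exceeds its threshold is an assumption of this sketch.

```python
def classify_load(reads: int, writes: int, a1: float = 4.0, a2: float = 4.0) -> str:
    """Classify one execution cycle from its total read and write operation counts."""
    # reads > a1 * writes is the same test as reads / writes > a1,
    # written without a division so that a cycle with zero writes is handled.
    if reads > a1 * writes:
        return "read-intensive"
    if writes > a2 * reads:
        return "write-intensive"
    return "mixed"  # assumption: neither ratio exceeds its threshold


e3 = classify_load(reads=900, writes=100)
```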
  • control device 110 may execute step 705 to migrate to an indexing engine matching the load characteristic E3 to obtain an indexing engine E31.
  • the structure of the index engine E31 has N fewer storage layers than the structure of the current index engine of the data table T1.
  • the storage layer refers to the storage layer of the log structure merge tree, and N is an integer greater than or equal to 1.
  • the value of N can be preset. In one example, N is 1, 2, or 3, etc.
  • the structure of the index engine E31 has M more storage layers than the structure of the current index engine of the data table T1.
  • the storage layer refers to the storage layer of the log-structured merge tree, and M is an integer greater than or equal to 1.
  • the value of M can be preset. In one example, M is 1, 2, or 3, etc.
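  • The layer adjustment of step 705 can be sketched as follows. This is a hedged illustration only: it assumes that a read-intensive load leads to N fewer log-structured merge tree storage layers and a write-intensive load to M more, which is an interpretation of the two cases above. The function name `target_layer_count` and the bounds `min_layers` and `max_layers` are hypothetical and do not come from the application.

```python
def target_layer_count(current_layers: int, load: str, n: int = 1, m: int = 1,
                       min_layers: int = 1, max_layers: int = 7) -> int:
    """Return the storage-layer count of index engine E31 for data table T1."""
    if n < 1 or m < 1:
        raise ValueError("N and M must be integers greater than or equal to 1")
    if load == "read-intensive":
        # assumption: fewer layers means fewer places to look on a read
        return max(min_layers, current_layers - n)
    if load == "write-intensive":
        # assumption: more layers defers merging and absorbs writes
        return min(max_layers, current_layers + m)
    # any other load characteristic leaves the structure unchanged
    return current_layers
```

  • Moving by a small preset N or M per cycle keeps each migration incremental; the clamping to hypothetical bounds simply prevents the sketch from producing a non-positive or unbounded layer count.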
  • the control device 110 may execute step 707 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E31.
  • step 707 may include step 7071, sending instruction information to the storage device.
  • step 707 may also include step 7072, storing the data in the data table T1 according to the index engine E31.
  • for details, please refer to the above introduction to step 407, step 4071, and step 4072 in Figure 4, which will not be repeated here.
  • the data storage solution provided in the embodiment of the present application can sense the dynamic changes in the load characteristics of the data table, and dynamically adjust the index engine of the data table according to the dynamic changes in the load characteristics, so that the index engine matches the load characteristics of the data table, thereby improving the access performance of the data table.
  • the embodiment of the present application provides a data storage method. It can be understood that the method is combined with the data storage solution described above, and the specific execution process of the relevant steps in the method can refer to the execution process of the corresponding steps in the data storage solution.
  • the method is applied to a control device in a storage system (e.g., control device 110 in storage system 100), the storage system further includes a storage device (e.g., storage device 120 in storage system 100), and the storage device stores a user data table.
  • the method includes the following steps.
  • Step 801 providing a configuration interface, wherein the configuration interface is used for the user to configure the load characteristics of the data table as read-intensive load or write-intensive load.
  • Step 803a when the configuration interface indicates that the user configures the load characteristic of the data table as a read-intensive load, instruct the storage device to store the data in the data table according to the first index engine; the first index engine matches the read-intensive load.
  • Step 803b when the configuration interface indicates that the user configures the load characteristic of the data table as a write-intensive load, instruct the storage device to store the data in the data table according to the second index engine; the second index engine matches the write-intensive load.
  • the first index engine includes at least a B+ tree structure, and the second index engine includes at least a log-structured merge tree structure.
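  • Steps 801 to 803b can be sketched as a small dispatch from the configured load characteristic to an index engine. This is a minimal sketch under stated assumptions: the class names `RecordingStorageDevice` and `ControlDevice`, the method names, and the engine labels `"b_plus_tree"` and `"lsm_tree"` are hypothetical, and the storage device here only records the instruction it receives rather than reorganizing any data.

```python
# Mapping taken from the text: the first index engine (at least a B+ tree)
# matches read-intensive load; the second (at least an LSM tree) matches
# write-intensive load. The string labels are illustrative.
ENGINE_FOR_LOAD = {
    "read-intensive": "b_plus_tree",
    "write-intensive": "lsm_tree",
}


class RecordingStorageDevice:
    """Hypothetical stand-in for the storage device; it only records instructions."""

    def __init__(self) -> None:
        self.engine_by_table = {}

    def store_with_engine(self, table: str, engine: str) -> None:
        self.engine_by_table[table] = engine


class ControlDevice:
    """Hypothetical control device exposing the configuration interface."""

    def __init__(self, storage: RecordingStorageDevice) -> None:
        self.storage = storage

    def configure_load(self, table: str, load: str) -> str:
        """Configuration interface: the user declares the table's load characteristic."""
        try:
            engine = ENGINE_FOR_LOAD[load]
        except KeyError:
            raise ValueError(f"unsupported load characteristic: {load!r}") from None
        # steps 803a / 803b: instruct the storage device to use the matching engine
        self.storage.store_with_engine(table, engine)
        return engine
```

  • A caller would construct `ControlDevice(RecordingStorageDevice())` and call `configure_load("orders", "read-intensive")`; the table name is illustrative.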
  • before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine; instructing the storage device to store the data in the data table according to the first index engine includes: instructing the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, the structure of the third index engine being between the structure of the first index engine and the structure of the second index engine; and then instructing the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
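  • The two-stage migration through an intermediate index engine can be sketched as a plan of hops. This is an illustrative sketch only: the function name `plan_migration` and the engine labels (including `"hybrid"` for the third index engine) are hypothetical, and the sketch does not model how data is actually reorganized at each hop.

```python
from typing import List, Optional, Tuple


def plan_migration(current: str, target: str,
                   intermediate: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return the ordered (from, to) hops needed to reach the target engine."""
    if current == target:
        return []  # already on the target engine, nothing to migrate
    if intermediate is None or intermediate in (current, target):
        return [(current, target)]  # direct migration in a single hop
    # second engine -> third (intermediate) engine -> first engine
    return [(current, intermediate), (intermediate, target)]
```

  • For instance, `plan_migration("lsm_tree", "b_plus_tree", "hybrid")` yields two hops, so each instruction sent to the storage device covers a smaller structural change than a direct switch.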
  • for details, please refer to the above description of step 407 in Figure 4.
  • the data table has a local secondary index, and the configuration interface is also used for the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load; when the configuration interface indicates that the user configures the load characteristic of the local secondary index as a read-intensive load, the storage device is instructed to store the data under the local secondary index according to the first index engine; when the configuration interface indicates that the user configures the load characteristic of the local secondary index as a write-intensive load, the storage device is instructed to store the data under the local secondary index according to the second index engine.
  • for details, please refer to the above description of step 407 in Figure 4.
  • users can configure the index engine of their data table, so that when the business served by the data table changes, the user can adjust the index engine of the data table in a timely manner or at any time, so that the index engine matches the load characteristics caused by the changed business, thereby improving the access performance of the data table.
  • the embodiment of the present application also provides a data storage method, which is applied to a control device in a storage system (e.g., the control device 110 in the storage system 100), wherein the storage system further includes a storage device (e.g., the storage device 120 in the storage system 100), and the storage device stores a data table.
  • the method includes the following steps.
  • Step 901: when the index engine of the data table is the fourth index engine, the control device monitors the operation amplification of the data table, where the operation amplification includes read amplification of reading data from the data table or write amplification of writing data to the data table.
  • Step 903a, when the operation amplification includes the read amplification, and the read amplification is greater than a first threshold, instruct the storage device to store the data in the data table according to the fifth index engine; wherein the matching degree between the fifth index engine and the read-intensive load is greater than the matching degree between the fourth index engine and the read-intensive load.
  • for details, please refer to the above description of steps 603 to 607 in FIG. 6.
  • Step 903b when the operation amplification includes the write amplification, and the write amplification is greater than the second threshold, instruct the storage device to store the data in the data table according to the sixth index engine; wherein the matching degree between the sixth index engine and the write-intensive load is greater than the matching degree between the fourth index engine and the write-intensive load.
  • the operation amplification includes both the read amplification and the write amplification; when the operation amplification includes the read amplification, and the read amplification is greater than the first threshold, instructing the storage device to store the data in the data table according to the fifth index engine, including: when the read amplification is greater than the first threshold, and the write amplification is less than the third threshold, instructing the storage device to store the data in the data table according to the fifth index engine.
  • the operation amplification includes both the read amplification and the write amplification; when the operation amplification includes the write amplification, and the write amplification is greater than the second threshold, instructing the storage device to store the data in the data table according to the sixth index engine, including: when the write amplification is greater than the second threshold, and the read amplification is less than the fourth threshold, instructing the storage device to store the data in the data table according to the sixth index engine.
  • for details, please refer to the above description of steps 603 to 607 in Figure 6.
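  • The threshold logic of steps 903a and 903b, including the optional third and fourth thresholds, can be sketched as below. This is a hedged illustration: the function name `choose_engine` and the returned labels are hypothetical. The text does not state what happens when both amplifications exceed their thresholds and no third or fourth threshold is set; this sketch checks read amplification first, which is an assumption.

```python
from typing import Optional


def choose_engine(read_amp: Optional[float], write_amp: Optional[float],
                  first_threshold: float, second_threshold: float,
                  third_threshold: Optional[float] = None,
                  fourth_threshold: Optional[float] = None) -> str:
    """Return "fifth", "sixth", or "fourth" (keep the current index engine)."""
    # step 903a: read amplification above the first threshold
    read_high = read_amp is not None and read_amp > first_threshold
    # step 903b: write amplification above the second threshold
    write_high = write_amp is not None and write_amp > second_threshold
    # optional refinement: also require the other amplification to be low
    if read_high and third_threshold is not None and write_amp is not None:
        read_high = write_amp < third_threshold
    if write_high and fourth_threshold is not None and read_amp is not None:
        write_high = read_amp < fourth_threshold
    if read_high:
        return "fifth"  # engine that better matches read-intensive load
    if write_high:
        return "sixth"  # engine that better matches write-intensive load
    return "fourth"     # no threshold crossed: keep the current engine
```

  • The third and fourth thresholds act as guards: a table whose reads and writes are both heavily amplified is not pushed toward the read-oriented engine merely because its read amplification crossed the first threshold.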
  • the fifth index engine includes at least a B+ tree structure, and the sixth index engine includes at least a log-structured merge tree structure.
  • the data table has a local secondary index (LSI), and the operation amplification is the amplification generated by operating on the data under the local secondary index; instructing the storage device to store the data in the data table according to the fifth index engine includes: instructing the storage device to store the data under the local secondary index according to the fifth index engine; instructing the storage device to store the data in the data table according to the sixth index engine includes: instructing the storage device to store the data under the local secondary index according to the sixth index engine.
  • the data storage method provided in the embodiment of the present application can sense the dynamic changes of operation amplification of a data table, and dynamically adjust the index engine of the data table according to the dynamic changes of operation amplification, so that the index engine matches the load characteristics of the data table, thereby improving the access performance of the data table.
  • the embodiment of the present application provides a data storage device 1000.
  • the device 1000 can be configured in a control device in a storage system, and the storage system also includes a storage device, and the storage device stores a user's data table. As shown in FIG. 10 , the device 1000 includes:
  • a providing module 1010, configured to provide a configuration interface, wherein the configuration interface is used for the user to configure the load characteristic of the data table as a read-intensive load or a write-intensive load;
  • an indicating module 1020, configured to instruct the storage device to store the data in the data table according to the first index engine when the configuration interface indicates that the user configures the load characteristic of the data table as a read-intensive load; the first index engine matches the read-intensive load;
  • the indicating module 1020 is also configured to instruct the storage device to store the data in the data table according to the second index engine when the configuration interface indicates that the user configures the load characteristic of the data table as a write-intensive load; the second index engine matches the write-intensive load.
  • both the providing module 1010 and the indicating module 1020 can be implemented by software or by hardware.
  • the implementation of the providing module 1010 is introduced below by taking the providing module 1010 as an example.
  • the implementation of the indicating module 1020 can refer to the implementation of the providing module 1010.
  • the providing module 1010 may include code running on a computing instance.
  • the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, there may be one or more such computing instances.
  • the providing module 1010 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions.
  • the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one data center or multiple data centers with close geographical locations. Among them, usually a region may include multiple AZs.
  • multiple hosts/virtual machines/containers used to run the code can be distributed in the same virtual private cloud (VPC) or in multiple VPCs.
  • a VPC is set up in a region.
  • a communication gateway needs to be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
  • the providing module 1010 may include at least one computing device, such as a server, etc.
  • the providing module 1010 may also be implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
  • the multiple computing devices included in the providing module 1010 may be distributed in the same region or in different regions.
  • the multiple computing devices included in the providing module 1010 may be distributed in the same AZ or in different AZs.
  • the multiple computing devices included in the providing module 1010 may be distributed in the same VPC or in multiple VPCs.
  • the multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • the providing module 1010 can be used to execute any step in the method shown in FIG. 8, and the indicating module 1020 can be used to execute any step in the method shown in FIG. 8.
  • the steps that the providing module 1010 and the indicating module 1020 are responsible for implementing can be specified as needed, and the full functions of the data storage device 1000 are realized by the providing module 1010 and the indicating module 1020 respectively implementing different steps in the method shown in FIG. 8.
  • the embodiment of the present application also provides a data storage device 1100.
  • the device 1100 can be configured in a control device in a storage system, and the storage system also includes a storage device, and the storage device stores a data table. As shown in FIG. 11 , the device 1100 includes:
  • a monitoring module 1110 configured for the control device to monitor, when the index engine of the data table is the fourth index engine, the operation amplification of the data table, wherein the operation amplification includes a read amplification of reading data from the data table or a write amplification of writing data to the data table;
  • an indicating module 1120, configured to instruct the storage device to store the data in the data table according to a fifth index engine when the operation amplification includes the read amplification and the read amplification is greater than a first threshold; wherein the matching degree between the fifth index engine and the read-intensive load is greater than the matching degree between the fourth index engine and the read-intensive load;
  • the indicating module 1120 is also configured to instruct the storage device to store the data in the data table according to the sixth index engine when the operation amplification includes the write amplification and the write amplification is greater than a second threshold; wherein the matching degree between the sixth index engine and the write-intensive load is greater than the matching degree between the fourth index engine and the write-intensive load.
  • the monitoring module 1110 and the indicating module 1120 can be implemented by software or hardware.
  • the implementation of the monitoring module 1110 and the indicating module 1120 can refer to the implementation of the providing module 1010, which is described above and will not be repeated here.
  • the monitoring module 1110 can be used to execute any step in the method shown in FIG. 9, and the indicating module 1120 can be used to execute any step in the method shown in FIG. 9.
  • the steps that the monitoring module 1110 and the indicating module 1120 are responsible for implementing can be specified as needed, and the monitoring module 1110 and the indicating module 1120 respectively implement different steps in the method shown in FIG. 9 to implement all functions of the data storage device 1100.
  • the present application also provides a computing device 1200.
  • the computing device 1200 includes: a bus 1202, a processor 1204, a memory 1206, and a communication interface 1208.
  • the processor 1204, the memory 1206, and the communication interface 1208 communicate with each other through the bus 1202.
  • the computing device 1200 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 1200.
  • the bus 1202 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus may be divided into an address bus, a data bus, a control bus, etc.
  • for ease of representation, the bus is shown as only one line in FIG. 12, but this does not mean that there is only one bus or one type of bus.
  • the bus 1202 may include a path for transmitting information between various components of the computing device 1200 (e.g., the memory 1206, the processor 1204, and the communication interface 1208).
  • Processor 1204 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • the memory 1206 may include a volatile memory, such as a random access memory (RAM).
  • the memory 1206 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the memory 1206 stores executable program codes, and the processor 1204 executes the executable program codes to respectively implement the functions of the providing module 1010 and the indicating module 1020, thereby implementing the method shown in Figure 8. That is, the memory 1206 stores instructions for executing the method shown in Figure 8.
  • the communication interface 1208 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 1200 and other devices or a communication network.
  • the present application also provides a computing device cluster.
  • the computing device cluster includes at least one computing device.
  • the computing device may be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device can also be a terminal device such as a desktop computer, a laptop computer or a smart phone.
  • the computing device cluster includes at least one computing device 1200.
  • the memory 1206 in one or more computing devices 1200 in the computing device cluster may store the same instructions for executing the method shown in Fig. 8 .
  • the memory 1206 of one or more computing devices 1200 in the computing device cluster may also respectively store some instructions for executing the method shown in Figure 8.
  • the combination of one or more computing devices 1200 may jointly execute instructions for executing the method shown in Figure 8.
  • the memory 1206 in different computing devices 1200 in the computing device cluster may store different instructions, which are respectively used to execute part of the functions of the apparatus 1000. That is, the instructions stored in the memory 1206 in different computing devices 1200 may implement the functions of one or more of the providing module 1010 and the indicating module 1020.
  • one or more computing devices in the computing device cluster can be connected via a network.
  • the network can be a wide area network or a local area network, etc.
  • FIG. 14 shows a possible implementation. As shown in FIG. 14, two computing devices 1200A and 1200B are connected via a network. Specifically, each computing device is connected to the network via its communication interface.
  • the memory 1206 in the computing device 1200A stores instructions for executing the functions of the providing module 1010. At the same time, the memory 1206 in the computing device 1200B stores instructions for executing the functions of the indicating module 1020.
  • the functions of the computing device 1200A shown in FIG. 14 may also be accomplished by multiple computing devices 1200.
  • the functions of the computing device 1200B may also be accomplished by multiple computing devices 1200.
  • the embodiment of the present application also provides another computing device cluster.
  • the connection relationship between the computing devices in the computing device cluster can be similar to the connection mode of the computing device cluster described in Figures 13 and 14.
  • the difference is that the memory 1206 in one or more computing devices 1200 in the computing device cluster can store the same instructions for executing the method shown in Figure 8.
  • the memory 1206 of one or more computing devices 1200 in the computing device cluster may also respectively store some instructions for executing the method shown in Figure 8.
  • the combination of one or more computing devices 1200 may jointly execute instructions for executing the method shown in Figure 8.
  • the present application also provides a computing device 1500.
  • the computing device 1500 includes: a bus 1502, a processor 1504, a memory 1506, and a communication interface 1508.
  • the processor 1504, the memory 1506, and the communication interface 1508 communicate with each other through the bus 1502.
  • the computing device 1500 can be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 1500.
  • the implementations of the bus 1502 , the processor 1504 , the memory 1506 , and the communication interface 1508 may refer to the implementations of the bus 1202 , the processor 1204 , the memory 1206 , and the communication interface 1208 , respectively.
  • the memory 1506 stores executable program codes, and the processor 1504 executes the executable program codes to respectively implement the functions of the aforementioned monitoring module 1110 and the indicating module 1120, thereby implementing the method shown in Figure 9. That is, the memory 1506 stores instructions for executing the method shown in Figure 9.
  • the communication interface 1508 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 1500 and other devices or communication networks.
  • the embodiment of the present application also provides a computing device cluster.
  • the computing device cluster includes at least one computing device.
  • the computing device can be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smart phone.
  • the computing device cluster includes at least one computing device 1500.
  • the memory 1506 in one or more computing devices 1500 in the computing device cluster may store the same instructions for executing the method shown in Fig. 9.
  • the memory 1506 of one or more computing devices 1500 in the computing device cluster may also respectively store some instructions for executing the method shown in Figure 9.
  • the combination of one or more computing devices 1500 may jointly execute instructions for executing the method shown in Figure 9.
  • the memory 1506 in different computing devices 1500 in the computing device cluster may store different instructions, which are respectively used to execute part of the functions of the apparatus 1100. That is, the instructions stored in the memory 1506 in different computing devices 1500 may implement the functions of one or more of the monitoring module 1110 and the indicating module 1120.
  • one or more computing devices in the computing device cluster can be connected via a network.
  • the network can be a wide area network or a local area network, etc.
  • Figure 17 shows a possible implementation. As shown in Figure 17, two computing devices 1500A and 1500B are connected via a network. Specifically, each computing device is connected to the network via its communication interface.
  • the memory 1506 in the computing device 1500A stores instructions for executing the functions of the monitoring module 1110.
  • the memory 1506 in the computing device 1500B stores instructions for executing the functions of the indicating module 1120.
  • the functions of the computing device 1500A shown in FIG. 17 may also be performed by multiple computing devices 1500.
  • the functions of the computing device 1500B may also be performed by multiple computing devices 1500.
  • the embodiment of the present application also provides another computing device cluster.
  • the connection relationship between the computing devices in the computing device cluster can be similar to the connection mode of the computing device cluster described in Figures 16 and 17.
  • the difference is that the memory 1506 in one or more computing devices 1500 in the computing device cluster can store the same instructions for executing the method shown in Figure 9.
  • the memory 1506 of one or more computing devices 1500 in the computing device cluster may also respectively store some instructions for executing the method shown in Figure 9.
  • the combination of one or more computing devices 1500 may jointly execute instructions for executing the method shown in Figure 9.
  • the embodiment of the present application also provides a computer program product including instructions.
  • the computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium.
  • when the computer program product runs on at least one computing device, the at least one computing device executes the method shown in FIG. 8.
  • the embodiment of the present application also provides a computer program product including instructions.
  • the computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium.
  • when the computer program product runs on at least one computing device, the at least one computing device executes the method shown in FIG. 9.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium can be any available medium that can be stored by a computing device or a data storage device such as a data center containing one or more available media.
  • the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state hard disk).
  • the computer-readable storage medium includes instructions that instruct the computing device to execute the method shown in Figure 8.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium can be any available medium that can be stored by a computing device or a data storage device such as a data center containing one or more available media.
  • the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state hard disk).
  • the computer-readable storage medium includes instructions that instruct the computing device to execute the method shown in Figure 9.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of data processing, and in particular to a data storage method and apparatus. The method comprises: providing a configuration interface, the configuration interface being used for a user to configure a load characteristic of a data table as a read-intensive load or a write-intensive load; when the configuration interface indicates that the load characteristic of the data table configured by the user is the read-intensive load, instructing a storage apparatus to store data in the data table according to a first index engine, the first index engine matching the read-intensive load; and when the configuration interface indicates that the load characteristic of the data table configured by the user is the write-intensive load, instructing the storage apparatus to store the data in the data table according to a second index engine, the second index engine matching the write-intensive load. According to the method, index engines of a data table can be configured by a user, so that the index engines of the data table match the load characteristics caused by a service, thereby improving the access performance of the data table.

Description

Data storage method and device
This application claims priority to the Chinese patent application filed with the State Intellectual Property Office of China on September 29, 2022, with application number 202211202134.2 and entitled “A Data Storage Method and Device”, the entire contents of which are incorporated by reference in this application.
Technical Field
The present application relates to the field of data processing technology, and in particular to a data storage method and device.
Background
数据库是存放数据的仓库,存储了大量数据。为了保证查询效率,在数据库中,需要按照一定的结构来组织数据,即采用索引引擎存储数据。目前,数据库常用的索引引擎有日志结构合并树(log structured merge tree,LSM tree)、B+树(B+ tree)等。The database is a warehouse for storing data, storing a large amount of data. In order to ensure query efficiency, the data in the database needs to be organized according to a certain structure, that is, an index engine is used to store data. At present, the commonly used index engines for databases include log structured merge tree (LSM tree) and B+ tree.
对数据库不同的访问操作,给数据库带来不同的负载特性。不同的索引引擎适应不同的负载特性。例如日志结构合并树结构更适合写操作发生频率较高时的负载特性,B+树结构更适合读操作发生频率较高时的负载特性。Different access operations to the database bring different load characteristics to the database. Different index engines adapt to different load characteristics. For example, the log structure merge tree structure is more suitable for the load characteristics when the write operation frequency is high, and the B+ tree structure is more suitable for the load characteristics when the read operation frequency is high.
Different users may access a database in different ways and therefore impose different load characteristics. Consequently, whichever single index engine the database uses, it is difficult to suit the load characteristics imposed by the access operations of multiple users, which degrades the access performance of the database.
Summary
Embodiments of the present application provide a data storage method and device in which the index engine of a data table can be configured by a user or adjusted according to changes in the load characteristics of the data table.
In a first aspect, a data storage method is provided. The method is applied to a control device in a storage system; the storage system further includes a storage device, and the storage device stores a data table of a user. The method includes: providing a configuration interface through which the user configures the load characteristic of the data table as a read-intensive load or a write-intensive load; when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, instructing the storage device to store the data in the data table according to a first index engine, where the first index engine matches the read-intensive load; and when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, instructing the storage device to store the data in the data table according to a second index engine, where the second index engine matches the write-intensive load.
With this method, a user can configure the index engine of the user's data table. The user can therefore adjust the index engine as the service served by the data table changes, so that the index engine matches the load characteristics caused by the changed service, which improves the access performance of the data table.
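The first aspect can be pictured with a minimal Python sketch. It assumes the first index engine is a B+ tree and the second is an LSM tree (one possible implementation named in this summary); the class names `ControlDevice` and `StorageDevice`, the `configure` method, and the enum values are hypothetical and are not defined by this application.

```python
from enum import Enum


class LoadCharacteristic(Enum):
    READ_INTENSIVE = "read_intensive"
    WRITE_INTENSIVE = "write_intensive"


class IndexEngine(Enum):
    B_PLUS_TREE = "b_plus_tree"  # assumed first index engine (read-friendly)
    LSM_TREE = "lsm_tree"        # assumed second index engine (write-friendly)


# Hypothetical mapping from the configured load characteristic to an engine.
ENGINE_FOR_LOAD = {
    LoadCharacteristic.READ_INTENSIVE: IndexEngine.B_PLUS_TREE,
    LoadCharacteristic.WRITE_INTENSIVE: IndexEngine.LSM_TREE,
}


class StorageDevice:
    """Stand-in for the storage device: records the engine used per table."""

    def __init__(self):
        self.engine_of = {}

    def store_with(self, table, engine):
        self.engine_of[table] = engine


class ControlDevice:
    """Stand-in for the control device that exposes the configuration interface."""

    def __init__(self, storage):
        self.storage = storage

    def configure(self, table, load):
        # The user declares the load characteristic; the control device
        # instructs the storage device to use the matching index engine.
        engine = ENGINE_FOR_LOAD[load]
        self.storage.store_with(table, engine)
        return engine
```

In this sketch, calling `configure` again with a different load characteristic simply re-instructs the storage device, which mirrors a user adjusting the engine after the served service changes.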
In a possible implementation, the first index engine includes at least a B+ tree structure, and the second index engine includes at least a log-structured merge tree structure.
In this implementation, the B+ tree structure is a read-friendly index engine, so including it in the first index engine improves how well the first index engine matches a read-intensive load. The log-structured merge tree structure is a write-friendly index engine, so including it in the second index engine improves how well the second index engine matches a write-intensive load.
In a possible implementation, before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine. Instructing the storage device to store the data in the data table according to the first index engine includes: instructing the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, where the structure of the third index engine lies between the structure of the first index engine and the structure of the second index engine; and then instructing the storage device to migrate the index engine of the data table from the third index engine to the first index engine. The third index engine may be referred to as a hybrid index engine.
In this implementation, the index engine of the data table can be switched from the write-friendly index engine to the hybrid index engine, and then from the hybrid index engine to the read-friendly index engine. The index engine is thus switched step by step, which avoids the overhead of switching directly between index engines whose structures differ greatly.
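The step-by-step switch can be sketched as a path through an ordered list of engines. The engine names and the `migration_path` helper below are illustrative assumptions; the text above only describes the direction from the second (write-friendly) engine through the hybrid engine to the first (read-friendly) engine, and the reverse direction is included here purely as an assumption of symmetry.

```python
# Engines ordered from write-friendly to read-friendly (illustrative names).
ENGINE_ORDER = ["lsm_tree", "hybrid", "b_plus_tree"]


def migration_path(current, target):
    """Return the engines to migrate through, one neighboring step at a time."""
    i = ENGINE_ORDER.index(current)
    j = ENGINE_ORDER.index(target)
    if i == j:
        return []  # already using the target engine
    step = 1 if j > i else -1
    # Visit every intermediate engine, ending at the target.
    return [ENGINE_ORDER[k] for k in range(i + step, j + step, step)]


# Write-friendly to read-friendly passes through the hybrid engine first.
print(migration_path("lsm_tree", "b_plus_tree"))
```

Each element of the returned list would correspond to one instruction sent to the storage device, so structurally distant engines are never swapped in a single step.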
In a possible implementation, the data table has a local secondary index, and the configuration interface is further used by the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load. When the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a read-intensive load, the storage device is instructed to store the data under the local secondary index according to the first index engine; when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a write-intensive load, the storage device is instructed to store the data under the local secondary index according to the second index engine.
In this implementation, a user can configure the index engine of a local secondary index in the user's data table. The user can therefore adjust that index engine as the service served by the local secondary index changes, so that the index engine matches the load characteristics caused by the changed service, which improves the access performance of the local secondary index.
In a possible implementation, the data in the data table is stored in the form of key-value pairs.
In this implementation, the method can be applied to a key-value database and can improve the access performance of the key-value database.
In a second aspect, a data storage method is provided. The method is applied to a control device in a storage system; the storage system further includes a storage device, and the storage device stores a data table. The method includes: when the index engine of the data table is a fourth index engine, monitoring, by the control device, the operation amplification of the data table, where the operation amplification includes the read amplification of reading data from the data table or the write amplification of writing data to the data table; when the operation amplification includes the read amplification and the read amplification is greater than a first threshold, instructing the storage device to store the data in the data table according to a fifth index engine, where the fifth index engine matches a read-intensive load better than the fourth index engine does; and when the operation amplification includes the write amplification and the write amplification is greater than a second threshold, instructing the storage device to store the data in the data table according to a sixth index engine, where the sixth index engine matches a write-intensive load better than the fourth index engine does.
This method monitors the read amplification or write amplification of the data table and uses it to determine whether the current index engine matches the load characteristics of the data table and in which direction the index engine should change. The index engine can then be adjusted toward the load characteristics of the data table so that the two match, which improves the access performance of the data table.
In a possible implementation, the operation amplification includes both the read amplification and the write amplification. Instructing the storage device to store the data in the data table according to the fifth index engine when the operation amplification includes the read amplification and the read amplification is greater than the first threshold includes: when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instructing the storage device to store the data in the data table according to the fifth index engine.
In this implementation, when the read amplification is relatively large and the write amplification is relatively small, the index engine of the data table can be adjusted toward a read-friendly index engine, which balances read amplification against write amplification and improves the overall access performance of the data table.
In a possible implementation, the operation amplification includes both the read amplification and the write amplification. Instructing the storage device to store the data in the data table according to the sixth index engine when the operation amplification includes the write amplification and the write amplification is greater than the second threshold includes: when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instructing the storage device to store the data in the data table according to the sixth index engine.
In this implementation, when the write amplification is relatively large and the read amplification is relatively small, the index engine of the data table can be adjusted toward a write-friendly index engine, which balances read amplification against write amplification and improves the overall access performance of the data table.
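The two combined conditions above can be written as one small decision function. This is a sketch under stated assumptions: the function name, the string return values, and the order in which the two conditions are checked are hypothetical; the text does not specify how the amplification values are measured or which condition takes precedence if the thresholds are chosen so that both hold.

```python
def choose_adjustment(read_amp, write_amp, t1, t2, t3, t4):
    """Decide the direction in which to adjust the index engine.

    t1..t4 stand for the first to fourth thresholds in the text.
    Returns "read_friendly" (move toward the fifth index engine),
    "write_friendly" (move toward the sixth index engine), or None
    (keep the current, fourth index engine).
    """
    # Read amplification is high while write amplification is still low.
    if read_amp > t1 and write_amp < t3:
        return "read_friendly"
    # Write amplification is high while read amplification is still low.
    if write_amp > t2 and read_amp < t4:
        return "write_friendly"
    # Both high or both low: no adjustment in this sketch.
    return None
```

Requiring the opposite amplification to be small is what keeps the adjustment from trading one problem for another, which is the balancing effect described above.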
In a possible implementation, the fifth index engine includes at least a B+ tree structure, and the sixth index engine includes at least a log-structured merge tree structure.
In this implementation, the B+ tree structure is a read-friendly index engine, so including it in the fifth index engine improves how well the fifth index engine matches a read-intensive load. The log-structured merge tree structure is a write-friendly index engine, so including it in the sixth index engine improves how well the sixth index engine matches a write-intensive load.
In a possible implementation, the data table has a local secondary index (LSI), and the operation amplification is the amplification produced by operating on the data under the local secondary index. Instructing the storage device to store the data in the data table according to the fifth index engine includes: instructing the storage device to store the data under the local secondary index according to the fifth index engine. Alternatively, instructing the storage device to store the data in the data table according to the sixth index engine includes: instructing the storage device to store the data under the local secondary index according to the sixth index engine.
In this implementation, the index engine of a local secondary index in the data table can be adjusted according to the operation amplification of that index engine, so that the index engine matches the load characteristics of the local secondary index, which improves the access performance of the local secondary index in the data table.
In a third aspect, a data storage device is provided. The data storage device is configured in a control device in a storage system; the storage system further includes a storage device, and the storage device stores a data table of a user. The data storage device includes: a providing module, configured to provide a configuration interface through which the user configures the load characteristic of the data table as a read-intensive load or a write-intensive load; and an instruction module, configured to instruct the storage device to store the data in the data table according to a first index engine when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, where the first index engine matches the read-intensive load. The instruction module is further configured to instruct the storage device to store the data in the data table according to a second index engine when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, where the second index engine matches the write-intensive load.
In a possible implementation, the first index engine includes at least a B+ tree structure, and the second index engine includes at least a log-structured merge tree structure.
In a possible implementation, before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine. The instruction module is further configured to: instruct the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, where the structure of the third index engine lies between the structure of the first index engine and the structure of the second index engine; and then instruct the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
In a possible implementation, the data table has a local secondary index, and the configuration interface is further used by the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load. The instruction module is further configured to: when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a read-intensive load, instruct the storage device to store the data under the local secondary index according to the first index engine; and when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a write-intensive load, instruct the storage device to store the data under the local secondary index according to the second index engine.
In a fourth aspect, a data storage device is provided. The data storage device is configured in a control device in a storage system; the storage system further includes a storage device, and the storage device stores a data table. The data storage device includes: a monitoring module, configured to monitor the operation amplification of the data table when the index engine of the data table is a fourth index engine, where the operation amplification includes the read amplification of reading data from the data table or the write amplification of writing data to the data table; and an instruction module, configured to instruct the storage device to store the data in the data table according to a fifth index engine when the operation amplification includes the read amplification and the read amplification is greater than a first threshold, where the fifth index engine matches a read-intensive load better than the fourth index engine does. The instruction module is further configured to instruct the storage device to store the data in the data table according to a sixth index engine when the operation amplification includes the write amplification and the write amplification is greater than a second threshold, where the sixth index engine matches a write-intensive load better than the fourth index engine does.
In a possible implementation, the operation amplification includes both the read amplification and the write amplification. The instruction module is configured to: when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instruct the storage device to store the data in the data table according to the fifth index engine.
In a possible implementation, the operation amplification includes both the read amplification and the write amplification. The instruction module is configured to: when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instruct the storage device to store the data in the data table according to the sixth index engine.
In a possible implementation, the fifth index engine includes at least a B+ tree structure, and the sixth index engine includes at least a log-structured merge tree structure.
In a possible implementation, the data table has a local secondary index (LSI), and the operation amplification is the amplification produced by operating on the data under the local secondary index. The instruction module is configured to: instruct the storage device to store the data under the local secondary index according to the fifth index engine; or instruct the storage device to store the data under the local secondary index according to the sixth index engine.
In a fifth aspect, a computing device cluster is provided, including at least one computing device, where each computing device includes a processor and a memory. The processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method provided in the first aspect or the method provided in the second aspect.
In a sixth aspect, a computer program product containing instructions is provided. When the instructions are run by a computing device cluster, the computing device cluster performs the method provided in the first aspect or the method provided in the second aspect.
In a seventh aspect, a computer-readable storage medium is provided, including computer program instructions. When the computer program instructions are executed by a computing device cluster, the computing device cluster performs the method provided in the first aspect or the method provided in the second aspect.
The data storage method and device provided in the embodiments of the present application allow a user to configure the index engine of a data table, or allow the index engine of the data table to be adjusted according to the load characteristics of the data table, so that the index engine matches the load characteristics of the data table and the access performance of the data table improves.
Brief Description of the Drawings
FIG. 1A is a schematic structural diagram of a log-structured merge tree;
FIG. 1B is a schematic structural diagram of a B+ tree;
FIG. 2 is a schematic structural diagram of a storage system provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an initial load characteristic configuration submodule provided in an embodiment of the present application;
FIG. 4 is a flowchart of a data storage solution provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an index engine provided in an embodiment of the present application;
FIG. 6 is a flowchart of a data storage solution provided in an embodiment of the present application;
FIG. 7 is a flowchart of a data storage solution provided in an embodiment of the present application;
FIG. 8 is a flowchart of a data storage method provided in an embodiment of the present application;
FIG. 9 is a flowchart of a data storage method provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a data storage device provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a data storage device provided in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a computing device provided in an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a computing device cluster provided in an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a computing device cluster provided in an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a computing device provided in an embodiment of the present application;
FIG. 16 is a schematic structural diagram of a computing device cluster provided in an embodiment of the present application;
FIG. 17 is a schematic structural diagram of a computing device cluster provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. In the embodiments of the present application, "multiple" means "at least two".
A key-value database (key-value store) stores data as key-value pairs, where the key is the unique identifier of the data and the value is the data content, which can be anything from a simple object to a complex composite object. A key-value database stores data in a simple form, provides distributed processing capabilities, and responds quickly. In addition, key-value databases are non-relational databases (not only SQL, NoSQL) and can handle large-scale data storage. Key-value databases are therefore widely used in computer systems, especially in the field of cloud storage.
A key-value database can serve multiple users. For example, a user may be a tenant of a cloud storage service. A user queries data in the database through an index engine, which is a data storage structure for querying data efficiently. In the embodiments of the present application, an index engine may also be referred to as a data storage structure.
Specifically, a user can create a data table in the key-value database to store and manage the user's data. The data table uses a local primary index (LPI) to establish a mapping from key to value, and uses an index engine to store that mapping and thereby store the data in the data table. The key identifies the data and may be the name of the data. The value represents the data and can be defined according to service requirements, for example, a student's grades in one or more subjects. Using a local primary index to establish the mapping from key to value means that the keys and values are stored in the manner defined by the local primary index.
In some embodiments, a data table may consist of one or more data partition instances. A partition instance stores the data in a specified key range, and that data may be referred to as the data in the partition instance. A partition instance can use a local primary index to establish the mapping from key to value within its key range and thereby store the data in the partition instance. When a data table consists of multiple data partition instances, the local primary indexes of those instances may use the same index engine; that is, multiple data partition instances in the same data table may store data with the same index engine. The index engine of the local primary index of the data partition instances in a data table may therefore be referred to as the index engine of the data table.
In some embodiments, a value may include multiple pieces of information (for example, when the value is a complex composite object). To improve query efficiency, a local secondary index (LSI) is usually built to establish a mapping from one of those pieces of information to the key. That piece of information may be referred to as a sub-value. The local secondary index also uses an index engine to store the mapping from sub-value to key.
When querying the values corresponding to one or more keys, the keys can first be filtered in the local secondary index by their sub-values; the values are then looked up in the local primary index using the filtered keys, which improves query efficiency. Take a data table or data partition instance that stores student grades as an example, where the key is the student ID and the value includes the student's grades in multiple subjects such as Chinese, mathematics, English, physics, and chemistry. The local primary index maps each student ID to the grades in those subjects. The local secondary index maps Chinese grades to student IDs; that is, the sub-value is the Chinese grade. Suppose the user needs the grades in every subject for students whose Chinese grade is greater than 90. The student IDs corresponding to Chinese grades greater than 90 are first filtered out in the local secondary index, and the values are then queried in the local primary index using those student IDs, which yields the grades in every subject for the students whose Chinese grade is greater than 90. The local secondary index thus improves query efficiency.
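The student-grade example can be traced with a small, self-contained Python sketch. Plain dictionaries and a sorted list stand in for the index engines here, which is an assumption made only for illustration; the student IDs and grades are made-up sample data.

```python
import bisect

# Local primary index: student ID (key) -> grades in each subject (value).
primary_index = {
    "s001": {"chinese": 95, "math": 80},
    "s002": {"chinese": 72, "math": 91},
    "s003": {"chinese": 91, "math": 66},
}

# Local secondary index: Chinese grade (sub-value) -> student ID (key),
# kept sorted by grade so that a range filter does not scan every entry.
secondary_index = sorted((v["chinese"], k) for k, v in primary_index.items())
secondary_grades = [grade for grade, _ in secondary_index]


def grades_where_chinese_above(min_grade):
    """Filter keys in the secondary index, then fetch values from the primary index."""
    # First position whose Chinese grade is strictly greater than min_grade.
    start = bisect.bisect_right(secondary_grades, min_grade)
    keys = [student_id for _, student_id in secondary_index[start:]]
    return {student_id: primary_index[student_id] for student_id in keys}


result = grades_where_chinese_above(90)
```

Only the entries above the cut-off are touched in the secondary index, and the primary index is then consulted once per matching key, which is the two-step lookup the paragraph describes.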
One or more local secondary indexes can be built for a data table. The sub-values of different local secondary indexes may be different or partially the same.
In the following description, a local primary index may be referred to simply as a primary index, and a local secondary index simply as a secondary index. When no distinction between the two is needed, either may be referred to simply as an index.
The examples above introduce the indexes in a key-value database. The load characteristics of a key-value database are introduced next.
Access operations on a database include read operations and write operations. Accordingly, load characteristics include the read-intensive load, the write-intensive load, and the mixed load. A mixed load lies between a read-intensive load and a write-intensive load: read operations and write operations occur with similar frequency, or in other words, reads and writes are relatively balanced.
When the ratio of the frequency of read operations to the frequency of write operations is greater than a threshold A1, the load characteristic is a read-intensive load. Threshold A1 is a preset value. Exemplarily, threshold A1 is greater than or equal to 4. In one example, threshold A1 is 9. When the ratio of the frequency of write operations to the frequency of read operations is greater than a threshold A2, the load characteristic is a write-intensive load. Threshold A2 is a preset value. Exemplarily, threshold A2 is greater than or equal to 4. In one example, threshold A2 is 9. When the ratio of the read frequency to the write frequency is not greater than threshold A1, and the ratio of the write frequency to the read frequency is not greater than threshold A2, the load characteristic is a mixed load.
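The threshold rule above can be sketched as a small classifier. The function name `classify_workload` and its parameters are hypothetical; the defaults use the example value 9 for both A1 and A2. The comparison is written with multiplication rather than division so that a zero frequency does not cause a division error, which is a choice made for this sketch only.

```python
def classify_workload(read_freq, write_freq, a1=9.0, a2=9.0):
    """Classify a load as read-intensive, write-intensive, or mixed.

    read_freq / write_freq > a1  is checked as  read_freq > a1 * write_freq,
    and likewise for the write-to-read ratio against a2.
    """
    if read_freq > a1 * write_freq:
        return "read-intensive"
    if write_freq > a2 * read_freq:
        return "write-intensive"
    return "mixed"


examples = {
    "mostly reads": classify_workload(950, 50),
    "mostly writes": classify_workload(50, 950),
    "balanced": classify_workload(600, 400),
    "ratio exactly 9": classify_workload(900, 100),
}
```

A ratio exactly equal to the threshold is not "greater than" it, so the last case falls into the mixed category.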
An access operation on an index, that is, an access operation performed in the index engine used by the index, imposes a load characteristic on that index. Different access operations cause the index to carry different load characteristics. The load characteristic resulting from access operations on the local primary index is called the primary index load characteristic, and the load characteristic resulting from access operations on the local secondary index is called the secondary index load characteristic.
The load characteristics of a data table depend on the type of service the data table serves and on the type of user behavior (for example, the user's access operations on the data table, or querying data in the data table through a local secondary index). For example, for a data table serving a live-streaming service, during the peak period of the service (such as 8 p.m. to 10 p.m.) the data table is read and written at high frequency, and the load characteristic at this time is a mixed load. During the off-peak period (such as midnight to 6 a.m.), the user dumps and backs up the data in the data table, the data is read at high frequency, and the data table carries a read-intensive load. For a data table serving a data backup service, the load characteristic is usually a write-intensive load. However, when that data table is used to restore data for a data table that has lost data, a large amount of data must be read from it, and its load characteristic changes to a read-intensive load.
Local secondary indexes provide fast access to the data in a data table. Different local secondary indexes may record different sub-values of the data table, so the data can be queried through different sub-values. The load characteristics of the different local secondary indexes within a data table therefore depend on the user's query behavior. For example, consider a data table containing two local secondary indexes, denoted LSI 1 and LSI 2. Within a certain period of time, if the user accesses the data table using only the sub-value recorded in LSI 1 as the query condition, LSI 1 carries a mixed load and LSI 2 carries a write-intensive load. Conversely, if the user uses only the sub-value recorded in LSI 2 as the query condition, LSI 1 carries a write-intensive load and LSI 2 carries a mixed load.
It is easy to see that the access operations on a secondary index are not fixed either, so the load characteristics of the same secondary index may differ at different times.
Therefore, the load characteristics of both the primary index and the secondary index usually change dynamically.
In addition, in the following description, when no particular distinction is made between the primary index load characteristic and the secondary index load characteristic, they may be referred to simply as the data table load characteristic.
Some index engines are suited to read operations: read amplification is small, read latency is low, and the read experience is good. Such an index engine can be called a read-friendly index engine. Other index engines are suited to write operations: write amplification is small, write latency is low, and the write experience is good. Such an index engine can be called a write-friendly index engine.
Specifically, a write-friendly index engine writes data by appending (append), which gives fast writes, low write latency, and high write performance. However, appending delays the update of data in the data table, so multiple historical versions of a key-value pair are retained in the data table, which increases read amplification and leads to high read latency.
The log structured merge tree (LSM tree) is a typical write-friendly index engine. An LSM tree stores data in the form of a log. As shown in FIG. 1A, an LSM tree includes a data area and an index area (manifest). The data area is the region of the LSM tree in which data is stored; it may reside on a hard disk to provide persistent storage. The data area includes multiple storage layers arranged from top to bottom (layers C1, C2, C3, C4, C5, and C6 in FIG. 1A). The higher a layer, the smaller its storage space; the lower a layer, the larger its storage space. When data is written, it is first written to the top layer, layer C1. When the amount of data in layer C1 reaches a preset value D1, the data in layer C1 is merged (compaction) with the data in the next layer down (layer C2), and the merged data is handed over to layer C2. When the amount of data in layer C2 reaches a preset value D2, the data in layer C2 is merged with the data in the next layer down (layer C3), and the merged data is handed over to layer C3, and so on, so that old data is continually handed down to lower layers and new data can continually be written to the top layer. In addition, the index area can be used to speed up locating the storage layer in which a given key-value pair resides.
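The layered write-and-compact behaviour described above can be sketched with a toy in-memory model. This is a simplification under stated assumptions, not a real LSM tree: the class `ToyLSM`, the use of Python dictionaries for layers, and the tiny limits standing in for D1 and D2 are all hypothetical, and the index area (manifest) is omitted.

```python
class ToyLSM:
    """Toy model of layered storage: levels[0] plays the role of layer C1."""

    def __init__(self, level_limits=(2, 4, 8)):
        self.level_limits = level_limits          # stand-ins for D1, D2, ...
        self.levels = [dict() for _ in level_limits]

    def put(self, key, value):
        # New data always goes to the top layer first.
        self.levels[0][key] = value
        self._compact()

    def _compact(self):
        # When a layer reaches its limit, merge it into the next layer down.
        for i in range(len(self.levels) - 1):
            if len(self.levels[i]) >= self.level_limits[i]:
                merged = dict(self.levels[i + 1])
                merged.update(self.levels[i])     # the upper layer is newer and wins
                self.levels[i + 1] = merged
                self.levels[i] = {}

    def get(self, key):
        # Search from the top layer down; the first hit is the newest version.
        for level in self.levels:
            if key in level:
                return level[key]
        return None


lsm = ToyLSM()
lsm.put("a", 1)
lsm.put("b", 2)          # layer C1 reaches its limit and is merged into C2
lsm.put("a", 3)          # a newer version of "a" now sits above the old one
newest_a = lsm.get("a")
stale_a_in_c2 = lsm.levels[1]["a"]
lsm.put("c", 4)          # second compaction discards the stale version of "a"
```

Between the two compactions, key "a" exists in two layers at once, which illustrates why a read may have to inspect more than one layer.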
A read-friendly index engine stores data using timely updates. Only a single version of each key-value pair is retained, so a read-friendly index engine favors data reads and is better suited to read operations. When data is written to a read-friendly index engine, however, the previous version of the key-value pair must be read out, updated, and then written back, which incurs a large write overhead. A read-friendly index engine is therefore not well suited to write operations and causes write amplification.
The B+ tree is a typical read-friendly index engine. As shown in FIG. 1B, the data area consists of leaf nodes, and key-value pairs are stored in the leaf nodes in sorted order. The B+ tree has an index area, which is used to speed up locating the leaf node in which a given key-value pair resides.
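The role of the index area in locating a leaf can be sketched with a two-level structure. This is a minimal illustration rather than a full B+ tree: the fixed `leaves` and `separators` lists and the function `lookup` are hypothetical, and node splitting, insertion, and multi-level index nodes are left out.

```python
import bisect

# Data area: leaf nodes holding key-value pairs in sorted order (sample data).
leaves = [
    [(1, "a"), (3, "b")],
    [(5, "c"), (7, "d")],
    [(9, "e"), (11, "f")],
]

# Index area: separators[i] is the smallest key stored in leaves[i + 1].
separators = [5, 9]


def lookup(key):
    """Use the index area to pick one leaf, then search only that leaf."""
    leaf = leaves[bisect.bisect_right(separators, key)]
    keys = [k for k, _ in leaf]
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return leaf[pos][1]
    return None


found = lookup(7)
boundary = lookup(5)
absent = lookup(4)
```

Each key lives in exactly one leaf, so a read touches a single version of the data, in contrast to the layered structure of the LSM tree.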
Because the load characteristics of a data table may change, from read-intensive to write-intensive or from write-intensive to read-intensive, a mismatch between the index engine and the load characteristics may occur regardless of whether the data table uses a read-friendly or a write-friendly index engine. Such a mismatch increases access latency and affects service operation and user experience.
In view of the above, the present application provides a data storage solution that offers an index engine configuration interface, so that a user can configure the index engine of a data table at any time. The data in the data table can then be stored according to the data storage structure, that is, the index engine, configured by the user. It is easy to see that the load characteristics of a data table are related to the service the data table serves. For example, for a data table serving a data backup service, the load characteristic is usually a write-intensive load. When that data table is used to restore data for a data table that has lost data, a large amount of data must be read from it, and its load characteristic changes to a read-intensive load. The user can set, or knows in advance, which service the data table serves at which time. The user can therefore, when or before the data table switches the service it serves, configure an index engine that matches the load characteristics of the new service, so that the data table provides better access performance and lower read or write latency.
Next, the data storage solution provided in the embodiments of the present application is described by way of example.
First, an embodiment of the present application provides a storage system 100 that can be used to implement the data storage solution. As shown in FIG. 2, the storage system 100 includes a control device 110 and a storage device 120.
The storage device 120 is an apparatus or device for persistently storing data. In some embodiments, the storage device 120 may be a hard disk (disk), such as a solid state disk (SSD). In other embodiments, the storage device 120 may be another form of apparatus or device capable of persistently storing data. The embodiments of the present application do not limit the specific implementation form of the storage device 120.
As shown in FIG. 2, the storage device 120 may include multiple data tables, such as data table T1 and data table T2. Data table T1 and data table T2 may belong to a database, which may specifically be a key-value database. Among the multiple data tables, some may belong to the same user and others may belong to different users. As shown in FIG. 2, a data table, for example data table T1, may include a local primary index. In some embodiments, data table T1 may also include a local secondary index B1 and a local secondary index B2.
The control module 110 is a module or component with data processing capability. In some embodiments, the control module 110 may be a physical device, such as a server or a processor. In some embodiments, the control module 110 may be a virtual device, such as a virtual machine (VM) or a container. The embodiments of the present application do not limit the specific implementation form of the control module 110.
The control module 110 is used to control or adjust the index engine of a data table in the storage device 120. As shown in FIG. 2, the control module 110 may include a configuration module 111 and a processing module 112. The configuration module 111 allows the user to configure the load characteristics of a data table. The processing module 112 can, based on the load characteristics configured by the user, configure an index engine that matches those load characteristics and instruct the storage device 120 to store data according to that index engine. More specifically, the configuration module 111 can provide a configuration interface to the user, through which the user inputs the load characteristics. The processing module 112 can obtain the load characteristics input by the user from the configuration module 111, configure an index engine that matches them, and then instruct the storage device 120 to store the data in the user's data table according to that index engine.
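The interaction between the configuration module, the processing module, and the storage device can be sketched as follows. All class and method names here are hypothetical. The mapping from load characteristic to index engine is also an assumption: the text names the LSM tree as write-friendly and the B+ tree as read-friendly, while the entry for the mixed load is only a placeholder.

```python
# Assumed mapping from load characteristic to index engine. The entry for the
# mixed load is a placeholder; the surrounding text does not specify it.
ENGINE_FOR_LOAD = {
    "read-intensive": "B+ tree",
    "write-intensive": "LSM tree",
    "mixed": "intermediate structure",
}


class StorageDevice:
    """Stands in for storage device 120: records the engine used per data table."""

    def __init__(self):
        self.engine_of = {}

    def apply_engine(self, table, engine):
        self.engine_of[table] = engine


class ProcessingModule:
    """Stands in for processing module 112: picks an engine and instructs storage."""

    def __init__(self, storage):
        self.storage = storage

    def handle(self, table, load_characteristic):
        engine = ENGINE_FOR_LOAD[load_characteristic]
        self.storage.apply_engine(table, engine)
        return engine


class ConfigurationModule:
    """Stands in for configuration module 111: receives the user's input."""

    def __init__(self, processing):
        self.processing = processing

    def configure(self, table, load_characteristic):
        return self.processing.handle(table, load_characteristic)


storage = StorageDevice()
config = ConfigurationModule(ProcessingModule(storage))
first_engine = config.configure("T1", "write-intensive")
second_engine = config.configure("T1", "read-intensive")
```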
In some embodiments, as shown in FIG. 2, the configuration module 111 may include an initial load characteristic configuration submodule 111A, and the processing module 112 may include an index engine initialization submodule 112A. When the user creates a data table in the database 121, the initial load characteristic configuration submodule 111A can provide the user with an initial load characteristic configuration interface, through which the user can input the initial load characteristic. The index engine initialization submodule 112A can, based on the initial load characteristic, configure an index engine that matches it. The index engine initialization submodule 112A can then instruct the storage device 120 to use the configured index engine as the initial index engine of the data table newly created by the user.
In some embodiments, as shown in FIG. 3, the initial load characteristic configuration submodule 111A may include a primary index initial load characteristic configuration submodule 111A1. When the user creates a data table in the database 121, the submodule 111A1 can provide the user with a primary index initial load characteristic configuration interface, through which the user can input the initial load characteristic of the primary index. The index engine initialization submodule 112A can, based on the initial load characteristic of the primary index, configure an index engine that matches it. The index engine initialization submodule 112A can then instruct the storage device 120 to use the configured index engine for the initial primary index of the data table newly created by the user.
In some embodiments, as shown in FIG. 3, the initial load characteristic configuration submodule 111A may include a secondary index initial load characteristic configuration submodule 111A2. When the user creates a secondary index in a data table, the submodule 111A2 can provide the user with a secondary index initial load characteristic configuration interface, through which the user can input the initial load characteristic of the secondary index. The index engine initialization submodule 112A can, based on the initial load characteristic of the secondary index, configure an index engine that matches it. The index engine initialization submodule 112A can then instruct the storage device 120 to use the configured index engine for the initial secondary index of the data table newly created by the user.
Returning to FIG. 2, in some embodiments, the configuration module 111 may include a load characteristic adjustment submodule 111B, and the processing module 112 may include an index engine adjustment submodule 112B. After the data table has been created, the load characteristic adjustment submodule 111B can provide the user with a load characteristic adjustment interface, through which the user can input a new load characteristic. The index engine adjustment submodule 112B can, based on the new load characteristic, configure an index engine that matches it, and instruct the storage device 120 either to migrate the index engine of the user's data table toward the index engine that matches the new load characteristic, or to switch the index engine of the user's data table to the index engine that matches the new load characteristic.
In the embodiments of the present application, migration can be understood as a gradual change. For example, suppose there are a first structure, a second structure, and a third structure, where the first and second structures differ considerably and the third structure lies between them; that is, the difference between the first and third structures, and the difference between the second and third structures, are both smaller than the difference between the first and second structures. Migrating from the first structure to the second structure then means first switching from the first structure to the third structure and then switching from the third structure to the second structure. This avoids the index engine change overhead caused by switching directly between structures that differ greatly.
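The difference between migrating and switching can be sketched by placing the structures on an ordered spectrum and stepping through the neighbours. The list `SPECTRUM` and the function `migration_path` are hypothetical names, and modelling the structures as positions in a list is an assumption made only for this illustration.

```python
# Structures ordered by similarity: the third structure lies between the
# first and the second, as in the example above.
SPECTRUM = ["first_structure", "third_structure", "second_structure"]


def migration_path(current, target):
    """Return the sequence of structures visited when migrating step by step."""
    i, j = SPECTRUM.index(current), SPECTRUM.index(target)
    step = 1 if j >= i else -1
    return [SPECTRUM[k] for k in range(i, j + step, step)]


forward = migration_path("first_structure", "second_structure")
backward = migration_path("second_structure", "first_structure")
```

A direct switch would jump from the first entry of the path to the last; migration visits every structure in between.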
In some embodiments, as shown in FIG. 2, the control module 110 further includes a load characteristic sensing module 113, which can sense the load characteristics of a data table. Specifically, the load characteristic sensing module 113 can sense the operation amplification of operations performed on the data table. When the operation amplification is greater than a preset threshold, it can be determined that the current index engine of the data table does not match the current load characteristics of the data table and that the index engine needs to be adjusted. The type of the current load characteristic can be determined from the specific type of the operation.
Specifically, operations may include read operations and write operations, and correspondingly, operation amplification includes read amplification and write amplification. Operation amplification is the ratio of the amount of data actually operated on to the amount of data that needed to be operated on. Read amplification is the ratio of the amount of data actually read to the amount of data that needed to be read, and write amplification is the ratio of the amount of data actually written to the amount of data that needed to be written.
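The definition of amplification as a ratio can be written out directly. The function name `amplification`, the byte-based units, and the sample figures are assumptions for illustration only.

```python
def amplification(actual_bytes, requested_bytes):
    """Ratio of the data actually read or written to the data that was requested."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    return actual_bytes / requested_bytes


# Hypothetical figures: 4096 bytes are read from storage to serve a 128-byte value.
read_amp = amplification(4096, 128)
# Hypothetical figures: 1024 bytes are written to storage to persist a 256-byte update.
write_amp = amplification(1024, 256)
```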
The load characteristic sensing module 113 can sense the read amplification of read operations on a data table. When the read amplification is greater than a preset threshold A3, it can be determined that the structure of the data table's current index engine does not match the current load characteristics and that the index engine needs to be adjusted toward a read-friendly index engine. The load characteristic sensing module 113 can also sense the write amplification of write operations on the data table. In one example, when the read amplification is greater than the preset threshold A3 and the write amplification is less than a preset threshold A4, it can be determined that the structure of the data table's current index engine does not match the data table's current load characteristics and that the index engine needs to be adjusted toward a read-friendly index engine. Adjusting the index engine toward a read-friendly index engine means making the structure of the adjusted index engine better suited to, or a better match for, a read-intensive load than the structure before adjustment.
Threshold A3 and threshold A4 can be preset based on experience or experiments. In one example, threshold A3 may be 20 and threshold A4 may be 10. In another example, threshold A3 may be 30 and threshold A4 may be 15. In yet another example, threshold A3 may be 40 and threshold A4 may be 20. The embodiments of the present application do not limit the specific values of threshold A3 and threshold A4.
The load characteristic sensing module 113 can sense the write amplification of write operations on a data table. When the write amplification is greater than a preset threshold A5, it can be determined that the data table's current index engine does not match the data table's current load characteristics and that the index engine needs to be adjusted toward a write-friendly index engine. In one example, when the write amplification is greater than the preset threshold A5 and the read amplification is less than a preset threshold A6, it can be determined that the data table's current index engine does not match the data table's current load characteristics and that the index engine needs to be adjusted toward a write-friendly index engine. Adjusting the index engine toward a write-friendly index engine means making the structure of the adjusted index engine better suited to, or a better match for, a write-intensive load than the structure before adjustment.
Threshold A5 and threshold A6 can be preset based on experience or experiments. In one example, threshold A5 may be 20 and threshold A6 may be 10. In another example, threshold A5 may be 30 and threshold A6 may be 15. In yet another example, threshold A5 may be 40 and threshold A6 may be 20. The embodiments of the present application do not limit the specific values of threshold A5 and threshold A6.
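The combined checks against thresholds A3 to A6 can be sketched as a single decision function. The function name `adjustment_direction` and its return strings are hypothetical; the defaults use the first set of example values (A3 = 20, A4 = 10, A5 = 20, A6 = 10), and the sketch implements the two-condition variant of each check.

```python
def adjustment_direction(read_amp, write_amp, a3=20, a4=10, a5=20, a6=10):
    """Decide in which direction, if any, the index engine should be adjusted."""
    # High read amplification with low write amplification: the engine is too
    # write-oriented for the current load.
    if read_amp > a3 and write_amp < a4:
        return "toward read-friendly"
    # High write amplification with low read amplification: the engine is too
    # read-oriented for the current load.
    if write_amp > a5 and read_amp < a6:
        return "toward write-friendly"
    # Otherwise no mismatch is detected by these checks.
    return None


decisions = [
    adjustment_direction(25, 5),
    adjustment_direction(5, 25),
    adjustment_direction(25, 25),
    adjustment_direction(5, 5),
]
```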
The above describes the storage system 100 by way of example. Next, the data storage solution provided in the embodiments of the present application is described by way of example with reference to the storage system 100. It is assumed that data table T1 corresponds to user 200 and is used to store the data of the services of user 200.
Referring to FIG. 4, the control module 110 may perform step 401 to provide a configuration interface to user 200. The configuration interface is used by the user to input the load characteristic of a data table. The load characteristic input through the configuration interface may be a read-intensive load, a write-intensive load, or a mixed load. In other words, the configuration interface lets the user configure whether the load characteristic of data table T1 is a read-intensive load, a write-intensive load, or a mixed load. Exemplarily, if the user does not input a load characteristic through the configuration interface, that is, if the control module 110 does not receive a load characteristic input by the user, the control module 110 may treat the load characteristic configured by the user as the default characteristic. In one example, the default characteristic may be a mixed load.
User 200 may perform step 403 to input a load characteristic E1. The load characteristic E1 may specifically be a read-intensive load, a write-intensive load, or a mixed load.
In some embodiments, the storage system 100 may further include a client (not shown) located on the side of user 200. In step 401, the control device 110 may provide the configuration interface to the client, and the user may enter input on the client to supply the load characteristic E1 to the configuration interface. The foregoing is only an example of how the user inputs a load characteristic to the configuration interface and does not constitute a limitation. Other approaches supported by the prior art may also be used, and they are not described one by one here.
In some embodiments, when user 200 creates data table T1, the control device 110 may provide the configuration interface to user 200. In this case the configuration interface is used by user 200 to configure the initial load characteristic; that is, the load characteristic E1 is the initial load characteristic.
In some embodiments, while data table T1 is in use, the control device 110 may provide the configuration interface to user 200. In this case the configuration interface is used by user 200 to adjust the load characteristic; that is, the load characteristic E1 is a load characteristic actively adjusted by the user. It is easy to see that the user can direct data table T1 to serve different services in different time periods, and different services give rise to different load characteristics. The user can therefore, as the service served by data table T1 changes, input through the configuration interface the load characteristic that results from the new service. In other words, the user can actively adjust the load characteristic.
In some embodiments, the configuration interface can be used by the user to configure the load characteristic of the primary index and/or the load characteristic of the secondary index. That is, the load characteristic E1 may be the load characteristic of the primary index, the load characteristic of the secondary index, or both.
In some embodiments, the configuration interface may specifically be an application programming interface (API).
In one example, the configuration interface is used by user 200 to configure the initial load characteristic of the primary index, and the function of the configuration interface may be InitTableStore(workloadType type). The parameter workloadType type takes one of the three values read, write, and default. When the value is read, the load characteristic E1 is a read-intensive load. When the value is write, the load characteristic E1 is a write-intensive load. When the value is default, the load characteristic E1 is a mixed load.
在一个示例中,配置接口具体用于用户200配置二级索引的初始化负载特性,配置接口的函数可以为InitIndexStore(workloadType type)。其中,参数workloadType type在read、write、default这三者中取值。其中,当workloadType type的参数值为read时,负载特性E1为读密集型负载。当workloadType type的参数值为write时,负载特性E1为写密集型负载。当workloadType type的参数值为default时,负载特性E1为混合型负载。In one example, the configuration interface is specifically used for user 200 to configure the initialization load characteristics of the secondary index, and the function of the configuration interface may be InitIndexStore(workloadType type). The parameter workloadType type takes values from read, write, and default. When the parameter value of workloadType type is read, the load characteristic E1 is a read-intensive load. When the parameter value of workloadType type is write, the load characteristic E1 is a write-intensive load. When the parameter value of workloadType type is default, the load characteristic E1 is a mixed load.
在一个示例中,配置接口用于用户200调整负载特性,配置接口的函数可以为ChangeStore(workloadType oldType,workloadType newType)。其中,参数workloadType oldType表示调整前的负载特性,参数workloadType newType表示调整后的负载特性。其中,负载特性E1为调整后的负载特性,即参数workloadType newType表示负载特性E1。其中,参数workloadType oldType和参数workloadType newType均可以read、write、default这三者中取值。如上所述,当取值为read时,负载特性为读密集型负载。当取值为write时,负载特性为写密集型负载。当取值为default时,负载特性E1为混合型负载。In one example, the configuration interface is used by user 200 to adjust the load characteristics, and the function of the configuration interface may be ChangeStore(workloadType oldType, workloadType newType). The parameter workloadType oldType represents the load characteristics before adjustment, and the parameter workloadType newType represents the load characteristics after adjustment. The load characteristic E1 is the load characteristic after adjustment, that is, the parameter workloadType newType represents the load characteristic E1. The parameters workloadType oldType and workloadType newType can both take values from read, write, and default. As described above, when the value is read, the load characteristic is a read-intensive load. When the value is write, the load characteristic is a write-intensive load. When the value is default, the load characteristic E1 is a mixed load.
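By way of illustration only, the three interface functions described above might be modeled as follows. This is a minimal sketch, not the implementation of the embodiments: the class ConfigInterface, its attributes, the snake_case method names, the decision to apply change_store to the primary-index load characteristic, and the check that old_type matches the recorded state are all assumptions introduced for the example.

```python
from enum import Enum


class WorkloadType(Enum):
    """The three values listed for the workloadType parameter."""
    READ = "read"        # read-intensive load
    WRITE = "write"      # write-intensive load
    DEFAULT = "default"  # mixed load


class ConfigInterface:
    """Hypothetical in-memory stand-in for the configuration interface."""

    def __init__(self) -> None:
        self.table_load = None  # load characteristic of the primary index
        self.index_load = None  # load characteristic of the secondary index

    def init_table_store(self, type_: WorkloadType) -> None:
        # Python counterpart of InitTableStore(workloadType type).
        self.table_load = type_

    def init_index_store(self, type_: WorkloadType) -> None:
        # Python counterpart of InitIndexStore(workloadType type).
        self.index_load = type_

    def change_store(self, old_type: WorkloadType,
                     new_type: WorkloadType) -> WorkloadType:
        # Python counterpart of ChangeStore(oldType, newType).
        # Assumption: the call is rejected when old_type does not match
        # the recorded state; the text does not specify this behavior.
        if self.table_load is not old_type:
            raise ValueError("old_type does not match the current setting")
        self.table_load = new_type
        return new_type  # the adjusted value plays the role of E1
```

In this sketch a user whose table moves from a query-heavy service to an ingest-heavy one would call change_store(WorkloadType.READ, WorkloadType.WRITE).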
Continuing with FIG. 4, the control device 110 may perform step 405 to determine an index engine E11 that matches the load characteristic E1. When the load characteristic E1 is a read-intensive load, the index engine E11 is a read-friendly index engine. When the load characteristic E1 is a write-intensive load, the index engine E11 is a write-friendly index engine. When the load characteristic E1 is a mixed load, the index engine E11 is a hybrid index engine, whose structure lies between that of the read-friendly index engine and that of the write-friendly index engine.
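The matching performed in step 405 amounts to a three-way lookup. A minimal sketch follows; the string labels for the load characteristics and engine kinds are hypothetical names chosen for the example, not identifiers from the embodiments.

```python
# Hypothetical labels: keys are load characteristics, values are engine kinds.
ENGINE_FOR_LOAD = {
    "read": "read_friendly",    # read-intensive load
    "write": "write_friendly",  # write-intensive load
    "default": "hybrid",        # mixed load
}


def select_engine(load_characteristic: str) -> str:
    """Return the kind of index engine that matches the load characteristic."""
    try:
        return ENGINE_FOR_LOAD[load_characteristic]
    except KeyError:
        raise ValueError(f"unknown load characteristic: {load_characteristic!r}")
```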
In some embodiments, the index engine may be built from a B+ tree structure and a log-structured merge tree (LSM tree) structure. The read-friendly index engine includes at least a B+ tree structure, and the write-friendly index engine includes at least a log-structured merge tree structure. The hybrid index engine may include both a B+ tree structure and a log-structured merge tree structure, with the B+ tree structure located below the log-structured merge tree structure.
FIG. 5 shows several index engines from left to right. Read performance increases from right to left, and write performance increases from left to right. The leftmost index engine may serve as the read-friendly index engine, the rightmost index engine may serve as the write-friendly index engine, and an index engine in the middle may serve as the hybrid index engine.
In an illustrative example, as shown in FIG. 5, the read-friendly index engine is specifically a B+ tree structure. The write-friendly index engine consists of a B+ tree structure and a log-structured merge tree structure, with the B+ tree structure located below the log-structured merge tree structure. The hybrid index engine likewise consists of a B+ tree structure and a log-structured merge tree structure, with the B+ tree structure located below the log-structured merge tree structure. Compared with the write-friendly index engine, the log-structured merge tree structure in the hybrid index engine has fewer storage layers. That is, the log-structured merge tree structure in the write-friendly index engine has more storage layers, and the one in the hybrid index engine has fewer.
As described above, when data is written, it is first written to the top layer, that is, layer C1. When the amount of data in layer C1 reaches the preset value D1, the data in layer C1 is merged (compaction) with the data in the next layer down (layer C2), and the merged data is handed to layer C2. When the amount of data in layer C2 reaches the preset value D2, the data in layer C2 is merged with the data in the next layer down (layer C3), and the merged data is handed to layer C3, and so on. Consequently, the more storage layers the log-structured merge tree structure has, the better the data is aggregated in the layers above the B+ tree and the less data is written into the B+ tree structure, which reduces write amplification and lowers write latency.
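The cascading merge described above can be sketched in a few lines. The sketch makes several simplifying assumptions that are not stated in the text: each layer is modeled as a Python dict, the amount of data in a layer is measured as its number of keys, and the last layer has no limit and stands in for the structure below the log-structured merge tree.

```python
def write(layers, limits, key, value):
    """Write one key into layer C1 and cascade merges downward.

    layers: list of dicts; layers[0] is C1 (top), layers[-1] is the bottom.
    limits: limits[i] is the preset value D(i+1) for layers[i]; the bottom
            layer has no limit (an assumption of this sketch).
    """
    layers[0][key] = value
    for i in range(len(layers) - 1):
        if len(layers[i]) < limits[i]:
            break  # this layer has not reached its preset value
        # Merge: entries from the upper layer are newer, so they overwrite
        # entries with the same key in the lower layer.
        layers[i + 1].update(layers[i])
        layers[i].clear()
```

With three layers and limits [2, 4], repeated updates to the same key are absorbed in C1 and C2 and reach the bottom layer only once, which is the aggregation effect the paragraph refers to.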
On the other hand, the more storage layers the log-structured merge tree structure has, the more historical versions of the data may exist, which causes read amplification and increases read latency. To reduce read amplification and lower read latency, the number of storage layers of the log-structured merge tree structure therefore needs to be reduced. By adjusting the number of storage layers in the log-structured merge tree structure, index engines with different read and write performance can thus be built, namely the read-friendly index engine, the write-friendly index engine, and the hybrid index engine. The number of storage layers of the log-structured merge tree structure in the hybrid index engine may be decreased or increased, so that the hybrid index engine leans toward read-friendly or write-friendly.
Returning to FIG. 4, once the index engine E11 has been determined in step 405, the control device 110 may perform step 407 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E11. Specifically, as shown in FIG. 4, the control device 110 may perform step 4071 to send indication information to the storage device 120, where the indication information may include the identifier of the index engine E11. The storage device 120 may perform step 4072 to store, in response to the indication information, the data in the data table T1 according to the index engine E11.
When the load characteristic E1 is the load characteristic of the primary index, the control device 110 instructs the storage device 120 to store, according to the index engine E11, either the data in the entire data table T1 or the data in the data partition instance corresponding to that primary index. Specifically, in addition to the identifier of the index engine E11, the indication information may include the identifier of the data table T1 or the identifier of the data partition instance. Based on that identifier, the storage device 120 stores the data in the entire data table T1, or the data in the data partition instance corresponding to the primary index, according to the index engine E11.
When the load characteristic E1 is the load characteristic of a secondary index, the control device 110 instructs the storage device 120 to store the data under that secondary index according to the index engine E11. Specifically, in addition to the identifier of the index engine E11, the indication information may include the identifier of the secondary index. Based on the identifier of the secondary index, the storage device 120 stores the data under that secondary index according to the index engine E11.
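For illustration, the indication information sent in step 4071 could be represented as a small record that carries the engine identifier together with at most one target identifier. The field names, the Indication class, and the target_scope helper are hypothetical; the text only states which identifiers the indication information may include.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Indication:
    """Hypothetical shape of the indication information."""
    engine_id: str                             # identifier of index engine E11
    table_id: Optional[str] = None             # identifier of data table T1
    partition_id: Optional[str] = None         # identifier of a data partition instance
    secondary_index_id: Optional[str] = None   # identifier of a secondary index


def target_scope(msg: Indication) -> str:
    """Tell the storage side which data the new engine applies to."""
    if msg.secondary_index_id is not None:
        return "secondary_index"
    if msg.partition_id is not None:
        return "partition"
    if msg.table_id is not None:
        return "table"
    raise ValueError("indication names no table, partition, or secondary index")
```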
In some embodiments, before step 407 is performed, the index engine of the data table T1 is an index engine E12, where the index engine E12 is a read-friendly index engine and the index engine E11 is a write-friendly index engine, or the index engine E11 is a read-friendly index engine and the index engine E12 is a write-friendly index engine. In this case, in step 407, the control device 110 may instruct the storage device 120 to first store the data in the data table T1 according to the hybrid index engine and then store the data in the data table T1 according to the index engine E11. The storage device 120 may first change the index engine of the data table T1 from the index engine E12 to the hybrid index engine, and then change it from the hybrid index engine to the index engine E11. Migrating the index engine step by step from E12 to E11 in this way can reduce the overhead of changing the index engine.
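The two-stage migration can be expressed as a function that returns the sequence of engines to switch to. This is a sketch under the assumption that engine kinds are identified by the hypothetical labels "read_friendly", "hybrid", and "write_friendly".

```python
def migration_path(current: str, target: str) -> list:
    """Return the engines to switch to, in order, to get from current to target."""
    if current == target:
        return []  # nothing to change
    if {current, target} == {"read_friendly", "write_friendly"}:
        # Opposite extremes: pass through the hybrid engine first.
        return ["hybrid", target]
    return [target]  # adjacent kinds: migrate directly
```

For example, migration_path("read_friendly", "write_friendly") yields ["hybrid", "write_friendly"], matching the E12 to hybrid to E11 order described above.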
In the data storage solution provided in the embodiments of this application, the user can configure the index engine of the user's data table. When the service served by the data table changes, the user can adjust the index engine of the data table promptly or at any time so that the index engine matches the load characteristic produced by the new service, which can improve the access performance of the data table.
With reference to the storage system 100 shown in FIG. 2, the embodiments of this application further provide a data storage solution, which is described next by way of example.
As shown in FIG. 6, the control device 110 may perform step 601 to monitor the operation amplification of the data table T1. For example, step 601 may be performed periodically, and the average of the operation amplification monitored during one execution cycle may be used as the operation amplification of that cycle. The execution cycle of step 601 may be preset. In one example, the execution cycle of step 601 is 10 minutes. In another example, it is 20 minutes.
The operation amplification may include the read amplification of reading data from the data table T1 or the write amplification of writing data to the data table T1, or it may include both.
In the following description, unless otherwise specified, read amplification refers to the read amplification of reading data from the data table T1, and write amplification refers to the write amplification of writing data to the data table T1.
Continuing with FIG. 6, the control device 110 may perform step 603 to determine whether the operation amplification is greater than a threshold y1.
In some embodiments, the operation amplification includes read amplification, and the threshold y1 includes a threshold A3. In step 603, it may be determined whether the read amplification is greater than the threshold A3. If the read amplification is greater than the threshold A3, it can be determined that the current index engine of the data table T1 does not match the current load characteristic of the data table T1 and that the index engine of the data table T1 needs to be adjusted toward a read-friendly index engine.
In an illustrative example of this embodiment, the operation amplification includes read amplification and write amplification, and the threshold y1 includes the threshold A3 and a threshold A4. In step 603, it may be determined whether the read amplification is greater than the threshold A3 and whether the write amplification is less than the threshold A4. If the read amplification is greater than the threshold A3 and the write amplification is less than the threshold A4, it can be determined that the current index engine of the data table T1 does not match the current load characteristic of the data table T1 and that the index engine of the data table T1 needs to be adjusted toward a read-friendly index engine.
For the thresholds A3 and A4, refer to the description above; details are not repeated here.
In some embodiments, the operation amplification includes write amplification, and the threshold y1 includes a threshold A5. In step 603, it may be determined whether the write amplification is greater than the threshold A5. If the write amplification is greater than the threshold A5, it can be determined that the current index engine of the data table T1 does not match the current load characteristic of the data table T1 and that the index engine of the data table T1 needs to be adjusted toward a write-friendly index engine.
In an illustrative example of this embodiment, the operation amplification includes write amplification and read amplification, and the threshold y1 includes the threshold A5 and a threshold A6. In step 603, it may be determined whether the write amplification is greater than the threshold A5 and whether the read amplification is less than the threshold A6. If the write amplification is greater than the threshold A5 and the read amplification is less than the threshold A6, it can be determined that the current index engine of the data table T1 does not match the current load characteristic of the data table T1 and that the index engine of the data table T1 needs to be adjusted toward a write-friendly index engine.
For the thresholds A5 and A6, refer to the description above; details are not repeated here.
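The threshold checks of step 603 can be combined into one decision function. The sketch below is illustrative only: the function name and return labels are hypothetical, the optional parameters a4 and a6 model the variants that also test the opposite amplification, and checking the read side before the write side is an assumption, since the text does not say what happens when both conditions hold.

```python
from typing import Optional


def adjustment_direction(read_amp: float, write_amp: float,
                         a3: float, a5: float,
                         a4: Optional[float] = None,
                         a6: Optional[float] = None) -> Optional[str]:
    """Return "read", "write", or None (no adjustment needed)."""
    # Read side: read amplification above A3, and, in the stricter variant,
    # write amplification below A4.
    if read_amp > a3 and (a4 is None or write_amp < a4):
        return "read"
    # Write side: write amplification above A5, and, in the stricter variant,
    # read amplification below A6.
    if write_amp > a5 and (a6 is None or read_amp < a6):
        return "write"
    return None
```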
Continuing with FIG. 6, the control device 110 may perform step 605 to adjust the index engine of the data table T1 to an index engine E21, in the direction that reduces the operation amplification.
Specifically, when step 603 determines that the index engine of the data table T1 needs to be adjusted toward a read-friendly index engine, the index engine E21 is an index engine that is better suited to read operations than the current index engine of the data table T1. That is, the index engine E21 matches a read-intensive load more closely than the current index engine of the data table T1 does. In other words, the read amplification of the data table T1 when its data is stored according to the index engine E21 is less than the current read amplification of the data table T1.
In some embodiments, as described above, the number of storage layers of the log-structured merge tree in the index engine can be adjusted to make the index engine lean toward read-friendly or write-friendly. When the number of storage layers of the log-structured merge tree in the index engine decreases, the index engine becomes more read-friendly. Accordingly, when step 603 determines that the index engine of the data table T1 needs to be adjusted toward a read-friendly index engine, an index engine whose structure has N fewer storage layers than the current index engine of the data table T1 may be used as the index engine E21. Here a storage layer is a storage layer of the log-structured merge tree, and N is an integer greater than or equal to 1. The value of N may be preset. In one example, N is 1, 2, or 3.
When step 603 determines that the index engine of the data table T1 needs to be adjusted toward a write-friendly index engine, the index engine E21 is an index engine that is better suited to write operations than the current index engine of the data table T1. That is, the index engine E21 matches a write-intensive load more closely than the current index engine of the data table T1 does. In other words, the write amplification of the data table T1 when its data is stored according to the index engine E21 is less than the current write amplification of the data table T1.
In some embodiments, as described above, the number of storage layers of the log-structured merge tree in the index engine can be adjusted to make the index engine lean toward read-friendly or write-friendly. When the number of storage layers of the log-structured merge tree in the index engine increases, the index engine becomes more write-friendly. Accordingly, when step 603 determines that the index engine of the data table T1 needs to be adjusted toward a write-friendly index engine, an index engine whose structure has M more storage layers than the current index engine of the data table T1 may be used as the index engine E21. Here a storage layer is a storage layer of the log-structured merge tree, and M is an integer greater than or equal to 1. The value of M may be preset. In one example, M is 1, 2, or 3.
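Step 605 can be sketched as a computation of the new number of log-structured merge tree storage layers. The function name, the lower bound of zero layers (a pure B+ tree, as in the read-friendly example of FIG. 5), and the upper bound max_layers are assumptions of this sketch; the text only states that N layers are removed or M layers are added.

```python
def adjust_layers(current_layers: int, direction: str,
                  n: int = 1, m: int = 1, max_layers: int = 7) -> int:
    """Return the number of LSM storage layers for index engine E21."""
    if n < 1 or m < 1:
        raise ValueError("N and M must be integers greater than or equal to 1")
    if direction == "read":
        # Fewer layers: more read-friendly; never below a pure B+ tree.
        return max(0, current_layers - n)
    if direction == "write":
        # More layers: more write-friendly; capped at a hypothetical maximum.
        return min(max_layers, current_layers + m)
    raise ValueError(f"unknown direction: {direction!r}")
```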
The control device 110 may further perform step 607 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E21.
Specifically, as shown in FIG. 6, the control device 110 may perform step 6071 to send indication information to the storage device 120, where the indication information may include the identifier of the index engine E21. The storage device 120 may perform step 6072 to store, in response to the indication information, the data in the data table T1 according to the index engine E21.
When the operation amplification monitored in step 601 is that of the primary index, the control device 110 instructs the storage device 120 to store, according to the index engine E21, either the data in the entire data table T1 or the data in the data partition instance corresponding to that primary index. Specifically, in addition to the identifier of the index engine E21, the indication information may include the identifier of the data table T1 or the identifier of the data partition instance. Based on that identifier, the storage device 120 may store the data in the entire data table T1, or the data in the data partition instance corresponding to the primary index, according to the index engine E21.
When the operation amplification monitored in step 601 is that of a local secondary index of the data table T1, the control device 110 instructs the storage device 120 to store the data under that local secondary index according to the index engine E21. Specifically, in addition to the identifier of the index engine E21, the indication information may include the identifier of the local secondary index. Based on the identifier of the local secondary index, the storage device 120 stores the data under that local secondary index according to the index engine E21.
Continuing with FIG. 6, after step 607 the control device 110 may perform step 601 and step 603 again. When the operation amplification is not greater than the threshold y1, the adjustment of the index engine of the data table T1 may stop. When the operation amplification is greater than the threshold y1, step 605 and step 607 may be performed again. For details, refer to the description above; they are not repeated here.
In this way, by iterating step 601 to step 607, the index engine of the data table T1 can be adjusted dynamically so that it matches the load characteristic of the data table T1 as closely as possible, which reduces the operation amplification.
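The iteration of steps 601 to 607 can be sketched as a loop. For brevity the sketch covers only the read direction. The measure callback, which returns the read amplification observed for a given number of storage layers, is a hypothetical stand-in for the monitoring of step 601, and the max_rounds bound is an assumption added so that the loop always terminates.

```python
from typing import Callable


def tune_for_reads(layers: int, measure: Callable[[int], float],
                   threshold: float, n: int = 1, max_rounds: int = 10) -> int:
    """Remove LSM storage layers until read amplification is acceptable."""
    for _ in range(max_rounds):
        amplification = measure(layers)   # step 601: monitor one cycle
        if amplification <= threshold:    # step 603: not greater than y1
            break                         # stop adjusting
        if layers == 0:
            break                         # already a pure B+ tree
        layers = max(0, layers - n)       # step 605: N fewer layers
        # Step 607 would instruct the storage device to apply the new engine.
    return layers
```

With a toy model in which read amplification equals one plus the number of layers, tune_for_reads(5, lambda k: 1 + k, 3) stops at two layers.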
The data storage solution provided in the embodiments of this application can sense dynamic changes in the operation amplification of the data table and adjust the index engine of the data table accordingly, so that the index engine matches the load characteristic of the data table, which can improve the access performance of the data table.
With reference to the storage system 100 shown in FIG. 2, the embodiments of this application further provide a data storage solution, which is described next by way of example.
As shown in FIG. 7, the control device 110 may perform step 701 to monitor the read and write operations on the data table T1. For example, step 701 may be performed periodically. The read and write operations include read operations and write operations. In step 701, the total number of read operations and the total number of write operations within one execution cycle may be monitored to obtain the monitoring result of that cycle. That is, the monitoring result includes the total number of read operations and the total number of write operations on the data table T1 within the cycle. The execution cycle may be preset. In one example, the execution cycle of step 701 is one hour. In another example, it is two hours.
The control device 110 may perform step 703 to obtain a load characteristic E3 from the monitoring result. When the ratio of the total number of read operations to the total number of write operations is greater than a threshold A1, the read-intensive load is taken as the load characteristic E3. When the ratio of the total number of write operations to the total number of read operations is greater than a threshold A2, the write-intensive load is taken as the load characteristic E3. For the thresholds A1 and A2, refer to the description above; details are not repeated here.
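Step 703 can be sketched as a ratio test on the two counters. The function name and return labels are hypothetical. Two details are assumptions of the sketch rather than statements of the text: the ratios are compared by multiplication so that a cycle with zero reads or zero writes does not divide by zero, and a cycle that meets neither condition is reported as a mixed load.

```python
def classify_load(reads: int, writes: int, a1: float, a2: float) -> str:
    """Derive load characteristic E3 from one cycle's operation counts."""
    # reads / writes > A1, rewritten to avoid division by zero.
    if reads > a1 * writes:
        return "read"      # read-intensive load
    # writes / reads > A2, rewritten to avoid division by zero.
    if writes > a2 * reads:
        return "write"     # write-intensive load
    return "default"       # assumption: otherwise treat the load as mixed
```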
Next, the control device 110 may perform step 705 to migrate toward the index engine that matches the load characteristic E3, obtaining an index engine E31.
When the load characteristic E3 is a read-intensive load, the structure of the index engine E31 has N fewer storage layers than the structure of the current index engine of the data table T1. Here a storage layer is a storage layer of the log-structured merge tree, and N is an integer greater than or equal to 1. The value of N may be preset. In one example, N is 1, 2, or 3.
When the load characteristic E3 is a write-intensive load, the structure of the index engine E31 has M more storage layers than the structure of the current index engine of the data table T1. Here a storage layer is a storage layer of the log-structured merge tree, and M is an integer greater than or equal to 1. The value of M may be preset. In one example, M is 1, 2, or 3.
Then the control device 110 may perform step 707 to instruct the storage device 120 to store the data in the data table T1 according to the index engine E31. Step 707 may include step 7071, sending indication information to the storage device, and step 7072, storing the data in the data table T1 according to the index engine E31. For details, refer to the description above of step 407, step 4071, and step 4072 in FIG. 4; they are not repeated here.
The data storage solution provided in the embodiments of this application can sense dynamic changes in the load characteristic of the data table and adjust the index engine of the data table accordingly, so that the index engine matches the load characteristic of the data table, which can improve the access performance of the data table.
Based on the data storage solution described above, an embodiment of this application provides a data storage method. The method builds on that solution; for how each step of the method is carried out, refer to the corresponding step of the data storage solution.
The method is applied to a control device in a storage system (for example, control device 110 in storage system 100). The storage system further includes a storage device (for example, storage device 120 in storage system 100), and the storage device stores a data table of a user. As shown in FIG. 8, the method includes the following steps.
Step 801: provide a configuration interface, where the configuration interface allows the user to configure the load characteristic of the data table as a read-intensive load or a write-intensive load. For details, refer to the description of step 401 in FIG. 4 above.
Step 803a: when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, instruct the storage device to store the data in the data table according to a first index engine, where the first index engine matches the read-intensive load. For details, refer to the description of steps 403 to 407 in FIG. 4 above.
Step 803b: when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, instruct the storage device to store the data in the data table according to a second index engine, where the second index engine matches the write-intensive load. For details, refer to the description of steps 403 to 407 in FIG. 4 above.
In some embodiments, the first index engine includes at least a B+ tree structure, and the second index engine includes at least a log-structured merge tree (LSM tree) structure.
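As an illustration, the selection performed in steps 801 to 803b can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiments: the class names `ControlDevice` and `StorageDeviceStub`, the method names, and the enum values are hypothetical, and the storage device is reduced to a stub that only records which engine it was instructed to use.

```python
from enum import Enum


class LoadCharacteristic(Enum):
    READ_INTENSIVE = "read_intensive"
    WRITE_INTENSIVE = "write_intensive"


class IndexEngine(Enum):
    FIRST = "b_plus_tree"   # first index engine: includes at least a B+ tree
    SECOND = "lsm_tree"     # second index engine: includes at least an LSM tree


# Steps 803a / 803b: each load characteristic maps to its matching engine.
ENGINE_FOR_LOAD = {
    LoadCharacteristic.READ_INTENSIVE: IndexEngine.FIRST,
    LoadCharacteristic.WRITE_INTENSIVE: IndexEngine.SECOND,
}


class StorageDeviceStub:
    """Hypothetical stand-in for storage device 120; it only records instructions."""

    def __init__(self):
        self.engine_by_table = {}

    def store_with_engine(self, table, engine):
        self.engine_by_table[table] = engine


class ControlDevice:
    """Hypothetical stand-in for control device 110."""

    def __init__(self, storage):
        self.storage = storage

    def configure_load(self, table, load):
        """Configuration interface (step 801), followed by step 803a or 803b."""
        engine = ENGINE_FOR_LOAD[load]
        self.storage.store_with_engine(table, engine)
        return engine


storage = StorageDeviceStub()
control = ControlDevice(storage)
control.configure_load("orders", LoadCharacteristic.WRITE_INTENSIVE)
```

Keeping the mapping in a single table makes the "matching" relationship between load characteristic and index engine explicit and easy to extend.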
In some embodiments, before the storage device is instructed to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine. In this case, instructing the storage device to store the data in the data table according to the first index engine includes: first instructing the storage device to migrate the index engine of the data table from the second index engine to a third index engine, where the structure of the third index engine is intermediate between the structure of the first index engine and the structure of the second index engine; and then instructing the storage device to migrate the index engine of the data table from the third index engine to the first index engine. For details, refer to the description of step 407 in FIG. 4 above.
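The two-stage migration described above can be sketched as a small planning function. This is only an illustrative sketch: the names `migration_steps` and `migrate` are hypothetical, and treating every other transition as a single direct step is an assumption of the sketch rather than something stated in this passage.

```python
FIRST, SECOND, THIRD = "first", "second", "third"


def migration_steps(current, target):
    """Return the (source, destination) hops needed to reach the target engine."""
    if current == target:
        return []
    if current == SECOND and target == FIRST:
        # Pass through the third index engine, whose structure lies between the two.
        return [(SECOND, THIRD), (THIRD, FIRST)]
    # Assumption of this sketch: any other transition is performed directly.
    return [(current, target)]


def migrate(engine_by_table, table, target):
    """Apply the hops one by one and return the engines the table passed through."""
    visited = [engine_by_table[table]]
    for _source, destination in migration_steps(engine_by_table[table], target):
        engine_by_table[table] = destination
        visited.append(destination)
    return visited


engines = {"orders": SECOND}
path = migrate(engines, "orders", FIRST)
```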
In some embodiments, the data table has a local secondary index, and the configuration interface further allows the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load. When the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a read-intensive load, the storage device is instructed to store the data under the local secondary index according to the first index engine. When the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a write-intensive load, the storage device is instructed to store the data under the local secondary index according to the second index engine. For details, refer to the description of step 407 in FIG. 4 above.
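As a rough illustration of per-index configuration, the following sketch keeps one engine for the table and a separate engine for each local secondary index. The class `TableIndexConfig`, its fields, and the string labels are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the two load characteristics and their matching engines.
ENGINE_FOR_LOAD = {
    "read_intensive": "first_index_engine",
    "write_intensive": "second_index_engine",
}


@dataclass
class TableIndexConfig:
    table_engine: str
    lsi_engines: dict = field(default_factory=dict)

    def configure_lsi(self, lsi_name, load):
        """Record the engine for one local secondary index; the table engine is untouched."""
        if load not in ENGINE_FOR_LOAD:
            raise ValueError(f"unknown load characteristic: {load}")
        self.lsi_engines[lsi_name] = ENGINE_FOR_LOAD[load]
        return self.lsi_engines[lsi_name]


config = TableIndexConfig(table_engine="second_index_engine")
config.configure_lsi("by_customer", "read_intensive")
```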
With the data storage method provided in this embodiment, a user can configure the index engine of the user's data table. When the service served by the data table changes, the user can adjust the index engine promptly, or at any time, so that the index engine matches the load characteristic produced by the changed service, which can improve the access performance of the data table.
An embodiment of this application further provides a data storage method. The method is applied to a control device in a storage system (for example, control device 110 in storage system 100). The storage system further includes a storage device (for example, storage device 120 in storage system 100), and the storage device stores a data table. As shown in FIG. 9, the method includes the following steps.
Step 901: while the index engine of the data table is a fourth index engine, the control device monitors the operation amplification of the data table, where the operation amplification includes read amplification incurred when reading data from the data table or write amplification incurred when writing data to the data table. For details, refer to the description of step 601 in FIG. 6 above.
Step 903a: when the operation amplification includes the read amplification and the read amplification is greater than a first threshold, instruct the storage device to store the data in the data table according to a fifth index engine, where the fifth index engine matches a read-intensive load better than the fourth index engine does. For details, refer to the description of steps 603 to 607 in FIG. 6 above.
Step 903b: when the operation amplification includes the write amplification and the write amplification is greater than a second threshold, instruct the storage device to store the data in the data table according to a sixth index engine, where the sixth index engine matches a write-intensive load better than the fourth index engine does. For details, refer to the description of steps 603 to 607 in FIG. 6 above.
In some embodiments, the operation amplification includes both the read amplification and the write amplification. In this case, instructing the storage device to store the data in the data table according to the fifth index engine when the read amplification is greater than the first threshold includes: when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instructing the storage device to store the data in the data table according to the fifth index engine. For details, refer to the description of steps 603 to 607 in FIG. 6 above.
In some embodiments, the operation amplification includes both the read amplification and the write amplification. In this case, instructing the storage device to store the data in the data table according to the sixth index engine when the write amplification is greater than the second threshold includes: when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instructing the storage device to store the data in the data table according to the sixth index engine. For details, refer to the description of steps 603 to 607 in FIG. 6 above.
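The threshold checks of steps 903a and 903b, together with the optional third and fourth thresholds, can be sketched as one decision function. This is a sketch under assumptions: the names `Thresholds` and `choose_engine` are hypothetical, and returning `None` (keep the fourth index engine) when both conditions hold at once is a choice made for the sketch, since that case is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    first: float                    # read amplification must exceed this (step 903a)
    second: float                   # write amplification must exceed this (step 903b)
    third: Optional[float] = None   # optional: write amplification must stay below this
    fourth: Optional[float] = None  # optional: read amplification must stay below this


def choose_engine(read_amp: float, write_amp: float, t: Thresholds) -> Optional[str]:
    """Return 'fifth', 'sixth', or None to keep the current (fourth) index engine."""
    wants_fifth = read_amp > t.first and (t.third is None or write_amp < t.third)
    wants_sixth = write_amp > t.second and (t.fourth is None or read_amp < t.fourth)
    if wants_fifth and not wants_sixth:
        return "fifth"
    if wants_sixth and not wants_fifth:
        return "sixth"
    # Assumption of this sketch: ambiguous or unremarkable measurements change nothing.
    return None
```

With the third and fourth thresholds set, a table whose read and write amplification are both high is left unchanged, which avoids switching back and forth between engines.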
In some embodiments, the fifth index engine includes at least a B+ tree structure, and the sixth index engine includes at least a log-structured merge tree (LSM tree) structure.
In some embodiments, the data table has a local secondary index (LSI), and the operation amplification is the amplification produced by operating on the data under the local secondary index. Instructing the storage device to store the data in the data table according to the fifth index engine includes: instructing the storage device to store the data under the local secondary index according to the fifth index engine. Alternatively, instructing the storage device to store the data in the data table according to the sixth index engine includes: instructing the storage device to store the data under the local secondary index according to the sixth index engine. For details, refer to the description of step 607 in FIG. 6 above.
The data storage method provided in this embodiment can detect dynamic changes in the operation amplification of a data table and adjust the index engine of the data table accordingly, so that the index engine matches the load characteristic of the data table, which can improve the access performance of the data table.
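One way such monitoring could be realized is a sliding-window ratio. The sketch below assumes a commonly used definition, namely amplification as the bytes physically read from or written to storage divided by the bytes logically requested; this passage does not give the formula, so both the definition and the class name `AmplificationMonitor` are assumptions of the sketch.

```python
from collections import deque


class AmplificationMonitor:
    """Tracks amplification over the most recent `window` operations."""

    def __init__(self, window=100):
        # Each sample is (logical_bytes, physical_bytes); old samples fall out automatically.
        self._samples = deque(maxlen=window)

    def record(self, logical_bytes, physical_bytes):
        self._samples.append((logical_bytes, physical_bytes))

    def amplification(self):
        """Physical bytes divided by logical bytes over the window; 0.0 if no data yet."""
        logical = sum(sample[0] for sample in self._samples)
        physical = sum(sample[1] for sample in self._samples)
        return physical / logical if logical else 0.0


write_monitor = AmplificationMonitor(window=2)
write_monitor.record(logical_bytes=100, physical_bytes=300)
write_monitor.record(logical_bytes=100, physical_bytes=500)
```

One monitor instance would be kept for reads and another for writes, and their outputs compared against the thresholds of steps 903a and 903b.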
An embodiment of this application provides a data storage apparatus 1000. The apparatus 1000 may be deployed in a control device of a storage system, where the storage system further includes a storage device and the storage device stores a data table of a user. As shown in FIG. 10, the apparatus 1000 includes:
a providing module 1010, configured to provide a configuration interface, where the configuration interface allows the user to configure the load characteristic of the data table as a read-intensive load or a write-intensive load; and
an indicating module 1020, configured to: when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, instruct the storage device to store the data in the data table according to a first index engine, where the first index engine matches the read-intensive load.
The indicating module 1020 is further configured to: when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, instruct the storage device to store the data in the data table according to a second index engine, where the second index engine matches the write-intensive load.
The providing module 1010 and the indicating module 1020 may each be implemented in software or in hardware. The implementation of the providing module 1010 is described below as an example; the indicating module 1020 may be implemented in the same way.
As an example of a software functional unit, the providing module 1010 may include code running on a computing instance. The computing instance may include at least one of a physical host (computing device), a virtual machine, or a container, and there may be one or more such computing instances. For example, the providing module 1010 may include code running on multiple hosts, virtual machines, or containers. These hosts, virtual machines, or containers may be distributed in the same region or in different regions. They may likewise be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple geographically close data centers. A region usually includes multiple AZs.
Similarly, the multiple hosts, virtual machines, or containers that run the code may be distributed in the same virtual private cloud (VPC) or in multiple VPCs. A VPC is usually set up within one region. Communication between two VPCs in the same region, as well as cross-region communication between VPCs in different regions, requires a communication gateway in each VPC; the VPCs are interconnected through these gateways.
As an example of a hardware functional unit, the providing module 1010 may include at least one computing device, such as a server. Alternatively, the providing module 1010 may be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The multiple computing devices included in the providing module 1010 may be distributed in the same region or in different regions, and in the same AZ or in different AZs. Similarly, they may be distributed in the same VPC or in multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
It should be noted that, in other embodiments, the providing module 1010 may be used to perform any step of the method shown in FIG. 8, and so may the indicating module 1020. The steps that each module is responsible for can be assigned as needed; the providing module 1010 and the indicating module 1020 each implement different steps of the method shown in FIG. 8, and together they implement all functions of the data storage apparatus 1000.
An embodiment of this application further provides a data storage apparatus 1100. The apparatus 1100 may be deployed in a control device of a storage system, where the storage system further includes a storage device and the storage device stores a data table. As shown in FIG. 11, the apparatus 1100 includes:
a monitoring module 1110, configured for the control device to monitor, while the index engine of the data table is a fourth index engine, the operation amplification of the data table, where the operation amplification includes read amplification incurred when reading data from the data table or write amplification incurred when writing data to the data table; and
an indicating module 1120, configured to: when the operation amplification includes the read amplification and the read amplification is greater than a first threshold, instruct the storage device to store the data in the data table according to a fifth index engine, where the fifth index engine matches a read-intensive load better than the fourth index engine does.
The indicating module 1120 is further configured to: when the operation amplification includes the write amplification and the write amplification is greater than a second threshold, instruct the storage device to store the data in the data table according to a sixth index engine, where the sixth index engine matches a write-intensive load better than the fourth index engine does.
The monitoring module 1110 and the indicating module 1120 may each be implemented in software or in hardware. For their implementation, refer to the implementation of the providing module 1010 described above; details are not repeated here.
It should be noted that, in other embodiments, the monitoring module 1110 may be used to perform any step of the method shown in FIG. 9, and so may the indicating module 1120. The steps that each module is responsible for can be assigned as needed; the monitoring module 1110 and the indicating module 1120 each implement different steps of the method shown in FIG. 9, and together they implement all functions of the data storage apparatus 1100.
This application further provides a computing device 1200. As shown in FIG. 12, the computing device 1200 includes a bus 1202, a processor 1204, a memory 1206, and a communication interface 1208. The processor 1204, the memory 1206, and the communication interface 1208 communicate with one another through the bus 1202. The computing device 1200 may be a server or a terminal device. It should be understood that this application does not limit the number of processors or memories in the computing device 1200.
The bus 1202 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses can be classified into address buses, data buses, control buses, and so on. For ease of illustration, the bus is drawn as a single line in FIG. 12, but this does not mean that there is only one bus or only one type of bus. The bus 1202 may include a path for transferring information between the components of the computing device 1200 (for example, the memory 1206, the processor 1204, and the communication interface 1208).
The processor 1204 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), or a similar processor.
The memory 1206 may include volatile memory, such as random access memory (RAM). The memory 1206 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
The memory 1206 stores executable program code, and the processor 1204 executes the executable program code to implement the functions of the providing module 1010 and the indicating module 1020 described above, thereby implementing the method shown in FIG. 8. In other words, the memory 1206 stores instructions for performing the method shown in FIG. 8.
The communication interface 1208 uses a transceiver module, such as but not limited to a network interface card or a transceiver, to implement communication between the computing device 1200 and other devices or a communication network.
An embodiment of this application further provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server, for example a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
As shown in FIG. 13, the computing device cluster includes at least one computing device 1200. The memories 1206 of one or more computing devices 1200 in the computing device cluster may store the same instructions for performing the method shown in FIG. 8.
In some possible implementations, the memories 1206 of one or more computing devices 1200 in the computing device cluster may instead each store part of the instructions for performing the method shown in FIG. 8. In other words, a combination of one or more computing devices 1200 may jointly execute the instructions for performing the method shown in FIG. 8.
It should be noted that the memories 1206 of different computing devices 1200 in the computing device cluster may store different instructions, each used to perform part of the functions of the apparatus 1000. That is, the instructions stored in the memories 1206 of different computing devices 1200 may implement the functions of one or more of the providing module 1010 and the indicating module 1020.
In some possible implementations, one or more computing devices in the computing device cluster may be connected through a network. The network may be a wide area network, a local area network, or the like. FIG. 14 shows one possible implementation. As shown in FIG. 14, two computing devices 1200A and 1200B are connected through a network; specifically, each computing device connects to the network through its communication interface. In this type of implementation, the memory 1206 of the computing device 1200A stores instructions for performing the functions of the providing module 1010, and the memory 1206 of the computing device 1200B stores instructions for performing the functions of the indicating module 1020.
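The split shown in FIG. 14 can be illustrated with a toy sketch in which one object plays the role of computing device 1200A (hosting the providing module) and another plays computing device 1200B (hosting the indicating module). Everything concrete here is an assumption made for illustration: an in-process `queue.Queue` stands in for the network, and the JSON message fields `table` and `load` are hypothetical.

```python
import json
import queue


class DeviceA:
    """Hosts the providing module: accepts user configuration and forwards it."""

    def __init__(self, network):
        self.network = network

    def on_user_config(self, table, load):
        # Serialize the configuration so it could cross a real network link.
        self.network.put(json.dumps({"table": table, "load": load}))


class DeviceB:
    """Hosts the indicating module: turns configurations into storage instructions."""

    ENGINE_FOR_LOAD = {
        "read_intensive": "first_index_engine",
        "write_intensive": "second_index_engine",
    }

    def __init__(self, network):
        self.network = network
        self.instructions = []  # (table, engine) pairs that would go to the storage device

    def poll(self):
        """Handle every pending message and return how many were processed."""
        handled = 0
        while True:
            try:
                message = json.loads(self.network.get_nowait())
            except queue.Empty:
                return handled
            engine = self.ENGINE_FOR_LOAD[message["load"]]
            self.instructions.append((message["table"], engine))
            handled += 1


network = queue.Queue()  # stand-in for the WAN/LAN between the two devices
device_a, device_b = DeviceA(network), DeviceB(network)
device_a.on_user_config("orders", "read_intensive")
handled = device_b.poll()
```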
It should be understood that the functions of the computing device 1200A shown in FIG. 14 may also be performed by multiple computing devices 1200. Likewise, the functions of the computing device 1200B may also be performed by multiple computing devices 1200.
An embodiment of this application further provides another computing device cluster. For the connection relationships between the computing devices in this cluster, refer similarly to the connection manner of the computing device cluster described with reference to FIG. 13 and FIG. 14. The difference is that the memories 1206 of one or more computing devices 1200 in this cluster may store the same instructions for performing the method shown in FIG. 8.
In some possible implementations, the memories 1206 of one or more computing devices 1200 in the computing device cluster may instead each store part of the instructions for performing the method shown in FIG. 8. In other words, a combination of one or more computing devices 1200 may jointly execute the instructions for performing the method shown in FIG. 8.
This application further provides a computing device 1500. As shown in FIG. 15, the computing device 1500 includes a bus 1502, a processor 1504, a memory 1506, and a communication interface 1508. The processor 1504, the memory 1506, and the communication interface 1508 communicate with one another through the bus 1502. The computing device 1500 may be a server or a terminal device. It should be understood that this application does not limit the number of processors or memories in the computing device 1500.
For the implementations of the bus 1502, the processor 1504, the memory 1506, and the communication interface 1508, refer to the implementations of the bus 1202, the processor 1204, the memory 1206, and the communication interface 1208, respectively.
The memory 1506 stores executable program code, and the processor 1504 executes the executable program code to implement the functions of the monitoring module 1110 and the indicating module 1120 described above, thereby implementing the method shown in FIG. 9. In other words, the memory 1506 stores instructions for performing the method shown in FIG. 9.
The communication interface 1508 uses a transceiver module, such as but not limited to a network interface card or a transceiver, to implement communication between the computing device 1500 and other devices or a communication network.
An embodiment of this application further provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server, for example a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
As shown in FIG. 16, the computing device cluster includes at least one computing device 1500. The memories 1506 of one or more computing devices 1500 in the computing device cluster may store the same instructions for performing the method shown in FIG. 9.
In some possible implementations, the memories 1506 of one or more computing devices 1500 in the computing device cluster may instead each store part of the instructions for performing the method shown in FIG. 9. In other words, a combination of one or more computing devices 1500 may jointly execute the instructions for performing the method shown in FIG. 9.
It should be noted that the memories 1506 of different computing devices 1500 in the computing device cluster may store different instructions, each used to perform part of the functions of the apparatus 1100. That is, the instructions stored in the memories 1506 of different computing devices 1500 may implement the functions of one or more of the monitoring module 1110 and the indicating module 1120.
In some possible implementations, one or more computing devices in the computing device cluster may be connected through a network. The network may be a wide area network, a local area network, or the like. FIG. 17 shows one possible implementation. As shown in FIG. 17, two computing devices 1500A and 1500B are connected through a network; specifically, each computing device connects to the network through its communication interface. In this type of implementation, the memory 1506 of the computing device 1500A stores instructions for performing the functions of the monitoring module 1110, and the memory 1506 of the computing device 1500B stores instructions for performing the functions of the indicating module 1120.
It should be understood that the functions of the computing device 1500A shown in FIG. 17 may also be performed by multiple computing devices 1500. Likewise, the functions of the computing device 1500B may also be performed by multiple computing devices 1500.
An embodiment of this application further provides another computing device cluster. For the connection relationships between the computing devices in this cluster, refer similarly to the connection manner of the computing device cluster described with reference to FIG. 16 and FIG. 17. The difference is that the memories 1506 of one or more computing devices 1500 in this cluster may store the same instructions for performing the method shown in FIG. 9.
In some possible implementations, the memories 1506 of one or more computing devices 1500 in the computing device cluster may instead each store part of the instructions for performing the method shown in FIG. 9. In other words, a combination of one or more computing devices 1500 may jointly execute the instructions for performing the method shown in FIG. 9.
本申请实施例还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行图8所示方法。The embodiment of the present application also provides a computer program product including instructions. The computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium. When the computer program product is run on at least one computing device, the at least one computing device executes the method shown in FIG8 .
本申请实施例还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行图9所示方法。The embodiment of the present application also provides a computer program product including instructions. The computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium. When the computer program product is run on at least one computing device, the at least one computing device executes the method shown in FIG. 9 .
The embodiments of the present application further provide a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can use for storage, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive). The computer-readable storage medium includes instructions that instruct a computing device to execute the method shown in Figure 8.
The embodiments of the present application further provide a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can use for storage, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive). The computer-readable storage medium includes instructions that instruct a computing device to execute the method shown in Figure 9.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present invention.

Claims (21)

  1. A data storage method, characterized in that the method is applied to a control device in a storage system, the storage system further comprises a storage device, and the storage device stores a data table of a user; the method comprises:
    providing a configuration interface, wherein the configuration interface is used by the user to configure the load characteristic of the data table as a read-intensive load or a write-intensive load;
    when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, instructing the storage device to store the data in the data table according to a first index engine, wherein the first index engine matches the read-intensive load; and
    when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, instructing the storage device to store the data in the data table according to a second index engine, wherein the second index engine matches the write-intensive load.
  2. The method according to claim 1, characterized in that the first index engine comprises at least a B+ tree structure, and the second index engine comprises at least a log-structured merge tree (LSM tree) structure.
  3. The method according to claim 1 or 2, characterized in that, before the instructing the storage device to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine;
    the instructing the storage device to store the data in the data table according to the first index engine comprises:
    instructing the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, wherein the structure of the third index engine is intermediate between the structure of the first index engine and the structure of the second index engine; and
    then instructing the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
  4. The method according to any one of claims 1 to 3, characterized in that the data table has a local secondary index, and the configuration interface is further used by the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load;
    when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a read-intensive load, instructing the storage device to store the data under the local secondary index according to the first index engine; and
    when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a write-intensive load, instructing the storage device to store the data under the local secondary index according to the second index engine.
  5. A data storage method, characterized in that the method is applied to a control device in a storage system, the storage system further comprises a storage device, and the storage device stores a data table; the method comprises:
    monitoring, by the control device, the operation amplification of the data table while the index engine of the data table is a fourth index engine, wherein the operation amplification comprises a read amplification of reading data from the data table or a write amplification of writing data to the data table;
    when the operation amplification comprises the read amplification and the read amplification is greater than a first threshold, instructing the storage device to store the data in the data table according to a fifth index engine, wherein the fifth index engine matches a read-intensive load to a greater degree than the fourth index engine does; and
    when the operation amplification comprises the write amplification and the write amplification is greater than a second threshold, instructing the storage device to store the data in the data table according to a sixth index engine, wherein the sixth index engine matches a write-intensive load to a greater degree than the fourth index engine does.
  6. The method according to claim 5, characterized in that the operation amplification comprises both the read amplification and the write amplification;
    the instructing, when the operation amplification comprises the read amplification and the read amplification is greater than the first threshold, the storage device to store the data in the data table according to the fifth index engine comprises:
    when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instructing the storage device to store the data in the data table according to the fifth index engine.
  7. The method according to claim 5 or 6, characterized in that the operation amplification comprises both the read amplification and the write amplification;
    the instructing, when the operation amplification comprises the write amplification and the write amplification is greater than the second threshold, the storage device to store the data in the data table according to the sixth index engine comprises:
    when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instructing the storage device to store the data in the data table according to the sixth index engine.
  8. The method according to any one of claims 5 to 7, characterized in that the fifth index engine comprises at least a B+ tree structure, and the sixth index engine comprises at least a log-structured merge tree (LSM tree) structure.
  9. The method according to any one of claims 5 to 8, characterized in that the data table has a local secondary index (LSI), and the operation amplification is the amplification produced by operating on the data under the local secondary index;
    the instructing the storage device to store the data in the data table according to the fifth index engine comprises: instructing the storage device to store the data under the local secondary index according to the fifth index engine; or
    the instructing the storage device to store the data in the data table according to the sixth index engine comprises: instructing the storage device to store the data under the local secondary index according to the sixth index engine.
  10. A data storage apparatus, characterized in that the data storage apparatus is deployed in a control device in a storage system, the storage system further comprises a storage device, and the storage device stores a data table of a user; the data storage apparatus comprises:
    a providing module, configured to provide a configuration interface, wherein the configuration interface is used by the user to configure the load characteristic of the data table as a read-intensive load or a write-intensive load; and
    an instruction module, configured to, when the configuration interface indicates that the user has configured the load characteristic of the data table as a read-intensive load, instruct the storage device to store the data in the data table according to a first index engine, wherein the first index engine matches the read-intensive load;
    wherein the instruction module is further configured to, when the configuration interface indicates that the user has configured the load characteristic of the data table as a write-intensive load, instruct the storage device to store the data in the data table according to a second index engine, wherein the second index engine matches the write-intensive load.
  11. The data storage apparatus according to claim 10, characterized in that the first index engine comprises at least a B+ tree structure, and the second index engine comprises at least a log-structured merge tree (LSM tree) structure.
  12. The data storage apparatus according to claim 10 or 11, characterized in that, before the instructing the storage device to store the data in the data table according to the first index engine, the index engine of the data table is the second index engine; the instruction module is further configured to:
    instruct the storage device to first migrate the index engine of the data table from the second index engine to a third index engine, wherein the structure of the third index engine is intermediate between the structure of the first index engine and the structure of the second index engine; and
    then instruct the storage device to migrate the index engine of the data table from the third index engine to the first index engine.
  13. The data storage apparatus according to any one of claims 10 to 12, characterized in that the data table has a local secondary index, and the configuration interface is further used by the user to configure the load characteristic of the local secondary index as a read-intensive load or a write-intensive load; the instruction module is further configured to:
    when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a read-intensive load, instruct the storage device to store the data under the local secondary index according to the first index engine; and
    when the configuration interface indicates that the user has configured the load characteristic of the local secondary index as a write-intensive load, instruct the storage device to store the data under the local secondary index according to the second index engine.
  14. A data storage apparatus, characterized in that the data storage apparatus is deployed in a control device in a storage system, the storage system further comprises a storage device, and the storage device stores a data table; the data storage apparatus comprises:
    a monitoring module, configured to monitor the operation amplification of the data table while the index engine of the data table is a fourth index engine, wherein the operation amplification comprises a read amplification of reading data from the data table or a write amplification of writing data to the data table; and
    an instruction module, configured to, when the operation amplification comprises the read amplification and the read amplification is greater than a first threshold, instruct the storage device to store the data in the data table according to a fifth index engine, wherein the fifth index engine matches a read-intensive load to a greater degree than the fourth index engine does;
    wherein the instruction module is further configured to, when the operation amplification comprises the write amplification and the write amplification is greater than a second threshold, instruct the storage device to store the data in the data table according to a sixth index engine, wherein the sixth index engine matches a write-intensive load to a greater degree than the fourth index engine does.
  15. The data storage apparatus according to claim 14, characterized in that the operation amplification comprises both the read amplification and the write amplification;
    the instruction module is configured to: when the read amplification is greater than the first threshold and the write amplification is less than a third threshold, instruct the storage device to store the data in the data table according to the fifth index engine.
  16. The data storage apparatus according to claim 14 or 15, characterized in that the operation amplification comprises both the read amplification and the write amplification;
    the instruction module is configured to: when the write amplification is greater than the second threshold and the read amplification is less than a fourth threshold, instruct the storage device to store the data in the data table according to the sixth index engine.
  17. The data storage apparatus according to any one of claims 14 to 16, characterized in that the fifth index engine comprises at least a B+ tree structure, and the sixth index engine comprises at least a log-structured merge tree (LSM tree) structure.
  18. The data storage apparatus according to any one of claims 14 to 17, characterized in that the data table has a local secondary index (LSI), and the operation amplification is the amplification produced by operating on the data under the local secondary index;
    the instruction module is configured to:
    instruct the storage device to store the data under the local secondary index according to the fifth index engine; or
    instruct the storage device to store the data under the local secondary index according to the sixth index engine.
  19. A computing device cluster, characterized by comprising at least one computing device, wherein each computing device comprises a processor and a memory;
    the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster executes the method according to any one of claims 1 to 4 or the method according to any one of claims 5 to 9.
  20. A computer program product containing instructions, characterized in that, when the instructions are run by a computing device cluster, the computing device cluster is caused to execute the method according to any one of claims 1 to 4 or the method according to any one of claims 5 to 9.
  21. A computer-readable storage medium, characterized by comprising computer program instructions, wherein, when the computer program instructions are executed by a computing device cluster, the computing device cluster executes the method according to any one of claims 1 to 4 or the method according to any one of claims 5 to 9.
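
For illustration only, and not as part of the claims, the selection logic recited in claims 1 to 3 and 5 to 7 can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Load`, `Engine`, `HYBRID`, `engine_for_configured_load`, `migration_path`, and `engine_for_observed_amplification`, as well as the threshold parameter names, are hypothetical and do not appear in the application. Mapping the read-matched engine to a B+ tree and the write-matched engine to an LSM tree follows claims 2 and 8, which only require that the engines "at least" include those structures.

```python
from enum import Enum
from typing import List, Optional


class Load(Enum):
    READ_INTENSIVE = "read"
    WRITE_INTENSIVE = "write"


class Engine(Enum):
    B_PLUS_TREE = "b+tree"  # assumed read-matched engine (claims 2 and 8)
    LSM_TREE = "lsm-tree"   # assumed write-matched engine (claims 2 and 8)
    HYBRID = "hybrid"       # hypothetical intermediate engine (claim 3)


def engine_for_configured_load(load: Load) -> Engine:
    """Claim 1: pick the engine that matches the user-configured load."""
    if load is Load.READ_INTENSIVE:
        return Engine.B_PLUS_TREE
    return Engine.LSM_TREE


def migration_path(current: Engine, target: Engine) -> List[Engine]:
    """Claim 3: an LSM tree table reaches the B+ tree via an intermediate engine.

    Other transitions are treated as direct here, which is an assumption;
    the claims only recite the staged path from the second engine to the first.
    """
    if current is target:
        return []
    if current is Engine.LSM_TREE and target is Engine.B_PLUS_TREE:
        return [Engine.HYBRID, Engine.B_PLUS_TREE]
    return [target]


def engine_for_observed_amplification(
    read_amp: float,
    write_amp: float,
    first_threshold: float,   # read amplification above this suggests a switch
    second_threshold: float,  # write amplification above this suggests a switch
    third_threshold: float,   # upper bound on write amplification (claim 6)
    fourth_threshold: float,  # upper bound on read amplification (claim 7)
) -> Optional[Engine]:
    """Claims 5 to 7: choose a better-matched engine, or None to keep the current one."""
    if read_amp > first_threshold and write_amp < third_threshold:
        return Engine.B_PLUS_TREE
    if write_amp > second_threshold and read_amp < fourth_threshold:
        return Engine.LSM_TREE
    return None
```

In this sketch, a table whose read and write amplification are both high satisfies neither combined condition, so the function returns `None` and the current engine is kept. That reflects one reading of claims 6 and 7, in which the extra third and fourth thresholds prevent a switch that would trade one kind of amplification for the other.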
PCT/CN2023/104709 2022-09-29 2023-06-30 Data storage method and apparatus WO2024066597A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211202134.2 2022-09-29
CN202211202134.2A CN117827818A (en) 2022-09-29 2022-09-29 Data storage method and device

Publications (1)

Publication Number Publication Date
WO2024066597A1 true WO2024066597A1 (en) 2024-04-04

Family

ID=90475955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/104709 WO2024066597A1 (en) 2022-09-29 2023-06-30 Data storage method and apparatus

Country Status (2)

Country Link
CN (1) CN117827818A (en)
WO (1) WO2024066597A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150324135A1 (en) * 2014-05-06 2015-11-12 Netapp, Inc. Automatic storage system configuration based on workload monitoring
CN109800185A (en) * 2018-12-29 2019-05-24 上海霄云信息科技有限公司 A kind of data cache method in data-storage system
CN111475507A (en) * 2020-03-31 2020-07-31 浙江大学 Key value data indexing method for workload self-adaptive single-layer LSMT
WO2022126863A1 (en) * 2020-12-16 2022-06-23 跬云(上海)信息科技有限公司 Cloud orchestration system and method based on read-write separation and auto-scaling
CN114896250A (en) * 2022-05-19 2022-08-12 中国地质大学(北京) Key value separated key value storage engine index optimization method and device

Also Published As

Publication number Publication date
CN117827818A (en) 2024-04-05

Similar Documents

Publication Publication Date Title
US11797498B2 (en) Systems and methods of database tenant migration
US8396836B1 (en) System for mitigating file virtualization storage import latency
US7818515B1 (en) System and method for enforcing device grouping rules for storage virtualization
US8392372B2 (en) Methods and systems for snapshot reconstitution
US8874626B2 (en) Tracking files and directories related to unsuccessful change operations
US9805105B1 (en) Automatically creating multiple replication sessions in response to a single replication command entered by a user
US9317377B1 (en) Single-ended deduplication using cloud storage protocol
US7328287B1 (en) System and method for managing I/O access policies in a storage environment employing asymmetric distributed block virtualization
WO2024021488A1 (en) Metadata storage method and apparatus based on distributed key-value database
US10725971B2 (en) Consistent hashing configurations supporting multi-site replication
CN109684270A (en) Database filing method, apparatus, system, equipment and readable storage medium storing program for executing
US11294931B1 (en) Creating replicas from across storage groups of a time series database
WO2024066597A1 (en) Data storage method and apparatus
WO2016144987A1 (en) Architecture for large data management in communication applications through multiple mailboxes
US7293191B1 (en) System and method for managing I/O errors in a storage environment employing asymmetric distributed block virtualization
US11301417B1 (en) Stub file selection and migration
US10083225B2 (en) Dynamic alternate keys for use in file systems utilizing a keyed index
US11620194B1 (en) Managing failover between data streams
WO2024114284A1 (en) Cloud service-based transaction processing method and apparatus, and computing device cluster
US11853317B1 (en) Creating replicas using queries to a time series database
WO2024109415A1 (en) Database redistribution method and system, and device cluster and storage medium
WO2022032532A1 (en) Sharding for workflow applications in serverless architectures
US12007945B2 (en) Snapshot migration between cloud storage platforms
WO2024040902A1 (en) Data access method, distributed database system and computing device cluster
WO2024001280A1 (en) Data flow perception method and related apparatus

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23869845

Country of ref document: EP

Kind code of ref document: A1