US11106628B2 - Method, device and computer program product for storing metadata - Google Patents
- Publication number
- US11106628B2 (application US16/442,318)
- Authority
- US
- United States
- Prior art keywords
- key
- value pairs
- routine
- group
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/144—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/164—File meta data generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0674—Disk device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
Definitions
- Embodiments of the present disclosure generally relate to the field of data management, and more specifically, to a method, a device and a computer program product for storing metadata.
- Storage device providers offer various types of storage devices to store massive amounts of data. Users can conveniently store and read data via these storage devices.
- The amount of metadata for the user data also increases as the user data increases, so the storage of the metadata becomes more and more important. Accordingly, storage device providers design various storage structures for storing the metadata. On account of the importance of the metadata, how to store the metadata reasonably has become an issue to be addressed.
- Embodiments of the present disclosure provide a method, a device and a computer program product for storing metadata.
- In a first aspect, a method of storing metadata comprises determining, based on a set of metadata items to be stored, a first sequence and a second sequence, wherein the first sequence and the second sequence each include a plurality of key-value pairs, each key-value pair including a metadata item from the set of metadata items and a keyword corresponding to the metadata item.
- The method also comprises causing a first co-routine to utilize available computing resources to process the first sequence.
- The method further comprises, in response to an amount of computing resources available for the first co-routine to process the first sequence being below a first threshold, causing a second co-routine to process the second sequence, wherein the second co-routine is different from the first co-routine.
- In a second aspect, an electronic device for storing metadata is provided.
- The electronic device comprises a processor; and a memory having computer program instructions stored thereon, the processor executing the computer program instructions in the memory to control the electronic device to perform acts comprising: determining, based on a set of metadata items to be stored, a first sequence and a second sequence, wherein the first sequence and the second sequence each include a plurality of key-value pairs, each key-value pair including a metadata item from the set of metadata items and a keyword corresponding to the metadata item; causing a first co-routine to utilize available computing resources to process the first sequence; and in response to an amount of computing resources available for the first co-routine to process the first sequence being below a first threshold, causing a second co-routine to process the second sequence, wherein the second co-routine is different from the first co-routine.
- A computer program product is tangibly stored on a non-volatile computer-readable medium and comprises machine-executable instructions which, when executed, cause a machine to perform the steps of the method according to the first aspect of the present disclosure.
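As an illustrative aside, the key-value pair structure described in the first aspect can be sketched in Python; the `KeyValuePair` class and the `toy_keyword` extraction rule below are assumptions made for illustration, not structures defined by the patent.

```python
from dataclasses import dataclass

# Wrap each metadata item in a key-value pair: the key is a keyword
# derived from the item, the value is the metadata item itself.
@dataclass(frozen=True)
class KeyValuePair:
    key: int     # keyword corresponding to the metadata item
    value: str   # the metadata item

def build_pairs(items, keyword_of):
    """Form one key-value pair per metadata item in the set to be stored."""
    return [KeyValuePair(key=keyword_of(item), value=item) for item in items]

def toy_keyword(item):
    # Assumed keyword rule for this example: the numeric suffix of the item.
    return int(item.split(":")[1])

pairs = build_pairs(["inode:7", "inode:3"], toy_keyword)
```

A real storage engine would derive the keyword from the metadata itself; the split-on-colon rule is purely a stand-in.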
- FIG. 1 illustrates a schematic diagram of an example environment 100 where device and/or method according to embodiments of the present disclosure can be implemented;
- FIG. 2 illustrates a schematic diagram 200 for showing storage positions of the metadata in accordance with embodiments of the present disclosure;
- FIG. 3 illustrates a schematic diagram of a method 300 for storing metadata in accordance with embodiments of the present disclosure;
- FIG. 4 illustrates a schematic diagram of an example 400 for describing co-routine operation in accordance with embodiments of the present disclosure;
- FIG. 5 illustrates a schematic diagram of a procedure 500 for storing the metadata in accordance with embodiments of the present disclosure;
- FIG. 6 illustrates a schematic diagram of a further procedure 600 for storing the metadata in accordance with embodiments of the present disclosure;
- FIG. 7 illustrates a schematic diagram of an example 700 for describing storage positions of metadata in accordance with embodiments of the present disclosure; and
- FIG. 8 illustrates a schematic block diagram of an example device 800 for implementing embodiments of the present disclosure.
- The term “includes” and its variants are to be read as open-ended terms that mean “includes, but is not limited to.”
- The term “based on” is to be read as “based at least in part on.”
- The terms “one embodiment” and “this embodiment” are to be read as “at least one embodiment.”
- The terms “first”, “second” and so on can refer to the same or different objects. The text below may also include other explicit and implicit definitions.
- The metadata in the memory are usually stored into an external storage device.
- A particular data structure is used in the external storage device to store the metadata for reading and retrieval.
- A single thread is often utilized to store the metadata.
- Data blocks in data pages in the external storage device are first read into a storage engine, the metadata to be stored are then applied to those data blocks, and the data blocks are subsequently flushed to the external storage device.
- The above procedure involves interactions with the external storage device.
- The processor cannot be fully exploited during the input/output procedure because the procedure is relatively slow. Therefore, the speed of storing the metadata into the external storage device is lowered and the utilization rate of the computing resources is also reduced.
- In order to increase the speed of storing the metadata into the external storage device, multithreading is usually employed to process the storage of the metadata in parallel.
- Because the parallel threads compete for resources during data processing, the data pages to be processed by two threads must be locked for the sake of data consistency.
- This competition causes the computing resources to be spent excessively on handling these transactions, which accordingly reduces the benefits brought by parallel processing. Therefore, parallel multithreading also suffers from problems such as a low processor utilization rate and a slow storage speed.
- The present disclosure proposes a method for storing metadata, in which co-routines are adopted to process the metadata to be stored.
- A plurality of metadata items to be stored and the keywords of the metadata are divided into a plurality of sequences, which are subsequently allocated to a plurality of co-routines for processing.
- A data block associated with the keyword of the first metadata item in a sequence is identified as a critical data block.
- A co-routine updates the data blocks with the metadata and the keywords of the metadata, and stores the non-critical data blocks into the external storage device. After completing the storage of the non-critical data blocks, the co-routines store the critical data blocks uniformly.
- With this method, a further co-routine is called when a co-routine performs input/output on the data, such that the computing resources can be fully exploited. This enhances the efficiency of storing the metadata to the external storage device and further increases the data processing speed of the entire storage system.
- FIG. 1 illustrates a schematic diagram of an example environment 100 where a device and/or method according to embodiments of the present disclosure can be implemented.
- The example environment 100 includes a device 102, which obtains the metadata for the user data and stores the metadata of the user data in a memory 104.
- The device 102 can be implemented as any type of device, including but not limited to, a mobile phone (e.g., smartphone), laptop computer, Portable Digital Assistant (PDA), electronic book (e-book) reader, portable game console, portable media player, game machine, Set-Top-Box (STB), smart television (TV), personal computer, on-board computer (such as a navigation unit) and the like.
- The device 102 includes the memory 104 for saving the metadata.
- The metadata in the memory 104 are transferred to a storage device 108.
- The metadata are stored inside the memory 104 in the form of key-value pairs.
- The values in the key-value pairs correspond to the metadata and the keys are keywords acquired from the metadata.
- The key-value pairs stored in the memory 104 are sorted by key and then stored. The above examples are provided only for the purpose of explaining the present disclosure, rather than specifically restricting it.
- The metadata can be stored within the memory 104 in any suitable way.
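One plausible way to keep the in-memory key-value pairs sorted by key, as described above, is binary insertion into a list; the tuple representation used here is an assumption of this sketch.

```python
import bisect

# Minimal sketch of an in-memory store that keeps key-value pairs
# sorted by key, as the memory 104 is described as doing.
store = []

def put(key, value):
    """Insert a (key, value) pair while keeping the list sorted by key."""
    bisect.insort(store, (key, value))

# Pairs arrive out of order; the store stays sorted by key.
for k, v in [(0x30, "m1"), (0x10, "m2"), (0x20, "m3")]:
    put(k, v)
```

A production engine would likely use a skip list or balanced tree for O(log n) inserts; a sorted list is only the simplest structure that matches the description.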
- The device 102 also includes a storage engine 106, which stores the data in the memory 104 into the storage device 108.
- The storage engine 106 can perform a process consisting of a plurality of co-routines, such that the plurality of co-routines cooperate with each other in a non-preemptive way to complete the storage of the metadata.
- The storage engine 106 stores the metadata into the storage device 108 in various data structures.
- In one example, the data structure is a tree structure.
- The tree structure can be a B+ tree structure or a B-tree structure. The above examples are provided only for the purpose of explaining the present disclosure, rather than restricting it.
- The metadata can be stored in any suitable data structure.
- The storage procedure, which is depicted by taking the B+ tree structure as an example, includes three phases: path searching, data block updating and data block refreshing.
- Path searching finds the path from the root page down to the leaf page that contains the search keyword.
- The tree structure comprises a root page 202, index pages 204 and 206, and leaf pages 208, 210, 212, 214 and 216.
- The key 218 includes 0x10, pointing to the leaf page 208;
- The key 220 includes 0x30, pointing to the leaf page 210;
- The key 222 includes 0x06, pointing to the leaf page 216.
- Data block updating inserts, updates or deletes the key values in the data block of a page. When the size of the data block rises above or falls below a threshold, the page is split or combined accordingly.
- The updated data block is marked as dirty, e.g., pages 208, 210 and 214 in FIG. 2.
- Data block refreshing writes the dirty data blocks into the pages on the disks and updates the address references of the dirty data blocks in their parent pages.
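The path-searching and data-block-updating phases can be pictured on a toy B+ tree; the `Page` layout and the dirty-marking rule below are simplifications assumed for illustration, not the patent's own data layout.

```python
# Toy sketch of the first two phases on a simplified B+ tree.
class Page:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # separator keys (internal) or stored keys (leaf)
        self.children = children  # child pages, for internal pages only
        self.values = values      # key -> metadata, for leaf pages only
        self.dirty = False

def search_path(root, key):
    """Phase 1: walk from the root page down to the leaf covering `key`."""
    path, page = [], root
    while page.children is not None:
        # Follow the child whose key range covers `key`.
        idx = sum(1 for k in page.keys if k <= key)
        path.append(page)
        page = page.children[idx]
    path.append(page)
    return path

def update_leaf(path, key, value):
    """Phase 2: insert/update the key in the leaf and mark it dirty."""
    leaf = path[-1]
    leaf.values[key] = value
    leaf.dirty = True

leaf_a = Page(keys=[0x10], values={})
leaf_b = Page(keys=[0x30], values={})
root = Page(keys=[0x20], children=[leaf_a, leaf_b])

path = search_path(root, 0x30)
update_leaf(path, 0x30, "meta")
```

Phase 3 (refreshing) would then write the dirty leaf out and update the address reference in the root; splitting and combining on over/underflow are omitted for brevity.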
- The storage engine 106 can contain a processor, including but not limited to a hardware Central Processing Unit (CPU), Field Programmable Gate Array (FPGA), Complex Programmable Logic Device (CPLD), Application-Specific Integrated Circuit (ASIC), System-on-Chip (SOC), or combinations thereof.
- The storage device 108 stores the metadata output from the device 102 and can also provide the metadata to the device 102 after receiving a read request from the device 102.
- The schematic diagram of the example environment where the device and/or the method according to embodiments of the present disclosure can be implemented has been described above with reference to FIGS. 1 and 2.
- A method for storing the metadata according to the present disclosure is described in detail below with reference to FIGS. 3 and 4.
- The device 102 forms each obtained metadata item and the keyword corresponding to that metadata item into a key-value pair for storage.
- When a predetermined condition is met, the key-value pairs stored in the memory 104 are stored in the storage device 108. In one example, when the number of stored key-value pairs reaches a threshold number, the key-value pairs within the memory 104 are stored in the storage device 108. In another example, if the memory 104 lacks sufficient storage space for new key-value pairs, the key-value pairs within the memory 104 are stored in the storage device 108.
- The above examples are provided only for the purpose of describing the present disclosure, rather than specifically restricting it. The key-value pairs within the memory 104 can be stored in the storage device 108 based on any suitable conditions.
- A first sequence and a second sequence are determined based on a set of metadata items to be stored; the terms “first”, “second” and so on can refer to the same or different objects.
- The first sequence and the second sequence are two of a plurality of sequences.
- Each sequence includes a plurality of key-value pairs, and each key-value pair contains one metadata item in the set and the keyword corresponding to that metadata item.
- The plurality of key-value pairs corresponding to the set of metadata items is divided into a plurality of sequences, wherein the plurality of key-value pairs is sorted based on the keys therein.
- The above examples are provided only for the purpose of explaining the present disclosure, rather than specifically restricting it.
- The key-value pairs can be stored in any suitable way based on the requirements.
- The plurality of key-value pairs corresponding to the set of metadata items is divided evenly.
- The number of key-value pairs in each sequence is the same.
- The plurality of key-value pairs can be divided based on any suitable rules.
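An even division of the sorted key-value pairs into sequences, as described above, might look like the following; the contiguous-split strategy is one assumed option among the "suitable rules" the text allows.

```python
# Sketch: split the sorted key-value pairs into n contiguous sequences
# of (near-)equal size, one per co-routine.
def divide(pairs, n_sequences):
    size, rem = divmod(len(pairs), n_sequences)
    sequences, start = [], 0
    for i in range(n_sequences):
        # The first `rem` sequences absorb one extra pair each.
        end = start + size + (1 if i < rem else 0)
        sequences.append(pairs[start:end])
        start = end
    return sequences

pairs = sorted([(0x06, "a"), (0x10, "b"), (0x30, "c"), (0x42, "d")])
first_seq, second_seq = divide(pairs, 2)
```

Because the input is sorted by key, each sequence covers a disjoint key range, which is what lets each co-routine walk a mostly disjoint set of tree pages.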
- The first co-routine is caused to use the available computing resources to process the first sequence.
- The first co-routine processes the key-value pairs of the first sequence to store them in the storage device 108.
- The first co-routine can be any one of the plurality of co-routines. The above example is provided only for the purpose of describing the present disclosure, rather than specifically restricting it.
- It is then determined whether the amount of computing resources available for the first co-routine to process the first sequence is below a first threshold.
- When the first co-routine performs input/output operations associated with the first sequence for the storage device, this indicates that the computing resources utilized by the first co-routine are below the first predetermined threshold. In this situation, other co-routines can be called to process another sequence.
- When the number of input/output operations associated with the first sequence processed by the first co-routine exceeds a threshold number, it is determined that the computing resources used by the first co-routine are below the first threshold.
- A second sequence is then processed at block 308 by a second co-routine different from the first co-routine.
- The first co-routine is caused to cease processing the first sequence.
- FIG. 4 illustrates a schematic diagram of an example 400 for describing the co-routine operation in accordance with embodiments of the present disclosure.
- The first co-routine performs blocks 402, 406 and 410 while the second co-routine performs blocks 404, 408 and 412. If the first co-routine reads the data blocks in the pages from the storage device 108 (query page) at block 402 or block 406, this means that the amount of computing resources used by the first co-routine is below the first predetermined threshold, and the second co-routine is called to process the sequences of other key-value pairs.
- Likewise, when the first co-routine writes data into the pages in the storage device 108 at block 410, this indicates that the amount of computing resources used by the first co-routine is below the first predetermined threshold, and the second co-routine is called to process the sequences of other key-value pairs.
- Similarly, the second co-routine also calls the first co-routine for processing.
- The above examples are provided only for the purpose of describing the present disclosure, rather than specifically restricting it. In the presence of a plurality of sequences and a plurality of co-routines, any other suitable co-routine can be determined as the second co-routine based on the requirements or a preset rule.
- When the second co-routine is called to process a key-value pair sequence, it is required to confirm whether the second co-routine has completed its input/output operations for the storage device 108.
- The second co-routine can utilize the computing resources for data processing only when it has completed those input/output operations.
- The above examples are provided only for the purpose of describing the present disclosure, rather than specifically restricting it.
- The condition for executing the second co-routine can be set based on the requirements.
- Otherwise, the first co-routine can continue to process the first sequence. The computing resources are therefore rationally utilized through the calls between the co-routines.
- When a co-routine waits on input/output, the second co-routine can be called to utilize the computing resources to process data.
- The above operation improves the utilization rate of the computing resources and accelerates the storing of the metadata into the external storage device, which accordingly enhances the data processing capability of the entire storage system.
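The FIG. 4 style alternation can be imitated with Python generators standing in for co-routines: each yields whenever its (simulated) input/output begins, and a non-preemptive scheduler hands the computing resources to the other co-routine. All names and the switching rule are assumptions of this sketch.

```python
log = []

def co_routine(name, sequence):
    """A co-routine processes its pairs, yielding at each I/O point."""
    for key, value in sequence:
        log.append((name, "compute", key))  # CPU-bound update of the pair
        yield (name, key)                   # I/O begins: give up the CPU

def run(first, second):
    """Non-preemptive scheduler: switch to the peer at every I/O point."""
    current, other = first, second
    while True:
        try:
            next(current)
        except StopIteration:
            if other is None:
                break
            current, other = other, None    # drain the remaining co-routine
            continue
        if other is not None:
            current, other = other, current # ping-pong on I/O

run(co_routine("c1", [(1, "a"), (3, "b")]),
    co_routine("c2", [(2, "c"), (4, "d")]))
```

The log shows the two co-routines strictly alternating, which is the point: while one waits on I/O, the other uses the processor, with no locks needed because control transfers only at explicit yield points.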
- FIG. 5 illustrates a schematic diagram of a procedure 500 for storing the metadata in accordance with embodiments of the present disclosure.
- The co-routine receives an allocated sequence to be processed.
- The key-value pairs in the sequence are processed and stored into the storage device.
- An associated key-value pair in the storage device matching a first key-value pair in the first sequence is determined at block 502.
- A first keyword in the first key-value pair matches a second keyword in the associated key-value pair.
- In one example, the associated key-value pair is a key-value pair stored in the storage device that has the same keyword as the first key-value pair, i.e., the first keyword is identical to the second keyword.
- In another example, the associated key-value pair is a key-value pair stored in the storage device that has a keyword close to the keyword in the first key-value pair, i.e., the first keyword is close to the second keyword.
- The key-value pairs for the metadata items are stored in the form of a tree structure.
- The tree structure can be a B+ tree or a B-tree.
- The associated key-value pairs include a key-value pair matching the key of the first key-value pair in the leaf page storing the metadata, and also include the key-value pairs matching that key in all index pages traversed during the procedure of finding the leaf page.
- The storage structure of the key-value pairs for the metadata items is described by taking the B+ tree in FIG. 7 as an example.
- In FIG. 7, there are a root page 702, index pages 704, 706, 708, 710, 712, 714 and 716, and underlying leaf pages including leaf pages 718, 720 and 722.
- The leaf pages are used for storing the key-value pairs including the metadata.
- The root page 702 and the index pages 704, 706, 708, 710, 712, 714 and 716 store index mappings to the leaf pages.
- Key-value pairs to be stored for the metadata in the computing device 102 can be divided into a plurality of sequences.
- The key-value pairs to be stored for the metadata items in FIG. 7 are divided into three sequences 724, 726 and 728.
- The above examples are provided only for the purpose of explaining the present disclosure, rather than restricting it. Any suitable number of sequences can be set based on the requirements.
- Associated key-value pairs matching the first key-value pair of the sequence 724 exist on the leaf page 718, the index pages 708 and 704 and the root page 702.
- Associated key-value pairs matching the first key-value pair of the sequence 726 are present on the leaf page 720, the index pages 710 and 704 and the root page 702.
- Associated key-value pairs matching the first key-value pair of the sequence 728 exist on the leaf page 722, the index pages 714 and 706 and the root page 702.
- A first set of key-value pairs from the critical storage pages in the storage device is obtained at block 504, wherein the critical storage pages include the associated key-value pairs.
- A page having an associated key-value pair is marked as a critical page.
- The stored first set of key-value pairs is acquired from the above critical pages.
- The co-routine processing the first sequence may not mark the pages where its associated key-value pairs belong as critical pages.
- The pages 720, 722, 710, 714, 704, 706 and 702 associated with the second sequence 726 and the third sequence 728 are marked as critical pages.
- The first set of key-value pairs is obtained from the pages 720, 722, 710, 714, 704, 706 and 702, whereas the pages 718 and 708 associated with the first sequence 724 are not marked as critical pages.
- The first set of key-value pairs is updated with the first key-value pair at block 506.
- The first set of key-value pairs is updated with the first key-value pair, and the update operation includes insertion, modification and deletion.
- When processing the metadata, the co-routines do not adjust the size of the first set of key-value pairs in the critical storage pages, i.e., they neither split nor combine the first set of key-value pairs in the critical pages. Meanwhile, the co-routines do not store the first set of key-value pairs in the critical storage pages into the storage device 108.
- The first co-routine and the second co-routine complete the processing of the first sequence and the second sequence.
- The updated first set of key-value pairs is stored into the critical storage pages at block 510.
- Next, the size of the first set of key-value pairs is adjusted. In one example, it is judged whether the data amount of the updated first set of key-value pairs for the critical storage pages is between the data amount of the first page and the data amount of the second page. If the data amount of the updated first set of key-value pairs exceeds the data amount of the first page, the updated first set of key-value pairs must be split. If the data amount of the updated first set of key-value pairs is below the data amount of the second page, the key-value pair set must be combined.
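The split/combine decision for the updated key-value pair set can be sketched as a simple threshold check; the capacity values and the halving rule below are assumed for illustration, not taken from the patent.

```python
# Assumed thresholds: a page overflows above PAGE_CAPACITY pairs and
# underflows below PAGE_MINIMUM pairs.
PAGE_CAPACITY = 4
PAGE_MINIMUM = 2

def adjust(kv_set):
    """Return the list of pages the set occupies after split/merge checks."""
    if len(kv_set) > PAGE_CAPACITY:
        mid = len(kv_set) // 2
        return [kv_set[:mid], kv_set[mid:]]  # overflow: split into two pages
    if len(kv_set) < PAGE_MINIMUM:
        return []                            # underflow: combine with a sibling
    return [kv_set]                          # within bounds: keep as one page

split = adjust([(k, "v") for k in range(5)])  # 5 pairs > capacity
keep = adjust([(1, "v"), (2, "v")])           # within bounds
merge = adjust([(1, "v")])                    # below minimum
```

In a real engine the thresholds would be byte sizes of the page, not pair counts, and the underflow case would borrow from or merge into a neighbouring page rather than return an empty list.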
- The above examples are provided only for the purpose of explaining the present disclosure, rather than specifically restricting it. Conditions for adjusting the size of the key-value pair set and a threshold amount for the adjustment can be set based on the requirements.
- The updated key-value pair sets for the critical storage pages in each co-routine are stored into the storage device.
- The key-value pair set for the metadata is first stored into the leaf critical storage pages during the procedure of storing the key-value pair set. Additionally or alternatively, the position of the leaf critical storage page may differ from its position when the key-value pair set was read. Afterwards, the address of the leaf critical page is returned to the storage engine 106 for updating a first-level index page associated with the leaf page. When the first-level index page associated with the leaf page has been updated, the first-level index page is stored and a second-level index page is updated with the address of the first-level index page, and so on until all pages are stored.
- A shared key-value pair set, which may be used by two co-routines, is stored in the memory 104 rather than being read from the external storage device by each co-routine separately.
- This operation ensures the consistency of data processing. Moreover, it enables a co-routine to know that the key-value pair set has been processed by another co-routine, which avoids inconsistency in information processing.
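Keeping one shared in-memory copy of a critical page's key-value set, so that two co-routines never read it from the external storage device twice, might be sketched as follows; the cache structure and function names are assumptions.

```python
# Shared cache: page id -> the single in-memory key-value set copy.
critical_pages = {}

def load_page(page_id, read_from_disk, mark_critical):
    """Return the page's key-value set, reading from disk at most once
    for critical pages so every co-routine sees the same copy."""
    if page_id in critical_pages:
        return critical_pages[page_id]    # reuse the shared copy
    kv_set = read_from_disk(page_id)
    if mark_critical:
        critical_pages[page_id] = kv_set  # its flush is deferred to the end
    return kv_set

reads = []
def fake_disk(page_id):
    reads.append(page_id)                 # count actual disk accesses
    return {page_id: []}

a = load_page(702, fake_disk, mark_critical=True)   # first co-routine
b = load_page(702, fake_disk, mark_critical=True)   # second co-routine
```

Because control transfers between co-routines only at yield points, no lock is needed around the cache dictionary in this single-threaded model.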
- A group of associated key-value pairs in the storage device matching the remaining key-value pairs in the first sequence, other than the first key-value pair, is determined at block 602.
- The keywords in the remaining key-value pairs respectively match the keywords in the group of associated key-value pairs.
- The associated key-value pairs can be determined from the key-value pair sets stored in the memory 104.
- A group of storage pages to be updated is obtained from the storage device, where the group of storage pages includes the group of associated key-value pairs and excludes the critical storage pages. If the key-value pairs in the group of associated key-value pairs are located on the critical storage pages, new storage pages are no longer required. If they are not located on the critical storage pages, a group of storage pages to be updated must be determined.
- The key-value pair sets obtained from the group of storage pages to be updated are updated with the remaining key-value pairs.
- The key-value pairs of the remaining key-value pairs that are not on the critical pages are used to update the key-value pair sets obtained from the group of storage pages to be updated.
- The critical pages are directly updated using the key-value pairs of the remaining key-value pairs that are on the critical pages.
- the updated key-value pair sets are respectively stored into a group of target storage pages.
- if the updated key-value pair sets are not key-value pair sets of the critical pages, it can be determined, after the update is finished, whether the data amount of the updated key-value pair sets is between the data amount of the first page and the data amount of the second page. If the data amount of an updated key-value pair set exceeds the data amount of the first page, the key-value pair set needs to be divided. If the data amount of an updated key-value pair set is below the data amount of the second page, the key-value pair set needs to be combined.
- the above examples are provided only for the purpose of explaining the present disclosure, rather than specifically restricting it. Conditions for adjusting the size of the key-value pair set and a threshold amount for adjustment can be set based on the requirements.
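As a concrete illustration of that size check, the sketch below splits an oversized set and flags an undersized one for merging. The bounds are arbitrary example values, consistent with the passage's note that thresholds can be set based on the requirements.

```python
# Illustrative size check after an update. UPPER_BOUND stands in for the
# data amount of the "first page" and LOWER_BOUND for that of the "second
# page"; both values are assumptions chosen only for this example.

UPPER_BOUND = 8
LOWER_BOUND = 2

def adjust(pair_set, upper=UPPER_BOUND, lower=LOWER_BOUND):
    """Split an oversized set in half, or flag an undersized one for merging."""
    if len(pair_set) > upper:
        mid = len(pair_set) // 2
        return "split", [pair_set[:mid], pair_set[mid:]]
    if len(pair_set) < lower:
        return "combine", [pair_set]   # caller merges it with a sibling set
    return "keep", [pair_set]
```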
- the updated key-value pair sets are stored in a queue. If the amount of the updated key-value pair sets exceeds a threshold amount, the updated key-value pair sets are flushed to the external storage device.
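The queueing behavior above can be sketched as a small buffer that flushes once its size exceeds a threshold. The class name, threshold value, and `flush` callback are illustrative assumptions, not part of the disclosure.

```python
from collections import deque

# Minimal sketch of the flush queue: updated key-value pair sets accumulate
# in a queue and are flushed to the external storage device once their
# count exceeds the threshold.

class FlushQueue:
    def __init__(self, threshold, flush):
        self.threshold = threshold   # amount that triggers a flush
        self.flush = flush           # callback that writes a batch to storage
        self.queue = deque()

    def add(self, pair_set):
        self.queue.append(pair_set)
        if len(self.queue) > self.threshold:
            batch = list(self.queue)
            self.queue.clear()
            self.flush(batch)        # flush the batch to external storage
```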
- the storage device is caused to utilize addresses of a group of target storage pages to update addresses of a group of storage pages to be updated.
- the corresponding metadata are stored in these target data pages.
- the addresses of the storage pages to be updated in the storage device are updated as addresses of the target storage pages.
- the updated key-value pair sets associated with the leaf page are first flushed to the target data pages during the procedure of storing the updated key-value pair sets.
- the addresses of the target data pages are returned to the storage engine 106 , which updates a first-layer index page with the addresses and then stores the first-layer index page. Afterwards, a second-layer index page is updated with the address of the first-layer index page, and the storage pages at each layer are stored sequentially until the storage procedure is completed.
- the above storage procedure can store the data in the non-critical pages directly into the storage device without coordination between the co-routines, which accelerates the procedure of data storage and enhances the storage efficiency.
- FIG. 8 illustrates a schematic block diagram of an example device 800 for implementing embodiments of the present disclosure.
- the device 800 includes a central processing unit (CPU) 801 , which can execute various suitable actions and processing based on computer program instructions stored in the read-only memory (ROM) 802 or computer program instructions loaded into the random-access memory (RAM) 803 from a storage unit 808 .
- the RAM 803 can also store all kinds of programs and data required by the operations of the device 800 .
- CPU 801 , ROM 802 and RAM 803 are connected to each other via a bus 804 .
- the input/output (I/O) interface 805 is also connected to the bus 804 .
- a plurality of components in the device 800 are connected to the I/O interface 805 , including: an input unit 806 , such as a keyboard, a mouse and the like; an output unit 807 , e.g., various kinds of displays and loudspeakers; a storage unit 808 , such as a disk, an optical disk, etc.; and a communication unit 809 , such as a network card, a modem, a wireless transceiver and the like.
- the communication unit 809 allows the device 800 to exchange information/data with other devices via the computer network, such as Internet, and/or various telecommunication networks.
- methods 300 , 500 and 600 can be executed by the processing unit 801 .
- methods 300 , 500 and 600 can be implemented as computer software programs tangibly included in the machine-readable medium, such as storage unit 808 .
- the computer program can be partially or fully loaded and/or mounted to the device 800 via ROM 802 and/or communication unit 809 .
- when the computer program is loaded into RAM 803 and executed by the CPU 801 , one or more actions of the above-described methods 300 , 500 and 600 can be executed.
- embodiments of the present disclosure provide a method, a device and/or a computer program product.
- the computer program product can include a computer-readable storage medium loaded with computer-readable program instructions for executing various aspects of the present disclosure.
- the computer-readable storage medium can be a tangible apparatus that maintains and stores instructions utilized by the instruction executing apparatuses.
- the computer-readable storage medium can be, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device or any appropriate combination of the above.
- the computer-readable storage medium includes: a portable computer disk, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random-access memory (SRAM), a portable compact disk read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, mechanical coding devices such as a punched card or raised structures in a groove with instructions stored thereon, and any appropriate combinations of the above.
- the computer-readable storage medium utilized here is not interpreted as transient signals per se, such as radio waves or freely propagated electromagnetic waves, electromagnetic waves propagated via waveguide or other transmission media (such as optical pulses via fiber-optic cables), or electric signals propagated via electric wires.
- the described computer-readable program instruction herein can be downloaded from the computer-readable storage medium to each computing/processing device, or to an external computer or external storage via Internet, local area network, wide area network and/or wireless network.
- the network can include copper-transmitted cable, optical fiber transmission, wireless transmission, router, firewall, switch, network gate computer and/or edge server.
- the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in the computer-readable storage medium of each computing/processing device.
- the computer program instructions for executing operations of the present disclosure can be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, wherein the programming languages include object-oriented programming languages, such as Smalltalk, C++ and the like, and conventional procedural programming languages, e.g., the C language or similar programming languages.
- the computer-readable program instructions can be implemented fully on the user computer, partially on the user computer, as an independent software package, partially on the user computer and partially on the remote computer, or completely on the remote computer or server.
- the remote computer can be connected to the user computer via any type of network, including a local area network (LAN) and a wide area network (WAN), or to an external computer (e.g., connected via the Internet using an Internet service provider).
- state information of the computer-readable program instructions is used to customize an electronic circuit, e.g., programmable logic circuit, field programmable gate array (FPGA) or programmable logic array (PLA).
- the electronic circuit can execute computer-readable program instructions to implement various aspects of the present disclosure.
- the computer-readable program instructions can be provided to the processing unit of general-purpose computer, dedicated computer or other programmable data processing apparatuses to manufacture a machine, such that the instructions that, when executed by the processing unit of the computer or other programmable data processing apparatuses, generate an apparatus for implementing functions/actions stipulated in one or more blocks in the flow chart and/or block diagram.
- the computer-readable program instructions can also be stored in the computer-readable storage medium and cause the computer, programmable data processing apparatus and/or other devices to work in a particular manner, such that the computer-readable medium stored with instructions contains an article of manufacture, including instructions for implementing various aspects of the functions/actions stipulated in one or more blocks of the flow chart and/or block diagram.
- the computer-readable program instructions can also be loaded into computer, other programmable data processing apparatuses or other devices, so as to execute a series of operation steps on the computer, other programmable data processing apparatuses or other devices to generate a computer-implemented procedure. Therefore, the instructions executed on the computer, other programmable data processing apparatuses or other devices implement functions/actions stipulated in one or more blocks of the flow chart and/or block diagram.
- each block in the flow chart or block diagram can represent a module, a program segment, or a portion of code, which includes one or more executable instructions for performing the stipulated logic functions.
- the functions indicated in the blocks can also occur in an order different from the one indicated in the drawings. For example, two successive blocks can in fact be executed substantially in parallel, or sometimes in a reverse order, depending on the functions involved.
- each block in the block diagram and/or flow chart and combinations of the blocks in the block diagram and/or flow chart can be implemented by a hardware-based system exclusive for executing stipulated functions or actions, or by a combination of dedicated hardware and computer instructions.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811302936.4A CN111143232B (en) | 2018-11-02 | 2018-11-02 | Method, apparatus and computer readable medium for storing metadata |
CN201811302936.4 | 2018-11-02 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20200142861A1 US20200142861A1 (en) | 2020-05-07 |
US11106628B2 true US11106628B2 (en) | 2021-08-31 |
Family
ID=70459584
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/442,318 Active 2039-12-11 US11106628B2 (en) | 2018-11-02 | 2019-06-14 | Method, device and computer program product for storing metadata |
Country Status (2)
Country | Link |
---|---|
US (1) | US11106628B2 (en) |
CN (1) | CN111143232B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11392547B2 (en) * | 2020-04-09 | 2022-07-19 | Micron Technology, Inc. | Using prefix-delete operations for data containers |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120317579A1 (en) * | 2011-06-13 | 2012-12-13 | Huan Liu | System and method for performing distributed parallel processing tasks in a spot market |
US20140075032A1 (en) * | 2012-09-07 | 2014-03-13 | Oracle International Corporation | Declarative and extensible model for provisioning of cloud based services |
US20140289225A1 (en) * | 2013-03-21 | 2014-09-25 | Nextbit Systems Inc. | Prioritizing downloading of image files |
US8938416B1 (en) * | 2012-01-13 | 2015-01-20 | Amazon Technologies, Inc. | Distributed storage of aggregated data |
US20150134602A1 (en) * | 2013-11-14 | 2015-05-14 | Facebook, Inc. | Atomic update operations in a data storage system |
US20150324406A1 (en) * | 2012-11-21 | 2015-11-12 | International Business Machines Corporation | Managing replicated data |
US9497136B1 (en) * | 2011-09-28 | 2016-11-15 | Emc Corporation | Method and system for providing usage metrics to manage utilzation of cloud computing resources |
US20170006135A1 (en) * | 2015-01-23 | 2017-01-05 | C3, Inc. | Systems, methods, and devices for an enterprise internet-of-things application development platform |
US20190227853A1 (en) * | 2016-09-30 | 2019-07-25 | Huawei Technologies Co., Ltd. | Resource Allocation Method, Related Device And System |
US20200110540A1 (en) * | 2016-09-13 | 2020-04-09 | Netapp, Inc. | Systems and Methods for Allocating Data Compression Activities in a Storage System |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731569B (en) * | 2013-12-23 | 2018-04-10 | 华为技术有限公司 | A kind of data processing method and relevant device |
CN103902702B (en) * | 2014-03-31 | 2017-11-28 | 北京皮尔布莱尼软件有限公司 | A kind of data-storage system and storage method |
EP2955629B1 (en) * | 2014-06-11 | 2021-10-27 | Home Control Singapore Pte. Ltd. | System for installing new firmware on a small-memory device |
- 2018-11-02: CN application CN201811302936.4A, patent CN111143232B (active)
- 2019-06-14: US application US16/442,318, patent US11106628B2 (active)
Also Published As
Publication number | Publication date |
---|---|
CN111143232A (en) | 2020-05-12 |
US20200142861A1 (en) | 2020-05-07 |
CN111143232B (en) | 2023-08-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WU, AARON YURUN;WU, GARY JIALEI;SUN, AO;REEL/FRAME:049478/0663 Effective date: 20190507 |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NORTH CAROLINA Free format text: SECURITY AGREEMENT;ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:050406/0421 Effective date: 20190917 |
|
AS | Assignment |
Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS Free format text: PATENT SECURITY AGREEMENT (NOTES);ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:050724/0571 Effective date: 20191010 |
|
AS | Assignment |
Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001 Effective date: 20200409 |
|
AS | Assignment |
Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053311/0169 Effective date: 20200603 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825 Effective date: 20211101 Owner name: EMC CORPORATION, MASSACHUSETTS Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825 Effective date: 20211101 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825 Effective date: 20211101 |
|
AS | Assignment |
Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088 Effective date: 20220329 Owner name: EMC CORPORATION, MASSACHUSETTS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088 Effective date: 20220329 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088 Effective date: 20220329 Owner name: EMC IP HOLDING COMPANY LLC, TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 Owner name: EMC CORPORATION, MASSACHUSETTS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 Owner name: DELL PRODUCTS L.P., TEXAS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742 Effective date: 20220329 |