WO2022199400A1 - Retrieval method and apparatus for persistent memory file system metadata, and storage structure - Google Patents
- Publication number
- WO2022199400A1 (PCT/CN2022/080367)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- path
- storage
- retrieved
- hash value
- metadata
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
- G06F16/152—File search processing using file content signatures, e.g. hash values
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/137—Hash-based
Definitions
- This application relates to, but is not limited to, the technical field of data processing.
- A storage device is a device used to store information, usually by digitizing the information and then storing it on electrical, magnetic or optical media.
- Persistent memory (PMEM), also known as non-volatile memory (NVM), offers performance similar to dynamic random access memory (DRAM) while also storing data persistently, as a disk does. As a result, more and more file systems store their metadata in PMEM.
- the present application provides a method and device for retrieving metadata of a persistent memory file system, and a storage structure.
- The present application provides a method for retrieving metadata of a persistent memory file system, including: determining a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is used to look up the storage path of metadata; determining the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, a fingerprint database and storage path hash values, where the fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of metadata; and obtaining the metadata from the storage address corresponding to the path to be retrieved.
- The present application provides an apparatus for retrieving metadata of a persistent memory file system, including: a hash value determination module configured to determine a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is used to look up the storage path of metadata; an address determination module configured to determine the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database and the storage path hash values, where the fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of metadata; and an obtaining module configured to obtain the metadata from the storage address corresponding to the path to be retrieved.
- The present application provides a metadata storage structure, which is applied to any method for retrieving metadata of a persistent memory file system in the present application. The storage structure includes: a storage address data bit, configured to record a storage address, where the storage address is used to store metadata and corresponds to a storage path; a storage path hash value data bit, configured to record a storage path hash value, where the storage path hash value is a hash value determined based on the storage path and records the features of the storage path; and a fingerprint database data bit, configured to record the fingerprint database, which in turn records the features of the storage path hash value.
- The present application provides an electronic device, comprising: one or more processors; and a memory on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement any one of the methods for retrieving metadata of a persistent memory file system in the present application.
- The present application provides a readable storage medium in which a computer program is stored. When the computer program is executed by a processor, any one of the methods for retrieving metadata of a persistent memory file system in the present application is implemented.
- FIG. 1 shows a schematic diagram of metadata search using the DHT algorithm in this application.
- FIG. 2 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 3 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 4 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 5 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 6 is a block diagram showing the composition of the metadata storage structure of the persistent memory file system provided by the present application.
- FIG. 7 is a block diagram showing the composition of the storage structure of each node provided by the present application.
- FIG. 8 is a block diagram showing the composition of the metadata storage structure of the persistent memory file system provided by the present application.
- FIG. 9 shows a block diagram of the composition of the apparatus for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 10 shows a block diagram of the composition of the apparatus for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 11 shows a schematic flowchart of the method for acquiring each data bit of the fingerprint database provided by the present application.
- FIG. 12 shows a schematic diagram of the storage format of the hash table of each preset node of the persistent memory file system provided by the present application.
- FIG. 13 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- FIG. 14 shows a schematic structural diagram of a method for obtaining a storage address corresponding to a path to be retrieved provided by the present application.
- FIG. 15 shows a schematic flowchart of a method for obtaining a storage address corresponding to a path to be retrieved provided by the present application.
- FIG. 16 shows a schematic flowchart of a method for deleting metadata in a persistent memory file system provided by the present application.
- FIG. 17 shows a schematic flowchart of a method for adding metadata in a persistent memory file system provided by the present application.
- FIG. 18 shows a structural diagram of an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for retrieving persistent memory file system metadata provided by the present application.
- FIG. 1 shows a schematic diagram of metadata search using the distributed hash table (DHT) algorithm.
- In the DHT algorithm, multiple nodes share a hash table.
- The hash value to be retrieved is determined according to the path to be retrieved, and the value corresponding to that hash value is then looked up on the node whose value range contains it.
- Each node covers a value range of storage path hash values: node 1 covers (0, 255), node 2 covers (256, 511), node 3 covers (512, 767), and node 4 covers (768, 1023).
- For example, if the calculated hash value to be retrieved is 465, it falls in the value range of node 2.
- If the hash values to be retrieved that are determined from different paths to be retrieved are the same, there is a hash conflict among the paths to be retrieved that correspond to that hash value.
- In this scheme, a linked-list storage structure is used to store metadata, and when the file system holds a large number of files, hash collisions occur frequently. Retrieving metadata with the existing linked address method is inefficient, and even more so when hash collisions occur.
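The range-based node lookup and the linked address method described for FIG. 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `NODE_RANGES`, `find_node` and `chained_lookup` are hypothetical, and the ranges are assumed to be inclusive.

```python
# Value ranges from the FIG. 1 example: node id -> (low, high), assumed inclusive.
NODE_RANGES = {1: (0, 255), 2: (256, 511), 3: (512, 767), 4: (768, 1023)}


def find_node(hash_value):
    """Return the id of the node whose range contains hash_value, or None."""
    for node_id, (low, high) in NODE_RANGES.items():
        if low <= hash_value <= high:
            return node_id
    return None


def chained_lookup(bucket, path):
    """Linked address method: walk every colliding entry until the path matches.

    bucket is a list of (stored_path, metadata) pairs that share one hash value.
    The cost grows linearly with the number of collisions.
    """
    for stored_path, metadata in bucket:
        if stored_path == path:
            return metadata
    return None
```

The linear walk in `chained_lookup` is the cost that the fingerprint-based scheme in the rest of the description is meant to reduce.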
- FIG. 2 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- The method for retrieving persistent memory file system metadata can be applied to an apparatus for retrieving persistent memory file system metadata.
- The method for retrieving metadata of a persistent memory file system may include the following steps S201 to S203.
- In step S201, the hash value to be retrieved is determined based on the path to be retrieved.
- The path to be retrieved is used to look up the storage path of metadata; it may be the storage path corresponding to a file or the directory information of the file.
- The above is only an example of the path to be retrieved, which can be set according to the situation; other paths to be retrieved that are not described here are also within the protection scope of this application and are not repeated here.
- The hash value to be retrieved is obtained by applying a hash algorithm to the path to be retrieved.
- For example, the hash value to be retrieved is represented by the hexadecimal number 0xff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009.
- The hash algorithm may be any one of the Secure Hash Algorithm (SHA) variants SHA-224, SHA-256, SHA-384 and SHA-512.
- For example, the SHA-256 algorithm generates a hash value that is a 256-bit binary number.
- By using a suitable hash algorithm (for example, SHA-256) to obtain the hash value to be retrieved, complex paths do not need to be maintained, and different paths can be distinguished by their different hash values, which simplifies the processing of the path to be retrieved.
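Step S201 can be illustrated with Python's standard `hashlib` module. This is a sketch under assumptions: the function name `path_hash` and the UTF-8 encoding of the path are illustrative choices, not details given in the text.

```python
import hashlib


def path_hash(path: str) -> int:
    """Hash a path with SHA-256 and return the 256-bit digest as an integer."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()  # 32 bytes
    return int.from_bytes(digest, "big")


def format_hash(hash_value: int) -> str:
    """Render the hash as 16 groups of 4 hex digits, as in the example above."""
    hex_digits = f"{hash_value:064x}"
    return "0x" + " ".join(hex_digits[i:i + 4] for i in range(0, 64, 4))
```

Because the digest has a fixed length, the retrieval steps that follow can work on fixed bit fields of the hash instead of parsing a variable-length path.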
- In step S202, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database and the storage path hash value.
- The fingerprint database is configured to record the features of the storage path hash value, and the storage path hash value is the hash value corresponding to the storage path of the metadata. Through the fingerprint database, the storage path hash value matching the hash value to be retrieved can be found quickly, which improves the search efficiency for metadata.
- For example, a partial segment of the storage path hash value is used as the feature of the storage path hash value, and this feature is extracted and stored in the fingerprint database.
- Through the fingerprint database, the storage path hash value can be quickly narrowed to a certain range, and the storage address can then be determined based on the hash value to be retrieved and the storage path hash value (for example, when the hash value to be retrieved and the storage path hash value are the same, the storage address of the storage path corresponding to that storage path hash value is determined to be the storage address corresponding to the hash value to be retrieved). This improves the retrieval speed for the path to be retrieved. In other words, the approximate location of the path to be retrieved is first screened through the fingerprint database, and the hash value to be retrieved is then compared in detail to finally determine the storage address corresponding to the path to be retrieved, which further ensures the accuracy of the storage address found.
- In step S203, metadata is obtained from the storage address corresponding to the path to be retrieved.
- Because the storage location of the metadata is accurately determined, the metadata can be obtained from the storage address corresponding to the path to be retrieved, which ensures the accuracy of the retrieval result.
- In this embodiment, the hash value to be retrieved is determined based on the path to be retrieved, and the path to be retrieved is used to look up the storage path of metadata, which facilitates the processing of the path to be retrieved. The storage address corresponding to the path to be retrieved is determined based on the hash value to be retrieved, the fingerprint database and the storage path hash value, where the fingerprint database is configured to record the features of the storage path hash value, and the storage path hash value is the hash value corresponding to the storage path of the metadata. Screening the storage addresses with the fingerprint database first, and then using the hash value to be retrieved and the storage path hash value to determine the storage address, speeds up retrieval, improves retrieval efficiency, and ensures the accuracy of the retrieval results.
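Steps S201 to S203 can be sketched end to end as below. Several details are assumptions made only for illustration: the fingerprint is taken to be the top 16 bits of the hash (the text only says "a partial segment"), entries are kept in a flat list, and `storage` is a plain dictionary standing in for persistent memory addresses.

```python
import hashlib

FP_BITS = 16  # assumption: the fingerprint is the top 16 bits of the 256-bit hash


def path_hash(path):
    return int.from_bytes(hashlib.sha256(path.encode("utf-8")).digest(), "big")


def fingerprint(hash_value):
    return hash_value >> (256 - FP_BITS)


def build_entries(paths_to_addresses):
    """Build (fingerprint, full hash, address) records for the stored paths."""
    entries = []
    for path, address in paths_to_addresses.items():
        h = path_hash(path)
        entries.append((fingerprint(h), h, address))
    return entries


def retrieve(path, entries, storage):
    target = path_hash(path)                  # S201: hash the path to be retrieved
    target_fp = fingerprint(target)
    for entry_fp, entry_hash, address in entries:
        if entry_fp != target_fp:             # S202a: cheap fingerprint screening
            continue
        if entry_hash == target:              # S202b: confirm with the full hash
            return storage[address]           # S203: read metadata at the address
    return None
```

The short fingerprint comparison rejects most non-matching entries, so the full 256-bit comparison runs only on the few candidates that survive screening.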
- FIG. 3 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- The method for retrieving persistent memory file system metadata provided by the present application may include the following steps S301 to S306.
- In step S301, the hash value to be retrieved is determined based on the path to be retrieved.
- Step S301 is the same as step S201 described with reference to FIG. 2, and details are not repeated here.
- In step S302, the node to be retrieved is determined according to the hash value to be retrieved and the number of preset nodes.
- The node to be retrieved includes storage items; each storage item includes at least one path group, and each path group includes a fingerprint database.
- For example, the node to be retrieved stores a hash table whose length is 1M, that is, the hash table can store 1024*1024 storage items. Each storage item includes at least one path group, and each path group includes a fingerprint database, storage paths and their corresponding metadata.
- The features of the storage path hash values are recorded in the fingerprint database.
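The node, storage item and path group hierarchy can be modeled roughly as follows. This is a hedged sketch: the class and field names are hypothetical, the 16 slots per path group are an assumption, and the sparse dictionary stands in for a preallocated 1M-entry table in persistent memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

TABLE_LENGTH = 1024 * 1024  # 1M storage items per node, as stated in the text
SLOTS_PER_GROUP = 16        # assumption: 16 storage paths per path group


def _empty_slots() -> list:
    return [None] * SLOTS_PER_GROUP


@dataclass
class PathGroup:
    # One entry per storage path in the group.
    fingerprints: List[Optional[int]] = field(default_factory=_empty_slots)
    path_hashes: List[Optional[int]] = field(default_factory=_empty_slots)
    metadata: List[Optional[bytes]] = field(default_factory=_empty_slots)


@dataclass
class StorageItem:
    # Every storage item holds at least one path group.
    groups: List[PathGroup] = field(default_factory=lambda: [PathGroup()])


@dataclass
class Node:
    # Sparse stand-in for the node's hash table.
    table: Dict[int, StorageItem] = field(default_factory=dict)

    def item(self, index: int) -> StorageItem:
        if not 0 <= index < TABLE_LENGTH:
            raise IndexError("storage item index out of range")
        return self.table.setdefault(index, StorageItem())
```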
- The hash value to be retrieved includes the information that identifies the node to be retrieved. The hash value corresponding to each preset node can be calculated from the number of preset nodes, and the hash value to be retrieved is then compared with these preset nodes respectively. This simplifies the processing of the path to be retrieved and speeds up the search for the node to be retrieved.
- Determining the node to be retrieved according to the hash value to be retrieved and the number of preset nodes includes: obtaining a node information value based on the hash value to be retrieved; and taking the remainder of the node information value divided by the number of preset nodes to determine the node to be retrieved.
- For example, the first 32 bits of the hash value to be retrieved are preset to represent the node information value. If the hash value to be retrieved is 0xff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009, its first 32 bits (i.e., 0xff3a 11fc) are extracted and converted to the decimal number 4281995772. If the number of preset nodes is 4, the remainder of 4281995772 divided by 4 is 0, which means that the node to be retrieved is node 0.
- In this way, the node to be retrieved can be determined quickly, which greatly improves the retrieval speed of the node to be retrieved, avoids maintaining and processing the path to be retrieved, and simplifies the processing.
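The node selection in step S302 is a bit-field extraction followed by a modulo. The sketch below reproduces the worked example; `node_index` and `EXAMPLE_HASH` are illustrative names, and the hash is assumed to be 256 bits wide.

```python
HASH_BITS = 256       # assumption: SHA-256 sized hash
NODE_INFO_BITS = 32   # the first 32 bits carry the node information value

# The example hash from the text, written as 16 groups of 4 hex digits.
EXAMPLE_HASH = int("".join(
    "ff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 "
    "ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009".split()), 16)


def node_info_value(hash_value: int) -> int:
    """Return the most significant 32 bits of the hash."""
    return hash_value >> (HASH_BITS - NODE_INFO_BITS)


def node_index(hash_value: int, num_nodes: int) -> int:
    """Pick the node as (node information value) mod (number of preset nodes)."""
    return node_info_value(hash_value) % num_nodes
```

With these definitions, `node_info_value(EXAMPLE_HASH)` is 4281995772 and `node_index(EXAMPLE_HASH, 4)` is 0, matching the example in the text.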
- In step S303, the storage item corresponding to the path to be retrieved is determined according to the hash value to be retrieved.
- The hash value to be retrieved may further include the information value of the storage item corresponding to the path to be retrieved.
- For example, the last 20 bits of the hash value to be retrieved represent the number of the storage item corresponding to the path to be retrieved. For the example hash value, which ends in ff10 0009, the last 20 bits are 0x00009, so the number of the storage item is 9; that is, the path to be retrieved is stored in the ninth storage item of node 0. This further narrows down the storage location of the path to be retrieved.
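Extracting the storage item number in step S303 is a mask over the low 20 bits of the hash. A minimal sketch, with `item_index` as a hypothetical name:

```python
ITEM_BITS = 20  # the last 20 bits of the hash select the storage item

# The example hash from the text, written as 16 groups of 4 hex digits.
EXAMPLE_HASH = int("".join(
    "ff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 "
    "ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009".split()), 16)


def item_index(hash_value: int) -> int:
    """Return the least significant 20 bits of the hash as the item number."""
    return hash_value & ((1 << ITEM_BITS) - 1)
```

Twenty bits give 2^20 = 1024*1024 possible item numbers, which matches the 1M-entry hash table held by each node.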
- In step S304, the fingerprint database corresponding to each path group in the storage item is obtained.
- The storage item includes at least one path group, and each path group includes a fingerprint database. Because the features of the storage path hash values are recorded in the fingerprint database, obtaining the fingerprint database of each path group in the storage item amounts to obtaining the hash value features of every storage path in every path group of the storage item.
- In step S305, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database and the storage path hash value.
- Determining the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database and the storage path hash value includes: determining a candidate storage path according to the hash value to be retrieved and the fingerprint database corresponding to the path group in the storage item; and determining the storage address corresponding to the path to be retrieved according to the hash value of the candidate storage path and the hash value to be retrieved.
- The candidate storage path determined according to the hash value to be retrieved and the fingerprint database may be one storage path or multiple storage paths.
- The candidate storage paths may be obtained in parallel or serially.
- For example, the candidate storage paths are obtained in parallel. If the storage item includes 3 path groups, each with a fingerprint database, the three fingerprint databases (fingerprint database 1, fingerprint database 2 and fingerprint database 3) are obtained first. The hash value to be retrieved is then compared with the three fingerprint databases respectively to determine whether each path group includes a candidate storage path. If both fingerprint database 2 and fingerprint database 3 include candidate storage paths, candidate storage path 1 corresponding to fingerprint database 2 and candidate storage path 2 corresponding to fingerprint database 3 are extracted for the next step. That is, the hash value to be retrieved is compared with the hash values of candidate storage path 1 and candidate storage path 2 respectively, and the storage address corresponding to the path to be retrieved is finally determined.
- For example, the candidate storage paths are obtained serially. If the storage item includes 3 path groups, each with a fingerprint database, the fingerprint database of each path group is processed in turn. First, fingerprint database 1 corresponding to path group 1 is processed to determine whether there is a candidate storage path in path group 1. If not, processing continues with fingerprint database 2 corresponding to path group 2. If there is, candidate storage path 1 in path group 1 is extracted, and its hash value is compared with the hash value to be retrieved. If the hash value to be retrieved is the same as the hash value of candidate storage path 1, it is determined that candidate storage path 1 is consistent with the path to be retrieved, and the storage address corresponding to the path to be retrieved is obtained; if the hash value to be retrieved is different from the hash value of candidate storage path 1, it is determined that the candidate storage
- By preliminarily screening the fingerprint database corresponding to each path group in the storage item, the path group in which the path to be retrieved may exist is determined first. The hash value of the candidate storage path is then extracted and compared with the hash value to be retrieved in a second screening to determine the storage address corresponding to the path to be retrieved. This two-level hash value processing avoids hash collisions to the greatest extent while still finding the storage address corresponding to the path to be retrieved.
- Determining the candidate storage path according to the hash value to be retrieved and the fingerprint database corresponding to the path group in the storage item includes: performing XOR processing on the hash value to be retrieved and the fingerprint database corresponding to the path group in the storage item to determine the candidate storage path.
- XOR processing compares two numbers and determines the XOR result from the comparison. For example, for two numbers A and B, if A and B are not the same, the XOR result is true (represented by "1"); if A and B are the same, the XOR result is false (represented by "0"). Both the hash value to be retrieved and the fingerprint database can be represented as binary numbers.
- XOR processing the hash value to be retrieved with the fingerprint database yields an XOR result. If the result shows that the partial feature of the hash value to be retrieved is the same as a feature in the fingerprint database, the path group corresponding to that fingerprint database includes a candidate storage path. For example, the fingerprint database includes the features of 16 different storage paths. If the partial feature of the hash value to be retrieved is the same as the feature of the fourth storage path in the fingerprint database, the fourth storage path is a candidate storage path. Determining candidate storage paths by XOR processing speeds up the retrieval of storage paths.
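The XOR screening can be sketched as below: two fingerprints are equal exactly when their bitwise XOR is zero. The representation of a fingerprint database as a list of 16 integers (with `None` for empty positions) and the 1-based slot numbering are assumptions made for illustration.

```python
def fingerprints_match(target_fp: int, stored_fp: int) -> bool:
    """Bitwise XOR is zero exactly when every bit of the two values agrees."""
    return (target_fp ^ stored_fp) == 0


def candidate_slots(target_fp, fingerprint_db):
    """Return the 1-based positions whose stored fingerprint equals target_fp.

    fingerprint_db is a list with one fingerprint per storage path in the
    path group; None marks a position that holds no storage path.
    """
    return [
        slot
        for slot, stored_fp in enumerate(fingerprint_db, start=1)
        if stored_fp is not None and fingerprints_match(target_fp, stored_fp)
    ]
```

An empty result means the path group holds no candidate, so the full hash comparison can be skipped for that group.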
- Determining the storage address corresponding to the path to be retrieved according to the hash value of the candidate storage path and the hash value to be retrieved includes: comparing the hash value to be retrieved with the hash value of the candidate storage path to determine the storage address corresponding to the path to be retrieved.
- If the hash value to be retrieved is the same as the hash value of the candidate storage path, the candidate storage path is the path being searched for, and the storage address corresponding to the path to be retrieved is obtained. If the hash value to be retrieved is different from the hash value of the candidate storage path, the candidate storage path is not the path being searched for, and the hash value to be retrieved needs to be compared with the other candidate storage paths to determine the storage address corresponding to the path to be retrieved.
- Through this comparison, the storage path that is the same as the path to be retrieved can be quickly screened out from the candidate storage paths, and the storage address corresponding to the path to be retrieved can then be determined from that storage path, which speeds up retrieval of the storage address.
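The second screening level, in which each candidate's full storage path hash value is compared with the hash value to be retrieved, might look like the following. The `(stored_hash, address)` pair representation and the name `resolve_address` are hypothetical.

```python
def resolve_address(target_hash, candidates):
    """Return the address of the first candidate whose full hash matches.

    candidates is an iterable of (stored_hash, address) pairs that already
    passed fingerprint screening. A fingerprint covers only part of the hash,
    so a candidate can still fail this full comparison.
    """
    for stored_hash, address in candidates:
        if stored_hash == target_hash:
            return address
    return None  # no candidate matched: the path is not stored here
```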
- In step S306, metadata is obtained from the storage address corresponding to the path to be retrieved.
- Step S306 is the same as step S203 described with reference to FIG. 2, and details are not repeated here.
- In this embodiment, the hash value to be retrieved is determined based on the path to be retrieved, and the path to be retrieved is used to look up the storage path of metadata, which facilitates the processing of the path to be retrieved. The node to be retrieved is determined according to the hash value to be retrieved and the number of preset nodes, which preliminarily screens the node on which the path to be retrieved may be stored and avoids hash conflicts. The storage item corresponding to the path to be retrieved is determined according to the hash value to be retrieved, and the fingerprint database corresponding to each path group in the storage item is obtained, which further processes the hash value to be retrieved and speeds up the retrieval of metadata. The storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database and the storage path hash value, where the fingerprint database is configured to record the features of the storage path hash value, and the hash value of the storage path
- The path group further includes bitmap information. After the candidate storage path is determined according to the hash value to be retrieved and the fingerprint database corresponding to the path group in the storage item in step S305, the method further includes: obtaining, according to the bitmap information, the idle paths among the candidate storage paths, where an idle path corresponds to an idle storage address, and the storage state of an idle storage address is the idle state.
- The bitmap information is configured to identify whether there is an idle path in the path group.
- The bitmap information may be represented by a binary number, each bit of which indicates whether the corresponding storage path in the path group is an idle path.
- For example, bitmap information represented by a 16-bit binary number indicates that the path group includes 16 storage paths. If the 5th bit of the bitmap information is "0", the fifth storage path in the path group is an idle path; if the 8th bit of the bitmap information is "1", the eighth storage path in the path group is not an idle path, that is, metadata is already stored at the storage address corresponding to the eighth storage path.
- Through the bitmap information, the idle paths among the candidate storage paths can be determined quickly and accurately, idle storage addresses can be found easily, and the efficiency of data storage is improved. In addition, the bitmap information clearly shows the storage state of each storage path in the storage item, which facilitates the management of each storage path and improves path management efficiency.
- Acquiring an idle path among the candidate storage paths according to the bitmap information includes: when it is determined that there is an idle path among the candidate storage paths, acquiring that idle path; and when it is determined that there is no idle path among the candidate storage paths, generating a new path group and obtaining an idle path from the new path group.
- The new path group may include at least one storage path, with no metadata stored at the storage address corresponding to that storage path, which increases data storage capacity.
- Using the bitmap information to quickly and accurately locate an idle path among the candidate storage paths in these different cases makes it convenient to store metadata on the idle path later and improves the storage efficiency of metadata.
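The idle-path search can be sketched with plain integer bit operations. Two conventions here are assumptions, chosen to agree with the document's examples: a "1" bit means the storage path is occupied, and the k-th storage path maps to the k-th bit counted from the most significant end of a 16-bit bitmap. The function names are hypothetical.

```python
GROUP_SIZE = 16  # storage paths per path group, as in the 16-bit bitmap example


def slot_mask(slot: int) -> int:
    """Mask for the slot-th storage path (1-based, counted from the MSB)."""
    return 1 << (GROUP_SIZE - slot)


def first_idle_slot(bitmap: int):
    """Return the first slot whose bit is 0 (idle), or None if the group is full."""
    for slot in range(1, GROUP_SIZE + 1):
        if (bitmap & slot_mask(slot)) == 0:
            return slot
    return None


def acquire_idle_slot(bitmaps: list):
    """Find an idle slot across path groups; add a new empty group if all are full.

    bitmaps holds one bitmap per path group and is extended in place.
    Returns (group_index, slot).
    """
    for group_index, bitmap in enumerate(bitmaps):
        slot = first_idle_slot(bitmap)
        if slot is not None:
            return group_index, slot
    bitmaps.append(0)  # new path group: every storage path is idle
    return len(bitmaps) - 1, 1
```

The search only reports an idle slot; marking it occupied is left to the step that actually stores the metadata.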
- The method further includes: acquiring the metadata to be stored; and storing the metadata to be stored at the idle storage address corresponding to the idle path.
- The metadata to be stored is the metadata that is expected to be stored in the persistent memory file system. Storing it at the idle storage address corresponding to the idle path ensures the security of the metadata to be stored; moreover, its retrieval can be accelerated by the retrieval methods in the various embodiments of the present application.
- The storage item further includes the number of path groups. After the metadata to be stored is stored at the idle storage address corresponding to the idle path, the method further includes: updating the number of path groups in the storage item, the bitmap information in the path group corresponding to the idle storage address, and the fingerprint database in the path group corresponding to the idle storage address.
- If the idle path was obtained from a newly generated path group, then after the metadata to be stored is stored at the idle storage address corresponding to the idle path, the number of path groups in the storage item corresponding to the new path group, as well as the bitmap information and fingerprint database in the new path group, need to be updated, which facilitates subsequent searching for the stored metadata.
- If the idle path was obtained when it was determined that an idle path exists among the candidate storage paths, then after the metadata to be stored is stored at the idle storage address corresponding to the idle path, the bitmap information and fingerprint database of the path group corresponding to the candidate storage path need to be updated, but the number of path groups in the storage item does not.
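The bookkeeping after a store can be sketched as follows. All names and the dictionary layout are hypothetical, the fingerprint is assumed to be the top 16 bits of a 256-bit hash, and slots are 0-based with slot 0 mapped to the most significant bitmap bit.

```python
GROUP_SIZE = 16
FP_BITS = 16     # assumption: fingerprint = top 16 bits of the hash
HASH_BITS = 256


def new_group():
    return {
        "bitmap": 0,
        "fingerprints": [None] * GROUP_SIZE,
        "hashes": [None] * GROUP_SIZE,
        "metadata": [None] * GROUP_SIZE,
    }


def idle_slot(group):
    """Return the first 0-based slot whose bitmap bit is 0, or None."""
    for slot in range(GROUP_SIZE):
        if not group["bitmap"] & (1 << (GROUP_SIZE - 1 - slot)):
            return slot
    return None


def store(item, path_hash, metadata):
    """Store metadata in item = {"group_count": int, "groups": [...]}."""
    for group in item["groups"]:
        slot = idle_slot(group)
        if slot is not None:
            break
    else:
        # No idle path anywhere: generate a new path group.
        group, slot = new_group(), 0
        item["groups"].append(group)
        item["group_count"] += 1  # the count changes only in this branch
    group["hashes"][slot] = path_hash
    group["metadata"][slot] = metadata
    group["fingerprints"][slot] = path_hash >> (HASH_BITS - FP_BITS)
    group["bitmap"] |= 1 << (GROUP_SIZE - 1 - slot)  # mark the slot occupied
    return group, slot
```

Both branches update the bitmap and the fingerprint database of the affected path group, while the group count is touched only when a new group is created, which mirrors the two cases described above.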
- FIG. 4 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- the method for retrieving persistent memory file system metadata provided by the present application may include the following steps S401 to S405.
- In step S401, the hash value to be retrieved is determined based on the to-be-retrieved path.
- In step S402, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database and the storage path hash value.
- In step S403, metadata is obtained from the storage address corresponding to the path to be retrieved.
- steps S401 to S403 are the same as steps S201 to S203 described with reference to FIG. 2 , and are not repeated here.
- In step S404, the metadata in the storage address corresponding to the to-be-retrieved path and the corresponding storage path hash value are deleted to obtain the deleted path group.
- For example, if the to-be-retrieved path is the third storage path of the second path group, the storage path hash value corresponding to the third storage path needs to be deleted, and the metadata in the storage address corresponding to the third storage path is deleted, to obtain the updated second path group (that is, the path group after the deletion of the third storage path).
- In step S405, the fingerprint database and bitmap information corresponding to the deleted path group are updated.
- Both the fingerprint database and the bitmap information include relevant information about the storage path. After the metadata in a certain storage path and the corresponding storage path hash value are deleted, the fingerprint database and bitmap information corresponding to the deleted path group need to be updated synchronously to avoid errors in the stored information.
- For example, the bitmap and fingerprint database corresponding to the second path group need to be updated. If the bitmap of the second path group is 0xffdf (that is, 1111111111011111 in binary), then after the update the bitmap of the second path group in the ninth item of the hash table of node 0 is 0xdfdf (that is, 1101111111011111 in binary).
- Screening the storage paths through the fingerprint database first quickly narrows the search to the candidate storage paths, which speeds up retrieval and improves retrieval efficiency. Obtaining metadata from the storage address corresponding to the path to be retrieved ensures the accuracy of the retrieval result. After the metadata is obtained, deleting the metadata in the storage address corresponding to the to-be-retrieved path together with its storage path hash value yields the deleted path group, which prevents unneeded metadata from occupying storage resources and improves their utilization. Updating the fingerprint database and bitmap information corresponding to the deleted path group avoids errors in the stored information and ensures that the desired metadata and its storage path can be obtained quickly and accurately when metadata is processed subsequently.
- FIG. 5 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- the method for retrieving metadata of a persistent memory file system may include the following steps S501 to S505.
- In step S501, the hash value to be retrieved is determined based on the to-be-retrieved path, the to-be-retrieved path being the storage path used to look up metadata.
- In step S502, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database and the storage path hash value.
- In step S503, metadata is obtained from the storage address corresponding to the path to be retrieved.
- steps S501 to S503 are the same as steps S201 to S203 described with reference to FIG. 2 , and details are not repeated here.
- In step S504, the information to be modified is obtained.
- the to-be-modified information includes to-be-modified directory metadata and to-be-modified file metadata.
- the to-be-modified directory metadata includes any one or more of the number of directory entries in the preset directory, the name of each directory entry, and the type of each directory entry.
- the to-be-modified file metadata includes but is not limited to: the last modification time of the preset file, the file size of the preset file, and, where the preset file includes multiple file data segments, the number of file data segments and the starting address and length of each file data segment.
- In step S505, the metadata in the storage address corresponding to the path to be retrieved is modified according to the information to be modified.
- For example, the metadata in the storage address corresponding to the path to be retrieved can be modified according to the last modification time of the preset file (for example, the last modification time of the preset file is updated). This ensures that the metadata stored in the persistent memory file system is updated in real time and guarantees the authenticity and reliability of the metadata.
- Screening the characteristics of the storage path hash value through the fingerprint database first speeds up retrieval and improves retrieval efficiency; obtaining metadata from the storage address corresponding to the path to be retrieved ensures the accuracy of the retrieval result; and when the metadata needs to be modified, after the information to be modified corresponding to the metadata is obtained, the metadata in the storage address corresponding to the path to be retrieved is modified according to that information, ensuring the authenticity and reliability of the metadata.
- FIG. 6 is a block diagram showing the composition of the storage structure of persistent memory file system metadata provided by the present application.
- the storage structure can be applied to the retrieval method of persistent memory file system metadata in the present application.
- the storage structure includes the following data bits: a storage address data bit 601, configured to record the storage address, where the storage address is configured to store metadata and corresponds to a storage path; a storage path hash value data bit 602, configured to record the storage path hash value, where the storage path hash value is a hash value determined based on the storage path and is configured to record the characteristics of the storage path; and a fingerprint database data bit 603, configured to record the fingerprint database, where the fingerprint database is configured to record the characteristics of the storage path hash value.
- the storage structure further includes: a storage node, the storage node includes storage items, the storage items include at least one path group, and each path group includes fingerprint database data bits 603 .
- FIG. 7 shows the composition block diagram of the storage structure of each node in the present application.
- four nodes are shown, each of which stores a hash table (i.e., the hash table of node 1, the hash table of node 2, the hash table of node 3, and the hash table of node 4), and each hash table stores different storage items.
- the hash table of node 1 stores 6 storage items, and the amount of metadata stored in each storage item is also different (for example, the first storage item in the hash table of node 1 stores 3 metadata, the second storage item stores 2 metadata, ..., the sixth storage item stores 1 metadata).
- the hash tables of different nodes are independent of each other, which can avoid hash conflicts between nodes. Moreover, each node's hash table stores multiple different storage items, and the storage item of the node that corresponds to the hash value to be retrieved can be found quickly through that hash value, which improves the retrieval speed of metadata.
- the path group further includes: bitmap information data bits configured to record bitmap information, the bitmap information being configured to identify whether there is an idle path in the path group.
- the bitmap information is used to first determine whether there is an idle path in the path group, which is convenient for storing metadata to be stored, accelerates the storage speed of data, and ensures that metadata to be stored can be quickly stored in an appropriate storage path.
- the storage address is recorded by the storage address data bit; the storage address is configured to store metadata and corresponds to the storage path, which facilitates processing of the path to be retrieved.
- the fingerprint database is recorded by the fingerprint database data bit and is configured to record the characteristics of the storage path hash value; screening the characteristics of the storage path hash value through the fingerprint database first yields candidate storage paths and speeds up retrieval.
- the storage path hash value is recorded by the storage path hash value data bit; it is a hash value determined based on the storage path and is configured to record the characteristics of the storage path. Screening a second time with the storage path hash value further improves retrieval speed and efficiency while guaranteeing the accuracy of retrieval results.
- FIG. 8 shows a block diagram of the storage structure of the persistent memory file system metadata provided by the present application.
- the storage structure includes a path group, and the path group includes a path group number, a bitmap, a fingerprint library, and multiple storage path hash values with their corresponding metadata.
- the bitmap is configured to identify whether there is an idle path in the path group; the fingerprint library is configured to record the characteristics of the storage path hash value; and the storage path hash value, which is a hash value determined based on the storage path, is configured to record the characteristics of the storage path.
- the storage path hash value can be expressed as a 256-bit binary number.
- the binary storage path hash value can be converted into a hexadecimal representation, with every 16 bits forming one grid (cell), for a total of 16 grids.
- 16 sets of storage path hash values and their corresponding metadata can be stored in the path group.
- the 16 storage path hash values and their stored metadata may include: storage path 1 hash value and storage path 1 metadata, storage path 2 hash value and storage path 2 metadata, ..., storage path 16 hash value and storage path 16 metadata.
- the hash value of storage path 1 is the hash value determined based on storage path 1; the hash value of storage path 2 is the hash value determined based on storage path 2; ...; the hash value of storage path 16 is the hash value determined based on storage path 16.
- the bitmap may also be represented by 16-bit binary numbers, and each binary number is configured to represent whether each storage path in the path group is an idle path.
- the xth bit in the bitmap represents whether the xth storage path in the path group is an idle path, that is, whether the storage address corresponding to the xth storage path is idle: "0" indicates the idle state and "1" indicates the occupied (stored) state, where x is an integer greater than or equal to 0 and less than or equal to 15.
- the amount of metadata actually stored in the path group can be quickly obtained by the number of "1"s present in the bitmap. If some metadata is deleted, the binary number in the bitmap corresponding to the deleted metadata needs to be updated to "0".
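The bitmap operations described above can be illustrated with a few helpers. This is a sketch under one assumption that the text does not state explicitly: storage path x (0 to 15) is taken to be the x-th bit counted from the most significant bit, which is the reading consistent with the 0xffdf to 0xdfdf example for the third storage path. The function names are hypothetical.

```python
SLOTS = 16  # one bit per storage path in the path group

def mask(x):
    """Bit mask for storage path x (0-based, counted from the most significant bit)."""
    if not 0 <= x < SLOTS:
        raise ValueError("x must be between 0 and 15")
    return 1 << (SLOTS - 1 - x)

def is_free(bitmap, x):
    """True if storage path x is idle ("0" in the bitmap)."""
    return bitmap & mask(x) == 0

def used_count(bitmap):
    """Number of stored metadata entries, i.e. the number of "1" bits."""
    return bin(bitmap & 0xFFFF).count("1")

def first_free(bitmap):
    """Index of the first idle storage path, or None if the group is full."""
    for x in range(SLOTS):
        if is_free(bitmap, x):
            return x
    return None

def mark_free(bitmap, x):
    """Return the bitmap with storage path x reset to "0" after a deletion."""
    return bitmap & ~mask(x) & 0xFFFF

before = 0xFFDF
after = mark_free(before, 2)   # third storage path has 0-based index 2
print(hex(after))              # prints 0xdfdf
```

Counting the set bits gives the number of entries without scanning the stored hash values, which is the shortcut the paragraph above refers to.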
- the fingerprint database can also be represented by a 256-bit binary number in which every 16 bits form a group, each group being configured to identify a feature of the hash value of one storage path in the path group, so that the fingerprint database can store the features of the hash values of 16 storage paths.
- If there is only one candidate storage path, the hash value corresponding to that candidate storage path can be extracted directly to obtain the metadata corresponding to the candidate storage path. If there are multiple candidate storage paths, it is also necessary to compare the hash value corresponding to each candidate storage path with the hash value to be retrieved one by one to determine the storage path corresponding to the hash value to be retrieved, and then obtain the metadata corresponding to the hash value to be retrieved.
- a plurality of storage path hash values and their corresponding metadata are stored, and it can be quickly determined whether each storage path in the path group is an idle path through the bitmap information.
- the free path can be quickly located to speed up the storage of metadata.
- the fingerprint library is used to record the characteristics of the storage path hash value, and the storage path hash value is configured to record the characteristics of the storage path.
- the fingerprint library is screened first to obtain candidate storage paths, and the path to be retrieved is then located among the candidate storage paths, which improves retrieval speed and efficiency while ensuring the accuracy of retrieval results.
- FIG. 9 shows a block diagram of the composition of the device for retrieving metadata of persistent memory file system provided by the present application.
- the device includes: a hash value determination module 901, configured to determine the hash value to be retrieved based on the path to be retrieved, the path to be retrieved being the storage path used to look up metadata; an address determination module 902, configured to determine the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database and the storage path hash value, where the fingerprint database is configured to record the characteristics of the storage path hash value, and the storage path hash value is the hash value corresponding to the storage path of the metadata; and an obtaining module 903, configured to obtain the metadata from the storage address corresponding to the path to be retrieved.
- the hash value determination module determines the hash value to be retrieved based on the to-be-retrieved path, which facilitates processing of the to-be-retrieved path. The address determination module determines the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database and the storage path hash value: the storage addresses are first filtered using the hash value to be retrieved and the fingerprint database, and the storage address is then determined using the hash value to be retrieved and the storage path hash value, which speeds up retrieval, improves retrieval efficiency, and ensures the accuracy of the retrieval results.
- FIG. 10 shows a block diagram of the composition of the apparatus for retrieving metadata of a persistent memory file system provided by the present application.
- the retrieval system includes: a path module 1010 , a secondary hash index module 1020 and a path group module 1030 .
- the path group module 1030 includes a fingerprint library module 1031 and a metadata module 1032 .
- the path module 1010 is configured to obtain a storage path, the storage path corresponds to a storage address, and the storage address is configured to store metadata.
- For example, the storage path is "/root/a/b" or "/root/a/out/a.out", and the storage address is the address in the persistent memory file system that corresponds to "/root/a/b" or "/root/a/out/a.out". Because storage paths have a complex format and different lengths, they are inconvenient to manage directly.
- A storage path hash value computed from the storage path can uniquely identify the storage path, and the storage address of the metadata can then be uniquely determined according to the storage path hash value, which facilitates the management of storage paths and their corresponding metadata and improves the management efficiency of storage paths.
- the secondary hash index module 1020 is configured to store two levels of storage formats, namely, a node-level storage format and a hash table-level storage format.
- the first 32 binary digits of the storage path hash value (denoted hashUnique) are used, together with the number of preset nodes (countNode) in the persistent memory file system, to determine the node to be retrieved.
- Specifically, the first 32 binary digits of hashUnique are extracted and converted into a decimal number (data1), and the remainder of data1 divided by countNode gives the node to be retrieved. If the node to be retrieved is in the persistent memory file system, it can be determined that further data retrieval is performed in the node to be retrieved.
- the last 20 binary digits in hashUnique are used to represent the number corresponding to the storage item (for example, the jth storage item in the hash table of the i-th preset node, where i and j are integers greater than or equal to 1), wherein the storage item is a storage item in the hash table of the node to be retrieved in the persistent memory file system.
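The two-level index described above can be sketched as follows. This is an illustration only: the function name `locate` is hypothetical, the "first" bits are assumed to be the most significant bits of the digest, and SHA-256 is used because the description names it as the hash algorithm in step S1320.

```python
import hashlib
from typing import Tuple

COUNT_NODE = 4    # number of preset nodes (countNode); 4 as in the later example
ITEM_BITS = 20    # the last 20 bits select the storage item (2**20 = 1048576 items)

def locate(path: str, count_node: int = COUNT_NODE) -> Tuple[int, int]:
    """Return (node index, storage item index) for a storage path."""
    hash_unique = hashlib.sha256(path.encode("utf-8")).digest()  # 32 bytes = 256 bits
    data1 = int.from_bytes(hash_unique[:4], "big")   # first 32 bits as an integer
    node = data1 % count_node                        # remainder selects the node
    value = int.from_bytes(hash_unique, "big")
    item = value & ((1 << ITEM_BITS) - 1)            # last 20 bits select the item
    return node, item

node, item = locate("/root/a/b")
```

Because the node is chosen from the leading bits and the storage item from the trailing bits, the two levels use disjoint parts of the same hash value.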
- By using the secondary hash index module to store the metadata, the frequent occurrence of hash collisions can be avoided, and the hash collisions among the various storage items can be reduced.
- the path group module 1030 is configured to obtain each data bit in the fingerprint library through the fingerprint library module 1031, and to determine candidate storage paths according to the fingerprint library and the hash value to be retrieved, wherein there may be one or more candidate storage paths. Further, the storage path hash values of the candidate storage paths are compared with the hash value to be retrieved, and the storage path of the metadata is determined from the matching hash value. Metadata is stored using the metadata module 1032.
- the metadata includes directory metadata and file metadata.
- the directory metadata includes any one or more of the number of directory entries in the preset directory, the name of each directory entry, and the type of each directory entry.
- the file metadata includes but is not limited to: the last modification time of the preset file, the file size of the preset file, and, where the preset file includes multiple file data segments, the number of file data segments and the starting address and length of each file data segment.
- Metadata can be fixed-length data or variable-length data. The above metadata is only for illustration, and can be set according to the situation. Other unexplained metadata are also within the protection scope of this application, and will not be repeated here.
- each preset node stores a hash table, where the hash table includes N path groups, where N is an integer greater than or equal to 0.
- Each path group includes: path group number, bitmap, fingerprint library, and 16 stored path hash values.
- Using the path group module to manage the storage paths corresponding to the metadata can improve the search efficiency of the metadata.
- FIG. 11 shows a schematic flowchart of the method for acquiring each data bit of the fingerprint database provided by the present application.
- the fingerprint library module 1031 can obtain each data bit of the fingerprint library by using the acquisition method shown in FIG. 11 .
- each storage path hash value is a 256-bit binary number.
- the binary storage path hash value can be converted into a hexadecimal representation, with every 16 bits forming one cell, for a total of 16 cells. The 1st cell (i.e., bits 0-15) of the 1st storage path hash value is extracted as the 1st cell (i.e., bits 0-15) of the fingerprint library; the 2nd cell (i.e., bits 16-31) of the 2nd storage path hash value is extracted as the 2nd cell (i.e., bits 16-31) of the fingerprint library; ...; and the 16th cell (i.e., bits 240-255) of the 16th storage path hash value is extracted as the 16th cell (i.e., bits 240-255) of the fingerprint library, thereby generating the fingerprint library of the path group.
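The construction of the fingerprint library can be sketched as follows: cell i of the fingerprint is cell i of the i-th storage path hash value. The helper names and the sample paths are hypothetical, and bit 0 is assumed to be the leftmost bit of the digest.

```python
import hashlib

CELLS = 16        # cells per 256-bit value
CELL_BYTES = 2    # 16 bits per cell

def cell(value: bytes, i: int) -> bytes:
    """Return the i-th 16-bit cell (0-based) of a 32-byte value."""
    return value[i * CELL_BYTES:(i + 1) * CELL_BYTES]

def build_fingerprint(hashes):
    """Concatenate cell i of the i-th storage path hash value, for i = 0..15."""
    if len(hashes) != CELLS:
        raise ValueError("this sketch expects a full path group of 16 hashes")
    return b"".join(cell(h, i) for i, h in enumerate(hashes))

# Hypothetical storage paths, used only to produce 16 sample hash values.
hashes = [hashlib.sha256(f"/root/a/file{i}".encode("utf-8")).digest()
          for i in range(CELLS)]
fingerprint = build_fingerprint(hashes)
```

Taking a different cell from each hash keeps the fingerprint at 256 bits while still giving every storage path its own 16-bit feature.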
- the storage path is indexed by a linked-list storage structure, and the corresponding indexing efficiency is low.
- In contrast, the storage structure in the present application includes preset nodes, each node stores storage items, and each storage item includes storage path hash values and their corresponding metadata.
- the storage path is obtained through the path module, and the metadata is stored by using the node-level storage format and the hash table-level storage format in the secondary hash index module, which can avoid the frequent occurrence of hash collisions and reduce hash collisions between storage items.
- Each data bit in the fingerprint database is obtained through the fingerprint database module in the path group module, and the candidate storage paths are determined according to the fingerprint database and the hash value to be retrieved; the storage path hash values of the candidate storage paths are then compared with the hash value to be retrieved to determine the storage path of the metadata.
- the storage structure managed by the path group module can be used to improve the search efficiency of the metadata.
- FIG. 12 shows a schematic diagram of the storage format of the hash table of each preset node of the persistent memory file system provided by the present application.
- the persistent memory file system includes 4 preset nodes, namely node 0, node 1, node 2 and node 3.
- Each node stores a hash table
- each hash table includes 1048576 (1024*1024) storage items
- each storage item includes at least one path group.
- the ninth storage item of node 0 includes 3 path groups, i.e., path group 1, path group 2, and path group 3.
- the ninth storage item stores 43 storage paths.
- path group 1 stores metadata in 16 different storage paths
- path group 2 stores metadata in 16 different storage paths
- path group 3 stores metadata in 11 different storage paths.
- FIG. 13 shows a schematic flowchart of a method for retrieving metadata of a persistent memory file system provided by the present application.
- the retrieval method includes the following steps S1310 to S1350.
- In step S1310, a retrieval instruction of metadata is obtained.
- the retrieval instruction includes a path to be retrieved, and the metadata stored under the path to be retrieved may be file metadata or directory metadata.
- For example, the path to be retrieved is "/test/fs_for_install/dpfs-2.0/fs/a.out".
- In step S1320, the path module 1010 performs a hash operation on the path to be retrieved using the SHA256 algorithm to obtain the hash value of the path to be retrieved.
- the hash value of the path to be retrieved may be represented by a 256-bit binary number, for example, the hash value of the path to be retrieved is marked as hashUnique. For readability, the 256-bit binary number can be converted to a hexadecimal number.
- Table 1 shows the structure of the hash value of the path to be retrieved. As shown in Table 1, the hash value of the path to be retrieved includes 16 cells, and each cell represents a 16-bit binary number (or, equivalently, the 4-digit hexadecimal number corresponding to that 16-bit binary number).
- Table 1 The structure of the hash value of the path to be retrieved
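The 16-cell layout described for Table 1 can be reproduced for any path with a short sketch. The function name `hash_cells` is hypothetical; the contents of Table 1 itself are not reproduced here, and no particular cell values are claimed.

```python
import hashlib

def hash_cells(path: str) -> list:
    """Split the SHA-256 hash of a path into 16 cells of 4 hexadecimal digits each."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()   # 256 bits
    return [digest[i:i + 2].hex() for i in range(0, 32, 2)]  # 16 bits per cell

cells = hash_cells("/test/fs_for_install/dpfs-2.0/fs/a.out")
for index, value in enumerate(cells, start=1):
    print(f"cell {index:2d}: 0x{value}")
```

Each printed line corresponds to one cell of the table layout, so a 256-bit hash value always yields exactly 16 rows.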
- In step S1330, the secondary hash index module 1020 is used to analyze the hash value to be retrieved, and it is determined that the path to be retrieved is located in the a-th storage item in the hash table of the i-th node.
- For example, the persistent memory file system includes 4 preset nodes, that is, the preset number of nodes is 4.
- the secondary hash index module 1020 is used to obtain the last 20 binary digits of hashUnique (for example, 0x0009), which are then converted into a decimal number a (for example, a is equal to 9); it is finally determined that the path to be retrieved is located in the ninth storage item in the hash table of node 0.
- In a storage method in which a plurality of nodes share one hash table to store metadata, different sections of the hash table correspond to different nodes, and there may be multiple storage paths corresponding to the calculated hash value to be retrieved, so the retrieval efficiency may be low.
- In the present application, each node stores its own hash table, which avoids the frequent occurrence of hash collisions; and the secondary hash index module processes the hash value to be retrieved, which improves the retrieval efficiency for the to-be-retrieved path so that the metadata corresponding to the path to be retrieved is obtained efficiently.
- In step S1340, the hash value of the path to be retrieved is compared with the fingerprint database in each path group in the ninth storage item in the hash table of node 0, and the storage address corresponding to the path to be retrieved is determined.
- Through this comparison, candidate storage paths can be obtained, which may include one storage path or multiple storage paths; the storage path hash values of the candidate storage paths are then compared with the hash value to be retrieved to determine the storage address corresponding to the to-be-retrieved path.
- FIG. 14 shows a schematic structural diagram of a method for obtaining a storage address corresponding to a path to be retrieved provided by the present application.
- In the first step, the fingerprint database of a certain path group is compared with the hash value to be retrieved, and it can be determined that the fourth grid is the same (that is, the fourth segment feature of the storage path hash value corresponding to the fourth storage path in the path group is the same as the fourth segment feature of the hash value to be retrieved). Then, in the second step, the storage path hash value corresponding to the fourth storage path is extracted and compared with the hash value to be retrieved; if the two hash values are exactly the same, the fourth storage path is the path to be retrieved.
- In step S1350, metadata is obtained from the storage address corresponding to the path to be retrieved.
- the storage location corresponding to the metadata can be accurately determined, and the metadata can then be obtained from the storage address corresponding to the path to be retrieved, which ensures the accuracy of the retrieved metadata.
- the hash value to be retrieved is analyzed by the secondary hash index module 1020, and it is determined that the path to be retrieved is located in the a-th storage item in the hash table of the i-th node; this refined analysis of the hash value to be retrieved can significantly reduce hash collisions.
- Taking ten nodes as an example, let the probability of a hash collision when the retrieval method in this application is used be the first probability, and let the probability of a hash collision when metadata is retrieved through a single hash table shared by multiple nodes be the second probability; the first probability is only 10% of the second probability, which greatly reduces hash collisions.
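One way to read the 10% figure is sketched below. The model is an assumption for illustration and is not stated in the description: n storage paths hash uniformly, every hash table has S storage items, there are M nodes, and (n − 1)/S is small enough that the expected number of colliding paths approximates the collision probability. With one shared table, all n paths compete for the same S items; with one table per node, a path only competes with the roughly n/M paths assigned to the same node.

```latex
\[
P_2 \approx \frac{n-1}{S}, \qquad
P_1 \approx \frac{n-1}{M\,S}, \qquad
\frac{P_1}{P_2} \approx \frac{1}{M} = \frac{1}{10} = 10\% \quad (M = 10).
\]
```

Here P_1 denotes the first probability (per-node hash tables) and P_2 the second probability (one shared hash table).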
- step S1340 may obtain the storage address corresponding to the path to be retrieved in the following manner.
- Fig. 15 shows a schematic flowchart of the method for obtaining a storage address corresponding to a path to be retrieved according to the present application. As shown in FIG. 15, step S1340 may include the following steps S1341 to S1346.
- In step S1341, the number of path groups (GroupNum) in the ninth entry in the hash table of node 0 is read.
- In step S1342, the fingerprint database in path group k is extracted to obtain the kth fingerprint database.
- k is an integer greater than or equal to 0 and less than or equal to 2.
- Table 2 shows the structure of the first fingerprint database corresponding to the path group 1 in the ninth entry in the hash table of node 0.
- Table 3 shows the structure of the second fingerprint database corresponding to the path group 2 in the ninth item of the hash table of node 0.
- Table 2 The structure of the first fingerprint database corresponding to the path group 1 in the ninth item in the hash table of node 0
- Table 3 The structure of the second fingerprint database corresponding to the path group 2 in the ninth item of the hash table of node 0
- In step S1343, a bitwise XOR operation is performed on the kth fingerprint database and the hash value of the path to be retrieved to obtain the kth XOR result.
- the k-th XOR result can be represented as a 256-bit binary number.
- Table 4 shows the first XOR result
- Table 5 shows the second XOR result
- In step S1344, it is judged whether any of the 16 cells of the k-th XOR result has data bits that are all 0.
- If it is determined that a cell whose data bits are all 0 exists among the 16 cells of the k-th XOR result, step S1345 is executed; if it is determined that no such cell exists among the 16 cells of the k-th XOR result, the process returns to step S1342.
- In step S1345, the storage path hash value corresponding to the all-0 data bits in the k-th XOR result is extracted, and it is compared whether this storage path hash value is the same as the hash value of the path to be retrieved.
- Table 6 shows the storage path hash value corresponding to the third storage path of the ninth item of the second path group in the hash table of node 0.
- Table 6 The storage path hash value corresponding to the third storage path of the second path group of the ninth item of the hash table of node 0
- The hash value of the storage path in Table 6 is compared with the hash value of the path to be retrieved in Table 1. If it is determined that the storage path hash value is the same as the hash value of the path to be retrieved, step S1346 is performed; if the storage path hash value is different from the hash value of the path to be retrieved, step S1342 is executed.
- In step S1346, the storage address corresponding to the path to be retrieved is obtained.
- Through step S1345, it can be determined that the hash value of the path to be retrieved (that is, the hash value shown in Table 1) is completely consistent with the storage path hash value (that is, the hash value shown in Table 6) corresponding to the third storage path of the second path group in the ninth item of the hash table of node 0. The to-be-retrieved path has therefore been found: it is the third storage path of the second path group of the ninth item of the hash table of node 0.
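Steps S1341 to S1346 above can be sketched as a loop over the path groups of one storage item. The dictionary layout, the helper names `sha`, `make_group` and `find_path`, and the sample paths other than the one taken from the description are assumptions for illustration; indices in the code are 0-based, so the third storage path of the second path group appears as (1, 2).

```python
import hashlib

CELLS, CELL_BYTES = 16, 2

def sha(path):
    """SHA-256 hash of a path as 32 bytes."""
    return hashlib.sha256(path.encode("utf-8")).digest()

def make_group(paths):
    """Build a full path group from 16 paths (hypothetical helper for the demo)."""
    hashes = [sha(p) for p in paths]
    fingerprint = b"".join(h[i * CELL_BYTES:(i + 1) * CELL_BYTES]
                           for i, h in enumerate(hashes))
    return {"fingerprint": fingerprint, "hashes": hashes}

def find_path(groups, query):
    """Return (group index, slot index) of the stored hash equal to query, or None."""
    for k, group in enumerate(groups):                       # S1341, S1342
        xor = bytes(a ^ b for a, b in zip(group["fingerprint"], query))  # S1343
        for j in range(CELLS):                               # S1344: look for all-0 cells
            all_zero = xor[j * CELL_BYTES:(j + 1) * CELL_BYTES] == bytes(CELL_BYTES)
            if all_zero and group["hashes"][j] == query:     # S1345: full comparison
                return k, j                                  # S1346
    return None

target = "/test/fs_for_install/dpfs-2.0/fs/a.out"
group1 = make_group([f"/g1/p{i}" for i in range(CELLS)])
paths2 = [f"/g2/p{i}" for i in range(CELLS)]
paths2[2] = target                      # third storage path of the second path group
group2 = make_group(paths2)
result = find_path([group1, group2], sha(target))
print(result)                           # prints (1, 2)
```

The XOR against the fingerprint touches only 256 bits per path group, and the full 256-bit comparison runs only for cells that already matched, which is where the speed-up described above comes from.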
- the retrieval method further includes: obtaining the information to be modified (for example, the last access time of a file is 12:01 on January 1, 2019); and modifying the metadata in the storage address corresponding to the path to be retrieved according to the information to be modified (for example, updating the last access time of a certain file in the storage address corresponding to the path to be retrieved, such as updating the metadata corresponding to the third storage path of the second path group of the ninth item of the hash table of node 0).
- the metadata can be updated in real time and the accuracy of the metadata obtained by the user is guaranteed.
- FIG. 16 shows a schematic flowchart of a method for deleting metadata in a persistent memory file system provided by the present application.
- The method for deleting metadata in a persistent memory file system includes the following steps S1601 to S1606.
- In step S1601, a retrieval instruction for metadata is obtained.
- In step S1602, the path module 1010 performs a hash operation on the path to be retrieved using the SHA256 algorithm to obtain the hash value of the path to be retrieved.
- In step S1603, the secondary hash index module 1020 analyzes the hash value to be retrieved and determines that the path to be retrieved is located in the a-th storage item in the hash table of the i-th node.
- In step S1604, the hash value of the path to be retrieved is compared with the fingerprint database of each path group in the ninth storage item in the hash table of node 0, and the storage address corresponding to the path to be retrieved is determined.
- Steps S1601 to S1604 are the same as steps S1310 to S1340 described with reference to FIG. 13 and are not repeated here.
- In step S1605, the storage path hash value corresponding to the path to be retrieved and the metadata corresponding to that storage path hash value are deleted.
- For example, if step S1604 determines that the path to be retrieved is the third storage path of the second path group in the ninth item of the hash table of node 0, the storage path hash value corresponding to the third storage path needs to be deleted, and the metadata at the storage address corresponding to the third storage path is deleted as well.
- In step S1606, the bitmap and fingerprint database of the path group corresponding to the path to be retrieved are modified.
- For example, the bitmap and fingerprint database of the second path group need to be modified.
- The 3rd bit in the bitmap of the second path group may be set to "0" to indicate that the third storage path of the second path group of the ninth item of the hash table of node 0 is a free path. For example, if the bitmap of the second path group is 0xffdf (1111111111011111 in binary), then the modified bitmap of the second path group of the ninth item of the hash table of node 0 is 0xdfdf (1101111111011111 in binary).
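The bitmap update in step S1606 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, bit positions are counted from the left starting at 1 (which is what the 0xffdf to 0xdfdf example implies), and zeroing the matching fingerprint cell follows the delete flow described for FIG. 4 (bits 32 to 47 set to "0" for the third cell).

```python
CELLS = 16       # storage paths per path group
CELL_BITS = 16   # bits per fingerprint cell


def clear_bitmap_slot(bitmap: int, slot: int) -> int:
    """Mark storage path `slot` (1-based, counted from the left) as free."""
    return bitmap & ~(1 << (CELLS - slot)) & 0xFFFF


def clear_fingerprint_cell(fingerprint: int, slot: int) -> int:
    """Zero the 16-bit cell of the 256-bit fingerprint that belongs to `slot`."""
    shift = (CELLS - slot) * CELL_BITS
    return fingerprint & ~(0xFFFF << shift)


# Hypothetical usage: free the 3rd storage path of a group whose bitmap is 0xffdf.
print(hex(clear_bitmap_slot(0xFFDF, 3)))  # 0xdfdf
```

Keeping the two updates in separate helpers mirrors the text, which treats the bitmap and the fingerprint database as two fields of the same path group.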
- Table 7 The updated fingerprint library of the second path group of the ninth item of the hash table of node 0
- In this embodiment, hashing the path to be retrieved improves the efficiency with which that path is processed;
- the secondary hash index module analyzes the hash value to be retrieved to determine that the path to be retrieved is located in the a-th storage item in the hash table of the i-th node, which reduces hash collisions;
- comparing the hash value of the path to be retrieved with the fingerprint database in each path group to determine the storage address corresponding to the path to be retrieved speeds up metadata retrieval and improves retrieval efficiency; deleting the metadata at the storage address corresponding to the path to be retrieved, together with its storage path hash value, yields the updated path group, avoids unnecessary metadata occupying storage resources, and improves the utilization of storage resources; and modifying the bitmap and fingerprint database of the path group corresponding to the path to be retrieved avoids errors in the stored information, so that the desired metadata and its storage path can be obtained quickly and accurately when metadata is subsequently processed.
- FIG. 17 shows a schematic flowchart of a method for adding metadata in a persistent memory file system provided by the present application. As shown in FIG. 17, in an exemplary embodiment, the method for adding metadata in a persistent memory file system includes the following steps S1701 to S1707.
- In step S1701, an instruction for adding metadata is obtained.
- The adding instruction includes a path to be stored; the metadata stored under the path to be stored may be file metadata or directory metadata.
- For example, the path to be stored is "/test/fs_for_install/dpfs-2.0/fs/dpfs/scripts/local".
- In step S1702, the path module 1010 performs a hash operation on the path to be stored using the SHA256 algorithm to obtain the hash value of the path to be stored.
- The hash value of the path to be stored can be represented by a 256-bit binary number. For ease of reading, the 256-bit binary number can be converted into hexadecimal.
- Table 8 shows the structure of the hash value of the path to be stored. As shown in Table 8, the hash value of the path to be stored includes 16 cells, and each cell represents a 4-digit hexadecimal number.
- Table 8 The structure of the hash value to be stored
- In step S1703, the secondary hash index module 1020 analyzes the hash value to be stored and determines that the path to be stored is located in the a-th storage item in the hash table of the i-th node.
- For example, the persistent memory file system includes 4 preset nodes; in other words, the number of preset nodes is 4.
- The secondary hash index module 1020 first obtains the first 32 binary digits of the hash value to be stored (for example, 0xff2fff30), converts them into a decimal number (4281335600), and takes the remainder of that decimal number divided by the number of preset nodes (4) to obtain the remainder i (here i equals 0), which determines that the path to be stored is located in node 0.
- The secondary hash index module 1020 then obtains the last 20 binary digits of the hash value to be stored (for example, 0x0009), converts them into a decimal number a (for example, a equals 9), and finally determines that the path to be stored is located in the ninth storage item in the hash table of node 0.
- Because each node stores its own hash table, frequent hash collisions are avoided; and because the secondary hash index module 1020 processes the hash value to be stored, the path to be stored is located more efficiently, and the metadata corresponding to the path to be stored can be obtained efficiently.
- In step S1704, according to the bitmap information of each path group of the a-th storage item, it is determined whether there is a free path in the a-th storage item in the hash table of the i-th node.
- If it is determined that there is no free path in the a-th storage item in the hash table of the i-th node, step S1705 is performed; if it is determined that there is a free path, step S1706 is performed.
- The bitmap information of each path group of the a-th storage item may differ.
- For example, if the bitmap information of the first path group of the a-th storage item is 0xffff (1111111111111111 in binary), there is no free path in the first path group of the a-th storage item; if the bitmap information of the second path group of the a-th storage item is 0xffdf (1111 1111 1101 1111 in binary), the 11th storage path in the second path group is a free path, and step S1706 is performed.
- As another example, if traversing the bitmap information of every path group of the a-th storage item finds no free path, it is determined that step S1705 needs to be performed.
- In step S1705, a new path group is created in the a-th storage item, the first storage path of the new path group is taken as the free path, and the number of path groups of the a-th storage item is increased by 1.
- In step S1706, the bitmap information and fingerprint database of the path group in the a-th storage item that contains the free path are updated.
- For example, the bitmap information of the second path group of the a-th storage item is updated from 0xffdf to 0xffff, that is, the 11th bit is set to "1".
- The 11th cell of the fingerprint database of the second path group of the a-th storage item is also written, that is, bits 160 to 175 of the hash value to be stored are written into bits 160 to 175 of the fingerprint database.
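Steps S1704 and S1706 can be illustrated with a minimal Python sketch. The helper names are hypothetical, slots and bits are counted from the left starting at 1 (the convention implied by the 0xffdf example), and a 256-bit integer stands in for both the hash value and the fingerprint database; none of this is taken from an actual implementation.

```python
from typing import Optional, Tuple

CELLS = 16       # storage paths per path group
CELL_BITS = 16   # bits per fingerprint cell


def find_free_slot(bitmap: int) -> Optional[int]:
    """Return the first free slot (1-based, from the left), or None if the group is full."""
    for slot in range(1, CELLS + 1):
        if not (bitmap >> (CELLS - slot)) & 1:
            return slot
    return None


def hash_cell(value: int, slot: int) -> int:
    """Return 16-bit cell `slot` (1-based, from the left) of a 256-bit value."""
    return (value >> ((CELLS - slot) * CELL_BITS)) & 0xFFFF


def occupy_slot(bitmap: int, fingerprint: int, slot: int, path_hash: int) -> Tuple[int, int]:
    """Set the bitmap bit for `slot` and copy cell `slot` of the hash into the fingerprint."""
    shift = (CELLS - slot) * CELL_BITS
    new_bitmap = bitmap | (1 << (CELLS - slot))
    cleared = fingerprint & ~(0xFFFF << shift)
    return new_bitmap, cleared | (hash_cell(path_hash, slot) << shift)


print(find_free_slot(0xFFDF))  # 11
```

For slot 11 the shift is (16 - 11) * 16 = 80 bits from the right, which corresponds to bits 160 to 175 counted from the left, matching the range quoted in the text.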
- The storage paths in the storage items of the same hash table are organized as a linked list. A linked list is a storage structure that is non-contiguous and non-sequential on the physical storage unit; the logical order of the metadata stored in it is given by the order in which the pointers in the list are linked.
- Each path group includes a bitmap and a fingerprint database. The bitmap makes it possible to obtain quickly the storage status of each storage path of the path group (for example, whether a given storage path is a free path), and because the fingerprint database records the features of the storage path hash values in the path group, the storage path hash value matching the hash value to be retrieved can be found quickly, which effectively improves retrieval speed.
- In step S1707, the metadata corresponding to the path to be stored is stored at the free storage address corresponding to the free path in the path group of the a-th storage item.
- For example, the metadata corresponding to the path to be stored is stored at the free storage address corresponding to the eleventh storage path of the second path group of the ninth item of the hash table of node 0.
- Using bitmap information to determine whether a storage item of a node's hash table contains a free path makes it possible to locate the free path quickly and speeds up storage of the metadata to be stored. Further, the metadata to be stored is written to the free storage address corresponding to the free path, and the bitmap information and fingerprint database of the path group in the a-th storage item that contains the free path are updated, which avoids the problem of the metadata becoming unfindable after it has been stored in the persistent memory file system. With the updated bitmap information and fingerprint database, the storage path of the stored metadata can be located quickly, which speeds up later lookups.
- FIG. 18 shows a structural diagram of an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for retrieving persistent memory file system metadata provided by the present application.
- the computing device 1800 includes an input device 1801 , an input interface 1802 , a central processing unit 1803 , a memory 1804 , an output interface 1805 , and an output device 1806 .
- The input interface 1802, the central processing unit 1803, the memory 1804, and the output interface 1805 are connected to each other through the bus 1807; the input device 1801 and the output device 1806 are connected to the bus 1807 through the input interface 1802 and the output interface 1805, respectively, and thereby to the other components of the computing device 1800.
- the input device 1801 receives input information from the outside, and transmits the input information to the central processing unit 1803 through the input interface 1802; the central processing unit 1803 processes the input information based on the computer-executable instructions stored in the memory 1804 to generate Output information, store the output information temporarily or permanently in the memory 1804, and then transmit the output information to the output device 1806 through the output interface 1805; the output device 1806 outputs the output information to the outside of the computing device 1800 for the user to use.
- The computing device shown in FIG. 18 may be implemented as an electronic device, and the electronic device may include: a memory configured to store a program; and a processor configured to execute the program stored in the memory to perform the method for retrieving persistent memory file system metadata described herein.
- the computing device shown in FIG. 18 may be implemented as a persistent memory file system metadata retrieval system, and the persistent memory file system metadata retrieval system may include: a memory configured to store programs; A processor configured to execute a program stored in the memory to perform the method of retrieving persistent memory file system metadata described herein.
- the various embodiments of the present application may be implemented in hardware or special purpose circuits, software, logic, or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor or other computing device, although the application is not limited thereto.
- Embodiments of the present application may be implemented by a data processor of a mobile device executing computer program instructions, e.g., in a processor entity, or by hardware, or by a combination of software and hardware.
- The computer program instructions may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages.
- the block diagrams of any logic flow in the figures of the present application may represent program steps, or may represent interconnected logic circuits, modules and functions, or may represent a combination of program steps and logic circuits, modules and functions.
- Computer programs can be stored on memory.
- The memory may be of any type suitable for the local technical environment and may be implemented using any suitable data storage technology, such as, but not limited to, read-only memory (ROM), random access memory (RAM), and optical memory devices and systems (Digital Versatile Disc (DVD) or Compact Disc (CD)).
- Computer-readable media may include non-transitory storage media.
- The data processor may be of any type suitable for the local technical environment, such as, but not limited to, a general purpose computer, a special purpose computer, a microprocessor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a programmable logic device (FPGA), or a processor based on a multi-core processor architecture.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
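The present application provides a method and apparatus for retrieving metadata of a persistent memory file system, and a storage structure, and relates to the technical field of data processing. The method includes: determining a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is configured to look up the storage path of metadata; determining a storage address corresponding to the path to be retrieved according to the hash value to be retrieved, a fingerprint database, and storage path hash values, where the fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of the metadata; and obtaining the metadata from the storage address corresponding to the path to be retrieved.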
Description
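Cross-Reference to Related Applications
This application claims priority to Patent Application No. 202110310068.X, filed with the China Patent Office on March 23, 2021, the entire contents of which are incorporated herein by reference.
The present application relates to, but is not limited to, the technical field of data processing.
A storage device is a device used to store information, usually by digitizing the information and then storing it on electrical, magnetic, or optical media. Persistent memory (PMEM), also called non-volatile memory (NVM), offers performance close to that of dynamic random access memory (DRAM) while, like a disk, storing data persistently. More and more file systems therefore keep their metadata in PMEM.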
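Summary
The present application provides a method and apparatus for retrieving metadata of a persistent memory file system, and a storage structure.
The present application provides a method for retrieving metadata of a persistent memory file system, including: determining a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is configured to look up the storage path of metadata; determining a storage address corresponding to the path to be retrieved according to the hash value to be retrieved, a fingerprint database, and storage path hash values, where the fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of the metadata; and obtaining the metadata from the storage address corresponding to the path to be retrieved.
The present application provides an apparatus for retrieving metadata of a persistent memory file system, including: a hash value determination module configured to determine a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is configured to look up the storage path of metadata; an address determination module configured to determine a storage address corresponding to the path to be retrieved according to the hash value to be retrieved, a fingerprint database, and storage path hash values, where the fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of the metadata; and an acquisition module configured to obtain the metadata from the storage address corresponding to the path to be retrieved.
The present application provides a storage structure for metadata, applicable to any of the methods for retrieving persistent memory file system metadata in the present application. The storage structure includes: storage address data bits configured to record a storage address, where the storage address is configured to store metadata and corresponds to a storage path; storage path hash value data bits configured to record a storage path hash value, where the storage path hash value is a hash value determined based on the storage path and is configured to record features of the storage path; and fingerprint database data bits configured to record a fingerprint database, where the fingerprint database is configured to record features of the storage path hash values.
The present application provides an electronic device, including: one or more processors; and a memory storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the methods for retrieving persistent memory file system metadata in the present application.
The present application provides a readable storage medium storing a computer program which, when executed by a processor, implements any of the methods for retrieving persistent memory file system metadata in the present application.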
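FIG. 1 shows a schematic diagram of metadata lookup using the DHT algorithm in the present application.
FIG. 2 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application.
FIG. 3 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application.
FIG. 4 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application.
FIG. 5 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application.
FIG. 6 shows a block diagram of the storage structure for persistent memory file system metadata provided by the present application.
FIG. 7 shows a block diagram of the storage structure of each node provided by the present application.
FIG. 8 shows a block diagram of the storage structure for persistent memory file system metadata provided by the present application.
FIG. 9 shows a block diagram of the apparatus for retrieving persistent memory file system metadata provided by the present application.
FIG. 10 shows a block diagram of the apparatus for retrieving persistent memory file system metadata provided by the present application.
FIG. 11 shows a schematic flowchart of the method for obtaining each data bit of the fingerprint database provided by the present application.
FIG. 12 shows a schematic diagram of the storage format of the hash table of each preset node of the persistent memory file system provided by the present application.
FIG. 13 shows a schematic flowchart of the method for retrieving metadata of the persistent memory file system provided by the present application.
FIG. 14 shows a schematic structural diagram of the method for obtaining the storage address corresponding to the path to be retrieved provided by the present application.
FIG. 15 shows a schematic flowchart of the method for obtaining the storage address corresponding to the path to be retrieved provided by the present application.
FIG. 16 shows a schematic flowchart of the method for deleting metadata in a persistent memory file system provided by the present application.
FIG. 17 shows a schematic flowchart of the method for adding metadata in a persistent memory file system provided by the present application.
FIG. 18 shows a structural diagram of an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for retrieving persistent memory file system metadata provided by the present application.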
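To make the objectives, technical solutions, and advantages of the present application clearer, embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments in the present application and the features in those embodiments may be combined with one another arbitrarily.
Existing metadata storage algorithms mostly use the Distributed Hash Table (DHT) algorithm, which keeps the metadata of a distributed file system distributed as evenly as possible across the nodes of that file system. However, when the file system contains a large number of files, hash collisions tend to occur frequently, and when a collision occurs the metadata is looked up by separate chaining, which makes lookups inefficient.
FIG. 1 shows a schematic diagram of metadata lookup using the DHT algorithm in the present application. As shown in FIG. 1, multiple nodes share one hash table. To look up stored metadata, the hash value to be retrieved is first determined from the path to be retrieved, and the node to be searched is then determined from the value range into which that hash value falls (for example, node 1 corresponds to storage path hash values in the range (0, 255), node 2 to the range (256, 511), node 3 to the range (512, 767), and node 4 to the range (768, 1023)). When different paths to be retrieved yield the same hash value to be retrieved, there is a hash collision among the paths corresponding to that hash value. For example, if the computed hash value to be retrieved is 465, then, as shown in FIG. 1, three different storage paths (path6, path7, and path8) correspond to that hash value, and path6, path7, and path8 collide. Likewise, the hash value to be retrieved 565 corresponds to two different storage paths (path10 and path11), and path10 and path11 also collide.
The prior art stores metadata in a linked-list storage structure. When the file system contains a large number of files, hash collisions occur frequently; moreover, retrieving metadata by the existing separate-chaining method is inefficient, and even more so when hash collisions occur.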
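FIG. 2 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application. The method can be applied to an apparatus for retrieving persistent memory file system metadata. As shown in FIG. 2, the method may include the following steps S201 to S203.
In step S201, a hash value to be retrieved is determined based on a path to be retrieved.
The path to be retrieved is configured to look up the storage path of metadata; it may be the storage path of a file or the directory information of a file. These are only examples and may be set as the situation requires; other paths to be retrieved that are not described here also fall within the protection scope of the present application and are not enumerated. The hash value to be retrieved is the hash value obtained by applying a hash algorithm to the path to be retrieved.
For example, the path to be retrieved is "/test/fs_for_install/dpfs-2.0/fs/a.out", and a hash algorithm is applied to it to obtain the hash value to be retrieved. For example, in hexadecimal notation the hash value to be retrieved is: 0xff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009.
In one embodiment, the hash algorithm is any one of the Secure Hash Algorithm (SHA) variants SHA-224, SHA-256, SHA-384, and SHA-512. SHA-256 denotes a hash algorithm that produces a hash value of 256 binary digits.
By hashing the path to be retrieved with a suitable hash algorithm (for example, SHA-256) to obtain the hash value to be retrieved, no complex path needs to be maintained; different paths can be identified by their different hash values, which simplifies the handling of the path to be retrieved.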
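As an illustration of step S201, the sketch below hashes a path with SHA-256 using Python's standard `hashlib` module and splits the digest into sixteen 4-digit hexadecimal cells. The function name and the UTF-8 encoding of the path are assumptions made for the example; the digest it produces need not match the illustrative hash value quoted in the text.

```python
import hashlib
from typing import List


def path_hash_cells(path: str) -> List[str]:
    """Hash a path with SHA-256 and split the 256-bit digest into 16 cells."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()  # 64 hex digits
    return [digest[i:i + 4] for i in range(0, len(digest), 4)]


cells = path_hash_cells("/test/fs_for_install/dpfs-2.0/fs/a.out")
print(len(cells))  # 16
```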
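In step S202, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values.
The fingerprint database is configured to record features of the storage path hash values, and a storage path hash value is the hash value corresponding to the storage path of the metadata. The fingerprint database makes it possible to find quickly the storage path hash value that matches the hash value to be retrieved, which improves the efficiency of metadata lookup.
For example, a fragment of a storage path hash value is taken as the feature of that hash value, and the feature is extracted and saved in the fingerprint database. The features of the different storage path hash values in the fingerprint database quickly narrow the storage path hash values down to a certain range; the storage address corresponding to the hash value to be retrieved is then determined from the hash value to be retrieved and the storage path hash value (for example, when the two are the same, the storage path corresponding to that storage path hash value is determined to be the storage address corresponding to the hash value to be retrieved), which speeds up retrieval of the path to be retrieved. In other words, the fingerprint database gives a preliminary screening of the approximate location of the path to be retrieved, after which the hash value to be retrieved is analyzed in detail to determine the storage address corresponding to the path to be retrieved, further ensuring the accuracy of the storage address found.
In step S203, the metadata is obtained from the storage address corresponding to the path to be retrieved.
The storage address corresponding to the path to be retrieved identifies exactly where the metadata is stored; obtaining the metadata from that storage address guarantees that it is the metadata being sought and guarantees its accuracy.
In the present application, the hash value to be retrieved is determined based on the path to be retrieved, where the path to be retrieved is configured to look up the storage path of the metadata, which makes the path to be retrieved easier to process. The storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values, where the fingerprint database is configured to record features of the storage path hash values and a storage path hash value is the hash value corresponding to the storage path of the metadata. Screening storage addresses with the hash value to be retrieved and the fingerprint database, and then determining the storage address with the hash value to be retrieved and the storage path hash value, speeds up retrieval and improves retrieval efficiency while ensuring the accuracy of the retrieval result.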
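FIG. 3 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application. As shown in FIG. 3, the method may include the following steps S301 to S306.
In step S301, a hash value to be retrieved is determined based on a path to be retrieved.
It should be noted that step S301 is the same as step S201 described with reference to FIG. 2 and is not repeated here.
In step S302, the node to be retrieved is determined according to the hash value to be retrieved and the number of preset nodes.
The node to be retrieved includes storage items, a storage item includes at least one path group, and a path group includes a fingerprint database.
For example, the node to be retrieved stores one hash table whose length is 1M, that is, the hash table can hold 1024*1024 storage items. Each storage item includes at least one path group, and each path group includes a fingerprint database as well as storage paths and their corresponding metadata. The fingerprint database records features of the storage path hash values. This storage structure avoids frequent hash collisions and makes metadata easier to find, which improves lookup efficiency.
For example, the hash value to be retrieved includes the hash value corresponding to the node to be retrieved. The hash value corresponding to each preset node can be computed from the number of preset nodes, and the hash value to be retrieved is then compared with the hash values corresponding to the preset nodes to determine which preset node the node to be retrieved is. Comparing hash values simplifies the handling of the path to be retrieved and speeds up the search for the node to be retrieved.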
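In some embodiments, determining the node to be retrieved according to the hash value to be retrieved and the number of preset nodes includes: obtaining a node information value based on the hash value to be retrieved; and taking the remainder of the node information value divided by the number of preset nodes to determine the node to be retrieved.
For example, the first 32 bits of the hash value to be retrieved are preset to represent the node information value. If the hash value to be retrieved is 0xff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009, its first 32 bits (0xff3a 11fc) are extracted and converted to the decimal number 4281995772. If the number of preset nodes is 4, the remainder of 4281995772 divided by 4 is 0, which indicates that the node to be retrieved is node 0.
This calculation determines the node to be retrieved quickly, greatly speeds up retrieval of that node, and avoids maintaining and processing the path to be retrieved, which simplifies the procedure.
In step S303, the storage item corresponding to the path to be retrieved is determined according to the hash value to be retrieved.
The hash value to be retrieved may also include the information value of the storage item corresponding to the path to be retrieved.
For example, the last 20 bits of the hash value to be retrieved represent the number of the storage item corresponding to the path to be retrieved. Extracting the last 20 bits of the hash value to be retrieved (0x0009) and converting 0x0009 to decimal (9) indicates that the number of the storage item is 9, that is, the path to be retrieved is stored in the 9th storage item of node 0. This further pins down the storage location of the path to be retrieved.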
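Steps S302 and S303 amount to two bit-field extractions, which the following Python sketch reproduces with the hash value from the example. The function name is hypothetical, and treating the hash as a single 256-bit integer is an assumption made for brevity.

```python
from typing import Tuple

HASH_BITS = 256


def locate(hash_value: int, node_count: int) -> Tuple[int, int]:
    """Return (node index, storage item number) for a 256-bit hash value."""
    node_info = hash_value >> (HASH_BITS - 32)  # first 32 bits
    node = node_info % node_count               # remainder selects the node
    item = hash_value & 0xFFFFF                 # last 20 bits select the storage item
    return node, item


h = int("ff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 "
        "ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009".replace(" ", ""), 16)
print(locate(h, 4))  # (0, 9)
```

A 20-bit item number can take 1024*1024 distinct values, which matches the hash table length of 1M storage items mentioned for each node.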
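In step S304, the fingerprint database corresponding to each path group in the storage item is obtained.
The storage item includes at least one path group, and each path group includes a fingerprint database. Because the fingerprint database records features of the storage path hash values, obtaining the fingerprint database of every path group in the storage item yields the features of the storage path hash values of all the storage paths in every path group of the storage item.
In step S305, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values.
In some embodiments, determining the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database, and the storage path hash values includes: determining candidate storage paths according to the hash value to be retrieved and the fingerprint databases of the path groups in the storage item; and determining the storage address corresponding to the path to be retrieved according to the hash values of the candidate storage paths and the hash value to be retrieved.
The candidate storage paths determined from the hash value to be retrieved and the fingerprint database may be one storage path or several.
In some embodiments, the candidate storage paths may be obtained in parallel or serially.
For example, the candidate storage paths are obtained in parallel. If the storage item includes 3 path groups, each with one fingerprint database, the fingerprint databases of the 3 path groups (fingerprint database 1, fingerprint database 2, and fingerprint database 3) are obtained first. The hash value to be retrieved is then compared with each of the three fingerprint databases to determine whether each path group contains a candidate storage path. If fingerprint database 2 and fingerprint database 3 both contain candidates, candidate storage path 1 from fingerprint database 2 and candidate storage path 2 from fingerprint database 3 can be extracted at the same time for the next step, in which the hash value to be retrieved is compared with candidate storage path 1 and candidate storage path 2 to determine the final storage address corresponding to the path to be retrieved.
For example, the candidate storage paths are obtained serially. If the storage item includes 3 path groups, each with one fingerprint database, the fingerprint database of each path group is processed in turn. Fingerprint database 1 of path group 1 is processed first to determine whether path group 1 contains a candidate storage path. If it does not, processing continues with fingerprint database 2 of path group 2. If it does, candidate storage path 1 in path group 1 is extracted and its hash value is compared with the hash value to be retrieved. If the two are the same, candidate storage path 1 is determined to be the path to be retrieved, and the storage address corresponding to the path to be retrieved is obtained; otherwise, candidate storage path 1 is determined not to be the path to be retrieved. The search then continues with path group 2 in the same way as for path group 1, which is not repeated here, until all 3 path groups in the storage item have been processed.
Preliminary screening with the fingerprint databases of the path groups in the storage item indicates which path group may contain the path to be retrieved; the hash values of the candidate storage paths are then extracted and compared with the hash value to be retrieved in a second screening to determine the storage address corresponding to the path to be retrieved. This two-level hash processing avoids hash collisions as far as possible and ensures that the storage address corresponding to the path to be retrieved is found.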
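In some embodiments, determining candidate storage paths according to the hash value to be retrieved and the fingerprint databases of the path groups in the storage item includes: performing XOR processing on the hash value to be retrieved and the fingerprint database of a path group in the storage item to determine the candidate storage paths.
XOR processing compares two numbers and derives the XOR result from the comparison. For example, for two numbers A and B, if A and B differ, the XOR result is true (which may be represented by "1"); if A and B are the same, the XOR result is false (which may be represented by "0"). Both the hash value to be retrieved and the fingerprint database can be represented in binary.
XOR processing of the hash value to be retrieved and the fingerprint database of a path group in the storage item yields an XOR result. If the XOR result shows that a partial feature of the hash value to be retrieved is the same as a partial feature of the fingerprint database, the path group corresponding to that fingerprint database contains a candidate storage path. For example, if the fingerprint database contains the features of 16 different storage paths and a partial feature of the hash value to be retrieved is the same as the feature of the 4th storage path in the fingerprint database, the 4th storage path in that fingerprint database is a candidate storage path. Determining candidate storage paths by XOR processing speeds up retrieval of storage paths.
In some embodiments, determining the storage address corresponding to the path to be retrieved according to the hash values of the candidate storage paths and the hash value to be retrieved includes: comparing the hash value to be retrieved with the hash value of a candidate storage path to determine the storage address corresponding to the path to be retrieved.
If the hash value to be retrieved and the hash value of the candidate storage path are the same, the candidate storage path is the path to be retrieved, and the storage address corresponding to the path to be retrieved is obtained. If they differ, the candidate storage path is not the path to be retrieved, and the hash value to be retrieved must be compared with the other candidate storage paths to determine the storage address corresponding to the path to be retrieved.
Comparing the hash value to be retrieved with the hash values of the candidate storage paths quickly singles out, from among the candidates, the storage path that is the same as the path to be retrieved; the storage address corresponding to the path to be retrieved is then determined from that storage path, which speeds up retrieval of the storage address.
In step S306, the metadata is obtained from the storage address corresponding to the path to be retrieved.
It should be noted that step S306 is the same as step S203 described with reference to FIG. 2 and is not repeated here.
In the present application, the hash value to be retrieved is determined based on the path to be retrieved, where the path to be retrieved is configured to look up the storage path of the metadata, which makes the path to be retrieved easier to process. The node to be retrieved is determined according to the hash value to be retrieved and the number of preset nodes, which gives a preliminary screening of the node where the path to be retrieved may be stored and avoids hash collisions. The storage item corresponding to the path to be retrieved is determined according to the hash value to be retrieved, and the fingerprint database of each path group in that storage item is obtained, which speeds up metadata retrieval. The storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values; screening the features of the storage path hash values with the fingerprint database first speeds up retrieval and improves retrieval efficiency. Obtaining the metadata from the storage address corresponding to the path to be retrieved ensures the accuracy of the retrieval result.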
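In some embodiments, the path group further includes bitmap information. After the candidate storage paths are determined in step S305 according to the hash value to be retrieved and the fingerprint databases of the path groups in the storage item, the method further includes: obtaining, according to the bitmap information, a free path among the candidate storage paths, where the free path corresponds to a free storage address whose storage status is free.
The bitmap information is configured to indicate whether a free path exists in the path group. For example, the bitmap information may be represented in binary, with each binary digit indicating whether the corresponding storage path in the path group is a free path. For example, if the bitmap information is a 16-bit binary number, the path group includes 16 storage paths. If the 5th bit of the bitmap information is "0", the 5th storage path in the path group is a free path; if the 8th bit is "1", the 8th storage path is not a free path, that is, metadata is already stored at the storage address corresponding to the 8th storage path.
The bitmap information makes it possible to determine the free paths among the candidate storage paths quickly and accurately and to find free storage addresses conveniently, which improves the efficiency of data storage. It also shows clearly the storage status of every storage path in the storage item, which makes the storage paths easier to manage and improves path management efficiency.
In some embodiments, obtaining a free path among the candidate storage paths according to the bitmap information includes: if it is determined that a free path exists among the candidate storage paths, obtaining that free path; and if it is determined that no free path exists among the candidate storage paths, generating a new path group and obtaining a free path from the new path group.
The new path group may include at least one storage path, and no metadata is stored at the storage address corresponding to that storage path, which increases the storage capacity for data.
Using the bitmap information in these different cases locates the free path among the candidate storage paths quickly and accurately, so that the free path can subsequently be used to store metadata, which improves metadata storage efficiency.
In some embodiments, after the free path among the candidate storage paths is obtained according to the bitmap information, the method further includes: obtaining the metadata to be stored; and storing the metadata to be stored at the free storage address corresponding to the free path.
The metadata to be stored is metadata that is expected to be stored in the persistent memory file system. Storing it at the free storage address corresponding to the free path keeps it safe, and when it is later looked up, the retrieval methods of the embodiments of the present application speed up its retrieval.
In some embodiments, the storage item further includes a path group count. After the metadata to be stored is stored at the free storage address corresponding to the free path, the method further includes: updating the path group count in the storage item, as well as the bitmap information and the fingerprint database of the path group to which the free storage address belongs.
It should be noted that if the free path was obtained by generating a new path group, then after the metadata to be stored is stored at the free storage address corresponding to the free path, the path group count in the storage item containing the new path group must also be updated, along with the bitmap information and fingerprint database of the new path group, so that the stored metadata can be found later.
If the free path was obtained because a free path existed among the candidate storage paths, then after the metadata to be stored is stored at the free storage address corresponding to the free path, the bitmap information and fingerprint database of the path group containing that candidate storage path must be updated; the path group count in the storage item does not need to be updated.
Updating this information avoids the problem of stored metadata becoming unfindable; with the updated bitmap information and fingerprint database, the storage path of the stored metadata can be located quickly.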
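FIG. 4 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application. As shown in FIG. 4, the method may include the following steps S401 to S405.
In step S401, a hash value to be retrieved is determined based on a path to be retrieved.
In step S402, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values.
In step S403, the metadata is obtained from the storage address corresponding to the path to be retrieved.
It should be noted that steps S401 to S403 are the same as steps S201 to S203 described with reference to FIG. 2 and are not repeated here.
In step S404, the metadata at the storage address corresponding to the path to be retrieved and its corresponding storage path hash value are deleted to obtain the path group after deletion.
For example, if the path to be retrieved is determined to be the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0, the storage path hash value corresponding to the 3rd storage path must be deleted, and the metadata at the storage address corresponding to the 3rd storage path must be deleted, yielding the updated 2nd path group (that is, the path group with the 3rd storage path removed).
In step S405, the fingerprint database and bitmap information corresponding to the path group after deletion are updated.
Both the fingerprint database and the bitmap information contain information about the storage paths. After the metadata of a storage path and its storage path hash value are deleted, the fingerprint database and bitmap information of the path group must be updated accordingly to avoid errors in the stored information.
For example, if the path group corresponding to the path to be retrieved is the 2nd path group of the 9th item of the hash table of node 0, the bitmap and fingerprint database of that 2nd path group must be updated. If the bitmap of the 2nd path group is 0xffdf (1111111111011111 in binary), the updated bitmap of the 2nd path group of the 9th item of the hash table of node 0 is 0xdfdf (1101111111011111 in binary). In addition, the 3rd cell of the fingerprint database of the 2nd path group must be deleted (that is, bits 32 to 47 of that fingerprint database are set to "0"), yielding the updated fingerprint database of the 2nd path group of the 9th item of the hash table of node 0.
In the present application, screening the storage paths with the fingerprint database first locks onto the candidate storage paths quickly, which speeds up retrieval and improves retrieval efficiency. Obtaining the metadata from the storage address corresponding to the path to be retrieved ensures the accuracy of the retrieval result. After the metadata is obtained, deleting the metadata at the storage address corresponding to the path to be retrieved and its storage path hash value yields the path group after deletion, avoids unnecessary metadata occupying storage resources, and improves the utilization of storage resources. Updating the fingerprint database and bitmap information of the path group after deletion avoids errors in the stored information, so that the desired metadata and its storage path can be obtained quickly and accurately when metadata is subsequently processed.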
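FIG. 5 shows a schematic flowchart of the method for retrieving persistent memory file system metadata provided by the present application. As shown in FIG. 5, the method may include the following steps S501 to S505.
In step S501, a hash value to be retrieved is determined based on a path to be retrieved, where the path to be retrieved is configured to look up the storage path of metadata.
In step S502, the storage address corresponding to the path to be retrieved is determined according to the hash value to be retrieved, the fingerprint database, and the storage path hash values.
In step S503, the metadata is obtained from the storage address corresponding to the path to be retrieved.
It should be noted that steps S501 to S503 are the same as steps S201 to S203 described with reference to FIG. 2 and are not repeated here.
In step S504, the information to be modified is obtained.
The information to be modified includes directory metadata to be modified and file metadata to be modified. The directory metadata to be modified includes any one or more of: the number of directory entries under a preset directory, the name of each directory entry, and the type of each directory entry. The file metadata to be modified includes, but is not limited to: the last modification time of a preset file, the file size of the preset file, and, where the preset file includes multiple file data segments, the number of file data segments and the start address and length of each file data segment. These are only examples; other information to be modified that is not described here also falls within the protection scope of the present application, may be set as the situation requires, and is not enumerated.
In step S505, the metadata at the storage address corresponding to the path to be retrieved is modified according to the information to be modified.
If a preset file under a directory has been modified, the metadata at the storage address corresponding to the path to be retrieved can be modified according to the last modification time of that preset file (for example, by updating the last modification time of the preset file). This ensures that the metadata stored in the persistent memory file system is kept up to date and is true and reliable.
In the present application, screening the features of the storage path hash values with the fingerprint database first speeds up retrieval and improves retrieval efficiency. Obtaining the metadata from the storage address corresponding to the path to be retrieved ensures the accuracy of the retrieval result. If the obtained metadata needs to be modified, then after the information to be modified for that metadata is obtained, the metadata at the storage address corresponding to the path to be retrieved is modified accordingly, which keeps the metadata true and reliable.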
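FIG. 6 shows a block diagram of the storage structure for persistent memory file system metadata provided by the present application. As shown in FIG. 6, the storage structure is applicable to the method for retrieving persistent memory file system metadata in the present application and includes the following data bits: storage address data bits 601 configured to record a storage address, where the storage address is configured to store metadata and corresponds to a storage path; storage path hash value data bits 602 configured to record a storage path hash value, where the storage path hash value is a hash value determined based on the storage path and is configured to record features of the storage path; and fingerprint database data bits 603 configured to record a fingerprint database, where the fingerprint database is configured to record features of the storage path hash values.
In some embodiments, the storage structure further includes a storage node. The storage node includes storage items, a storage item includes at least one path group, and each path group includes the fingerprint database data bits 603.
For example, FIG. 7 shows a block diagram of the storage structure of each node in the present application. FIG. 7 shows four nodes, each storing one hash table (the hash table of node 1, the hash table of node 2, the hash table of node 3, and the hash table of node 4), and each hash table stores different storage items. For example, the hash table of node 1 stores 6 storage items, and the storage items hold different amounts of metadata (for example, in the hash table of node 1 the 1st storage item holds 3 pieces of metadata, the 2nd storage item holds 2 pieces of metadata, ..., and the 6th storage item holds 1 piece of metadata).
The hash tables of different nodes are independent of one another, which avoids hash collisions between nodes. Each node's hash table stores multiple different storage items, and the hash value to be retrieved leads quickly to the storage item of the corresponding node, which speeds up metadata retrieval.
In some embodiments, the path group further includes bitmap information data bits configured to record bitmap information, where the bitmap information is configured to indicate whether a free path exists in the path group.
Determining from the bitmap information whether a free path exists in the path group makes it easier to store the metadata to be stored, speeds up data storage, and ensures that the metadata to be stored can be written quickly to a suitable storage path.
In the present application, the storage address data bits record the storage address, which is configured to store metadata and corresponds to a storage path, making the path to be retrieved easier to process. The fingerprint database data bits record the fingerprint database, which is configured to record features of the storage path hash values; screening those features with the fingerprint database first yields the candidate storage paths and speeds up retrieval. The storage path hash value data bits record the storage path hash value, which is a hash value determined based on the storage path and is configured to record features of the storage path; a second screening on the storage path hash values further speeds up retrieval and improves retrieval efficiency while ensuring the accuracy of the retrieval result.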
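FIG. 8 shows a block diagram of the storage structure for persistent memory file system metadata provided by the present application. As shown in FIG. 8, the storage structure includes a path group, and the path group includes a path group number, a bitmap, a fingerprint database, and multiple storage path hash values with their corresponding metadata.
The bitmap is configured to indicate whether a free path exists in the path group; the fingerprint database is configured to record features of the storage path hash values; a storage path hash value is configured to record features of a storage path and is a hash value determined based on that storage path; a storage address corresponds to a storage path and is configured to store metadata.
For example, each storage path is hashed with the SHA-256 algorithm to determine its storage path hash value, which can be represented as a 256-bit binary number. For ease of reading, the binary storage path hash value can be converted to hexadecimal, with 16 bits per cell and 16 cells in total. The path group can then store 16 storage path hash values and their corresponding metadata. As shown in FIG. 8, the 16 storage path hash values and their stored metadata may include: the storage path 1 hash value and storage path 1 metadata, the storage path 2 hash value and storage path 2 metadata, ..., and the storage path 16 hash value and storage path 16 metadata. The storage path 1 hash value is the hash value determined based on storage path 1; the storage path 2 hash value is the hash value determined based on storage path 2; ...; and the storage path 16 hash value is the hash value determined based on storage path 16.
The bitmap may also be represented as a 16-bit binary number, with each binary digit indicating whether the corresponding storage path in the path group is a free path. For example, the x-th bit of the bitmap indicates whether the x-th storage path in the path group is a free path, that is, whether the storage address corresponding to the x-th storage path is in the free state; "0" indicates the free state and "1" indicates the stored state, where x is an integer greater than or equal to 0 and less than or equal to 15. The number of "1" bits in the bitmap gives the number of pieces of metadata actually stored in the path group. If some metadata is deleted, the bitmap bit corresponding to the deleted metadata must be updated to "0". The fingerprint database may likewise be represented as a 256-bit binary number in which each group of 16 binary digits identifies the feature of one storage path hash value in the path group, so the fingerprint database can hold the features of 16 storage path hash values.
It should be noted that, when metadata is retrieved, if only one candidate storage path is determined from the fingerprint database and the hash value to be retrieved, the hash value corresponding to that candidate storage path can be extracted directly and the metadata corresponding to the candidate storage path obtained. If there are several candidate storage paths, the hash value corresponding to each candidate storage path must be compared one by one with the hash value to be retrieved to determine the storage path corresponding to the hash value to be retrieved, after which the corresponding metadata is obtained.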
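The path group of FIG. 8 can be modeled as a small record. The Python sketch below is an illustration only: the class and field names are hypothetical, slots are numbered 1 to 16 from the leftmost bitmap bit (the convention used by the worked examples elsewhere in the description), and metadata is represented as opaque bytes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SLOTS = 16  # storage paths per path group


@dataclass
class PathGroup:
    group_id: int
    bitmap: int = 0       # 16 bits; "1" = stored, "0" = free
    fingerprint: int = 0  # 256 bits; one 16-bit cell per storage path
    path_hashes: List[Optional[int]] = field(default_factory=lambda: [None] * SLOTS)
    metadata: List[Optional[bytes]] = field(default_factory=lambda: [None] * SLOTS)

    def is_free(self, slot: int) -> bool:
        """True if storage path `slot` (1-based) holds no metadata."""
        return not (self.bitmap >> (SLOTS - slot)) & 1

    def occupied_count(self) -> int:
        """Number of stored metadata entries, i.e. the number of "1" bits."""
        return bin(self.bitmap).count("1")


group = PathGroup(group_id=2, bitmap=0xFFDF)
print(group.occupied_count(), group.is_free(11))  # 15 True
```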
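In the present application, the path group structure stores multiple storage path hash values and their corresponding metadata. The bitmap information shows quickly whether each storage path in the path group is a free path, so a free path can be located quickly when metadata is stored, which speeds up metadata storage. The fingerprint database records features of the storage path hash values, and the storage path hash values record features of the storage paths. When metadata is retrieved, the fingerprint database is screened first to obtain the candidate storage paths, and the path to be retrieved is then located among the candidates, which speeds up retrieval and improves retrieval efficiency while ensuring the accuracy of the retrieval result.
FIG. 9 shows a block diagram of the apparatus for retrieving persistent memory file system metadata provided by the present application. As shown in FIG. 9, the apparatus includes: a hash value determination module 901 configured to determine a hash value to be retrieved based on a path to be retrieved, where the path to be retrieved is configured to look up the storage path of metadata; an address determination module 902 configured to determine the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database, and the storage path hash values, where the fingerprint database is configured to record features of the storage path hash values and a storage path hash value is the hash value corresponding to the storage path of the metadata; and an acquisition module 903 configured to obtain the metadata from the storage address corresponding to the path to be retrieved.
In the present application, the hash value determination module determines the hash value to be retrieved based on the path to be retrieved, where the path to be retrieved is configured to look up the storage path of the metadata, which makes the path to be retrieved easier to process. The address determination module determines the storage address corresponding to the path to be retrieved according to the hash value to be retrieved, the fingerprint database, and the storage path hash values. Screening storage addresses with the hash value to be retrieved and the fingerprint database, and then determining the storage address with the hash value to be retrieved and the storage path hash value, speeds up retrieval and improves retrieval efficiency while ensuring the accuracy of the retrieval result.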
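FIG. 10 shows a block diagram of the apparatus for retrieving persistent memory file system metadata provided by the present application. As shown in FIG. 10, the retrieval apparatus includes a path module 1010, a secondary hash index module 1020, and a path group module 1030. The path group module 1030 includes a fingerprint database module 1031 and a metadata module 1032.
The path module 1010 is configured to obtain a storage path, where the storage path corresponds to a storage address and the storage address is configured to store metadata. For example, the storage path is "/root/a/b" or "/root/a/out/a.out", and the storage address is the address in the persistent memory file system that corresponds to "/root/a/b" or "/root/a/out/a.out". Because storage paths have complex formats and varying lengths, they are inconvenient to manage. Hashing a storage path yields a storage path hash value that uniquely identifies the storage path, and the storage address of the metadata can then be determined uniquely from that hash value, which makes storage paths and their metadata easier to manage and improves the efficiency of storage path management.
The secondary hash index module 1020 is configured to store two levels of storage format: a node-level storage format and a hash-table-level storage format.
For example, in the node-level storage format, the first 32 binary digits of the storage path hash value (hashUnique) are used, together with the number of preset nodes in the persistent memory file system (countNode), to identify a node. For example, during metadata retrieval, the first 32 binary digits of hashUnique are extracted and converted to a decimal number (data1), and the remainder of data1 divided by countNode gives the node to be retrieved. If the node to be retrieved is in the persistent memory file system, further data retrieval is carried out in that node.
Further, in the hash-table-level storage format, the last 20 binary digits of hashUnique represent the number of the storage item (for example, the j-th storage item in the hash table of the i-th preset node, where i and j are both integers greater than or equal to 1), where the storage item is a storage item in the hash table of the node to be retrieved in the persistent memory file system. For example, during metadata retrieval, the last 20 binary digits of hashUnique are extracted and converted to a decimal number (a), and a then denotes the a-th storage item in the hash table of the i-th preset node.
Storing metadata by means of the secondary hash index module avoids frequent hash collisions and reduces hash collisions between storage items.
The path group module 1030 is configured to obtain each data bit of the fingerprint database through the fingerprint database module 1031 and to determine candidate storage paths, of which there may be one or more, from the fingerprint database and the hash value to be retrieved. Further, the candidate storage path whose hash value matches the hash value to be retrieved is identified, and the storage path of the metadata is determined from it. The metadata module 1032 stores the metadata.
The metadata includes directory metadata and file metadata. The directory metadata includes any one or more of: the number of directory entries under a preset directory, the name of each directory entry, and the type of each directory entry. The file metadata includes, but is not limited to: the last modification time of a preset file, the file size of the preset file, and, where the preset file includes multiple file data segments, the number of file data segments and the start address and length of each file data segment. The metadata may be fixed-length or variable-length data. These are only examples and may be set as the situation requires; other metadata not described here also falls within the protection scope of the present application and is not enumerated.
It should be noted that each preset node stores one hash table, and the hash table includes N path groups, where N is an integer greater than or equal to 0. Each path group includes a path group number, a bitmap, a fingerprint database, and 16 storage path hash values. When the file system is small, the hash table may contain 0 or 1 path groups, and a path group may contain only 1 storage path. Where hash collisions are unavoidable, managing the storage paths of the metadata with the path group module improves the efficiency of metadata lookup.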
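For example, FIG. 11 shows a schematic flowchart of the method for obtaining each data bit of the fingerprint database provided by the present application. The fingerprint database module 1031 may use the method shown in FIG. 11 to obtain each data bit of the fingerprint database.
For example, each storage path is hashed with the SHA-256 algorithm to determine its storage path hash value (the 1st storage path hash value, the 2nd storage path hash value, ..., and the 16th storage path hash value in FIG. 11). Each storage path hash value is a 256-bit binary number; for ease of reading, it can be converted to hexadecimal, with 16 bits per cell and 16 cells in total. The 1st cell (bits 0 to 15) of the 1st storage path hash value is extracted as the 1st cell (bits 0 to 15) of the fingerprint database, the 2nd cell (bits 16 to 31) of the 2nd storage path hash value is extracted as the 2nd cell (bits 16 to 31) of the fingerprint database, ..., and the 16th cell (bits 240 to 255) of the 16th storage path hash value is extracted as the 16th cell (bits 240 to 255) of the fingerprint database, thereby generating the fingerprint database of the path group.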
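The construction in FIG. 11 takes cell k of the k-th storage path hash value as cell k of the fingerprint database. A minimal Python sketch follows, with hypothetical function names and with bits counted from the left as in the text; 256-bit integers stand in for the hash values.

```python
from typing import Sequence

CELLS = 16       # cells per hash value and per fingerprint database
CELL_BITS = 16   # bits per cell


def cell(value: int, k: int) -> int:
    """Return 16-bit cell k (1-based, from the left) of a 256-bit value."""
    return (value >> ((CELLS - k) * CELL_BITS)) & 0xFFFF


def build_fingerprint(path_hashes: Sequence[int]) -> int:
    """Assemble the fingerprint: cell k comes from cell k of the k-th hash."""
    fingerprint = 0
    for k, path_hash in enumerate(path_hashes, start=1):
        fingerprint |= cell(path_hash, k) << ((CELLS - k) * CELL_BITS)
    return fingerprint
```

Because each storage path contributes a different cell position, a single XOR against a hash value to be retrieved tests all 16 storage paths of the group at once.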
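It should be noted that, in the prior art, when a large amount of metadata is stored in the persistent memory file system and a hash collision occurs (for example, the same storage path hash value corresponds to different storage items in the same hash table), storage paths are indexed through a linked-list storage structure, and indexing is correspondingly inefficient. In the present application, the storage structure obtained through the cooperation of the secondary hash index module 1020 and the path group module 1030 (preset nodes, each node storing storage items, and each storage item containing storage path hash values and their corresponding metadata) improves the efficiency of storage path retrieval and avoids frequent hash collisions, so that the metadata corresponding to a storage path is obtained efficiently.
In the present application, the path module obtains the storage path, and the node-level and hash-table-level storage formats of the secondary hash index module are used to store the metadata, which avoids frequent hash collisions and reduces hash collisions between storage items. The fingerprint database module in the path group module obtains each data bit of the fingerprint database, and the candidate storage paths are determined from the fingerprint database and the hash value to be retrieved; the candidate whose hash value matches the hash value to be retrieved is then identified, and the storage path of the metadata is determined from it. Where hash collisions are unavoidable, the storage structure managed by the path group module improves the efficiency of metadata lookup.
FIG. 12 shows a schematic diagram of the storage format of the hash table of each preset node of the persistent memory file system provided by the present application. As shown in FIG. 12, the persistent memory file system includes 4 preset nodes: node 0, node 1, node 2, and node 3. Each node stores one hash table, each hash table includes 1048576 (1024*1024) storage items, and each storage item includes at least one path group. For example, the 9th storage item of node 4 includes 3 path groups: path group 1, path group 2, and path group 3. This 9th storage item stores 43 storage paths: path group 1 stores the metadata of 16 different storage paths, path group 2 stores the metadata of 16 different storage paths, and path group 3 stores the metadata of 11 different storage paths, while the remaining 5 storage paths in path group 3 are free paths, that is, no metadata is stored for these 5 free paths.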
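FIG. 13 shows a schematic flowchart of the method for retrieving metadata of the persistent memory file system provided by the present application. As shown in FIG. 13, in an exemplary embodiment, the retrieval method in the persistent memory file system includes the following steps S1310 to S1350.
In step S1310, a retrieval instruction for metadata is obtained.
The retrieval instruction includes the path to be retrieved; the metadata stored under the path to be retrieved may be file metadata or directory metadata. For example, the path to be retrieved is "/test/fs_for_install/dpfs-2.0/fs/a.out".
In step S1320, the path module 1010 performs a hash operation on the path to be retrieved using the SHA256 algorithm to obtain the hash value of the path to be retrieved.
The hash value of the path to be retrieved can be represented by a 256-bit binary number and is denoted, for example, hashUnique. For ease of reading, the 256-bit binary number can be converted to hexadecimal. Table 1 shows the structure of the hash value of the path to be retrieved. As shown in Table 1, the hash value includes 16 cells, and each cell represents 16 binary digits (or the 4 hexadecimal digits corresponding to those 16 binary digits).
Table 1 Structure of the hash value of the path to be retrieved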
ff3a | 11fc | ff32 | ff33 | ff34 | ff35 | ff36 | ff37 | ff38 | ff39 | ff3a | ff3b | ff3c | ff3d | ff10 | 0009 |
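In step S1330, the secondary hash index module 1020 analyzes the hash value to be retrieved and determines that the path to be retrieved is located in the a-th storage item in the hash table of the i-th node.
For example, the persistent memory file system includes 4 preset nodes, that is, the number of preset nodes is 4. The secondary hash index module 1020 first obtains the first 32 binary digits of hashUnique (for example, 0xff3a11fc), converts them to a decimal number (4281995772), and takes the remainder of that decimal number (for example, 4281995772) divided by the number of preset nodes (for example, 4) to obtain the remainder i (i equals 0), which determines that the path to be retrieved is located in node 0.
Further, the secondary hash index module 1020 obtains the last 20 binary digits of hashUnique (for example, 0x0009), converts them to a decimal number a (for example, a equals 9), and finally determines that the path to be retrieved is located in the 9th storage item in the hash table of node 0.
It should be noted that, in the prior art, as shown in FIG. 1, metadata is stored with multiple nodes sharing one hash table. During metadata retrieval, because different segments of the one hash table correspond to different nodes, the computed hash value to be retrieved may correspond to several storage paths, and searching further through several storage paths on different nodes is inefficient. In the present application, each node stores its own hash table, which avoids frequent hash collisions, and the secondary hash index module processes the hash value to be retrieved, which improves the efficiency of retrieving the path to be retrieved so that the metadata corresponding to it is obtained efficiently.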
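In step S1340, the hash value of the path to be retrieved is compared with the fingerprint database of each path group in the 9th storage item in the hash table of node 0, and the storage address corresponding to the path to be retrieved is determined.
In an exemplary embodiment, comparing hashUnique with the fingerprint database of each path group in the 9th storage item in the hash table of node 0 yields the candidate storage paths, which may be one storage path or several; the hash value of each candidate storage path is then compared with the hash value to be retrieved to determine the storage address corresponding to the path to be retrieved.
For example, FIG. 14 shows a schematic structural diagram of the method for obtaining the storage address corresponding to the path to be retrieved provided by the present application. As shown in FIG. 14, in the first step, the fingerprint database of a path group is compared with the hash value to be retrieved, and the 4th cell is found to be the same (that is, the fragment feature of the storage path hash value of the 4th storage path in the path group is the same as the 4th fragment feature of the hash value to be retrieved). In the second step, the storage path hash value corresponding to the 4th storage path is extracted and compared with the hash value to be retrieved; if the two hash values are found to be exactly the same, the 4th storage path is the path to be retrieved.
The fingerprint database gives a preliminary screening of the hash value to be retrieved that identifies matching fragment features, and the matching fragment feature then guides a finer screening of the hash value to be retrieved, which determines that the 4th storage path is the path to be retrieved. This speeds up processing of the path to be retrieved and improves the efficiency of metadata retrieval.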
返回图13,在步骤S1350,从待检索路径对应的存储地址获得元数据。
通过待检索路径对应的存储地址,可准确的确定元数据对应的存储位置,再从待检索路径对应的存储地址获得元数据,能够保证该元数据即为要查找的元数据,保证元数据的准确性。
在本申请中,通过二级哈希索引模块1020对待检索哈希值进行分析,确定待检索路径位于第i个节点的哈希表中的第a项存储项中,即对待检索哈希值进行精细化分析,能够明显减少哈希冲突,以十个节点为例,若采用本申请中的检索方法能够产生哈希冲突的概率为第一概率;而使用现有技术中的采用一张哈希表对多个节点中的元数据进行检索所产生的冲突哈希冲突的概率为第二概率,则第一概率仅为第二概率的10%,极大的降低了哈希冲突。将待检索路径哈希值与节点0的哈希表中的第9项存储项中的各路径组中的指纹库进行比对,确定待检索路径对应的存储地址,采用路径组的存储方式,可分步骤的缩小检索范围,提升对元数据的检索速度。
在一个示例性实施例中,步骤S1340可采用如下方式获得待检索路径对应的存储地址。图15示出本申请的获得待检索路径对应的 存储地址的方法的流程示意图。如图15所示,步骤S1340可以包括如下步骤S1341至S1346。
在步骤S1341,读取节点0的哈希表中的第9项中路径组的数量(GroupNum)。
在步骤S1342,提取路径组k中的指纹库,获得第k个指纹库。
其中,k为大于或等于0,且小于或等于2的整数。
例如,表2示出节点0的哈希表中的第9项中的路径组1对应的第1个指纹库的结构。表3示出节点0的哈希表的第9项中的路径组2对应的第2个指纹库的结构。
表2 节点0的哈希表中的第9项中的路径组1对应的第1个指纹库的结构
a415 | c811 | 6666 | 9988 | a415 | 7fd5 | 1f05 | 7dfd | 183d | 60f7 | 0f70 | ffff | c1ee | 87ff | 387f | 0a0f |
Table 3 Structure of the 2nd fingerprint library, corresponding to path group 2 in the 9th item of the hash table of node 0
6664 | 6665 | ff32 | 6667 | 6668 | 6669 | 666a | 666b | 666c | 666d | 0000 | 666f | 6670 | 6671 | 6672 | 6673 |
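In step S1343, a bitwise XOR is performed on the k-th fingerprint library and the hash value of the path to be retrieved to obtain the k-th XOR result.
The k-th XOR result can be represented as a 256-bit binary number.
For example, Table 4 shows the 1st XOR result, and Table 5 shows the 2nd XOR result.
Table 4 The 1st XOR result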
5b2f | d9ed | 9954 | 66bb | 5b21 | 80e0 | e033 | 82ca | e705 | 9fce | f04a | 00c4 | 3ed2 | 78c2 | c76f | 0a06 |
Table 5 The 2nd XOR result
995e | 7799 | 0000 | 9954 | 995c | 995c | 995c | 995c | 9954 | 9954 | ff3a | 9954 | 994c | 994c | 9962 | 667a |
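In step S1344, it is determined whether any of the 16 cells of the k-th XOR result consists entirely of 0 bits.
As shown in Table 4, the 1st XOR result contains no all-zero cell. As shown in Table 5, the 3rd cell of the 2nd XOR result is all zeros.
If an all-zero cell exists among the 16 cells of the k-th XOR result, step S1345 is executed. If no all-zero cell exists among the 16 cells of the k-th XOR result, step S1342 is executed again (for the next path group).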
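The cell-wise XOR check of steps S1343 and S1344 can be reproduced directly from Tables 1 to 3. The sketch below is illustrative only; the helper names `cells` and `matching_cells` are hypothetical.

```python
def cells(hex_row: str) -> list:
    """Split a row of 16 four-hex-digit cells into a list of 16-bit integers."""
    return [int(c, 16) for c in hex_row.replace("|", " ").split()]


def matching_cells(fingerprint_row: str, query_row: str) -> list:
    """Return the 1-based positions of cells whose XOR result is all zeros."""
    xor = [f ^ q for f, q in zip(cells(fingerprint_row), cells(query_row))]
    return [i + 1 for i, x in enumerate(xor) if x == 0]


TABLE_1 = "ff3a | 11fc | ff32 | ff33 | ff34 | ff35 | ff36 | ff37 | ff38 | ff39 | ff3a | ff3b | ff3c | ff3d | ff10 | 0009"
TABLE_2 = "a415 | c811 | 6666 | 9988 | a415 | 7fd5 | 1f05 | 7dfd | 183d | 60f7 | 0f70 | ffff | c1ee | 87ff | 387f | 0a0f"
TABLE_3 = "6664 | 6665 | ff32 | 6667 | 6668 | 6669 | 666a | 666b | 666c | 666d | 0000 | 666f | 6670 | 6671 | 6672 | 6673"

print(matching_cells(TABLE_2, TABLE_1))  # []
print(matching_cells(TABLE_3, TABLE_1))  # [3]
```

An all-zero cell only marks a candidate: a 16-bit fingerprint can match by coincidence, which is why the full 256-bit comparison of step S1345 follows.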
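In step S1345, the storage path hash value corresponding to the all-zero cell of the k-th XOR result is extracted and compared with the hash value of the path to be retrieved to check whether the two are identical.
For example, Table 6 shows the storage path hash value corresponding to the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0.
Table 6 Storage path hash value corresponding to the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0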
ff3a | 11fc | ff32 | ff33 | ff34 | ff35 | ff36 | ff37 | ff38 | ff39 | ff3a | ff3b | ff3c | ff3d | ff10 | 0009 |
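The storage path hash value in Table 6 is compared with the hash value of the path to be retrieved in Table 1. If the two hash values are identical, step S1346 is executed. If they differ, step S1342 is executed again (for the next path group).
In step S1346, the storage address corresponding to the path to be retrieved is obtained.
The comparison in step S1345 shows that the hash value of the path to be retrieved (the hash value in Table 1) is identical to the storage path hash value of the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0 (the hash value in Table 6). The path to be retrieved has therefore been found: it is that 3rd storage path, and its storage address is the storage address corresponding to that storage path.
In the present application, the path groups in the 9th item of the hash table of node 0 are searched in turn. The fingerprint library of each path group is used for a preliminary screening of the to-be-retrieved hash value. The storage path hash value corresponding to an all-zero cell of the k-th XOR result is then extracted and compared precisely with the hash value of the path to be retrieved. This speeds up retrieval of the storage path corresponding to the metadata.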
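Steps S1342 to S1346 together form a loop over the path groups of one storage item. The following is a hedged sketch of that loop, assuming each path group is represented by a plain dictionary with hypothetical keys `fingerprint` (a 256-bit integer) and `hashes` (a list of 16 stored hash values or `None`); the description does not prescribe this representation.

```python
from typing import List, Optional, Tuple

CELLS = 16      # storage paths (and fingerprint cells) per path group
CELL_BITS = 16  # width of one fingerprint cell in bits


def from_row(row: str) -> int:
    """Parse a table row of hex cells into one 256-bit integer."""
    return int(row.replace("|", "").replace(" ", ""), 16)


def cell(value: int, pos: int) -> int:
    """Return the pos-th (1-based, from the left) 16-bit cell of a 256-bit value."""
    return (value >> ((CELLS - pos) * CELL_BITS)) & 0xFFFF


def find_path(groups: List[dict], query: int) -> Optional[Tuple[int, int]]:
    """Return (group number, path number), both 1-based, or None if not found."""
    for k, group in enumerate(groups, start=1):
        xor = group["fingerprint"] ^ query         # step S1343
        for pos in range(1, CELLS + 1):
            if cell(xor, pos) != 0:                # step S1344: not a candidate
                continue
            if group["hashes"][pos - 1] == query:  # step S1345: full comparison
                return k, pos                      # step S1346
    return None


TABLE_1 = "ff3a 11fc ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff10 0009"
TABLE_2 = "a415 c811 6666 9988 a415 7fd5 1f05 7dfd 183d 60f7 0f70 ffff c1ee 87ff 387f 0a0f"
TABLE_3 = "6664 6665 ff32 6667 6668 6669 666a 666b 666c 666d 0000 666f 6670 6671 6672 6673"

query = from_row(TABLE_1)
group_1 = {"fingerprint": from_row(TABLE_2), "hashes": [None] * CELLS}
group_2 = {"fingerprint": from_row(TABLE_3), "hashes": [None] * CELLS}
group_2["hashes"][2] = from_row(TABLE_1)  # Table 6: stored hash of the 3rd path

print(find_path([group_1, group_2], query))  # (2, 3)
```

Only slots whose fingerprint cell matches trigger a full 256-bit comparison, so most of the 16 stored hash values in a group are never read.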
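In an exemplary embodiment, after the metadata is obtained from the storage address corresponding to the path to be retrieved, the retrieval method further includes: obtaining information to be modified (for example, that the last access time of a file is 12:01 on January 1, 2019); and modifying, according to the information to be modified, the metadata in the storage address corresponding to the path to be retrieved (for example, updating the last access time of a file in that storage address, such as the metadata corresponding to the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0).
Modifying the metadata in the storage address corresponding to the path to be retrieved according to the information to be modified keeps the metadata updated in real time and ensures that the metadata a user obtains is accurate.
Figure 16 is a schematic flowchart of the method for deleting metadata in the persistent memory file system provided by the present application. As shown in Figure 16, in an exemplary embodiment, the method for deleting metadata in the persistent memory file system includes the following steps S1601 to S1606.
In step S1601, a retrieval instruction for metadata is obtained.
In step S1602, the path module 1010 applies the SHA256 algorithm to the path to be retrieved to obtain the hash value of the path to be retrieved.
In step S1603, the two-level hash index module 1020 analyzes the to-be-retrieved hash value and determines that the path to be retrieved is located in the a-th storage item of the hash table of the i-th node.
In step S1604, the hash value of the path to be retrieved is compared with the fingerprint library of each path group in the 9th storage item of the hash table of node 0 to determine the storage address corresponding to the path to be retrieved.
It should be noted that steps S1601 to S1604 are the same as steps S1310 to S1340 described with reference to Figure 13 and are not repeated here.
In step S1605, the storage path hash value corresponding to the path to be retrieved, and the metadata corresponding to that storage path hash value, are deleted.
For example, if step S1604 determines that the path to be retrieved is the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0, the storage path hash value corresponding to that 3rd storage path is deleted, and the metadata in the storage address corresponding to that 3rd storage path is deleted.
In step S1606, the bitmap and the fingerprint library of the path group corresponding to the path to be retrieved are modified.
For example, if the path group corresponding to the path to be retrieved is the 2nd path group of the 9th item of the hash table of node 0, the bitmap and the fingerprint library of that 2nd path group are modified.
In an exemplary embodiment, the 3rd bit of the bitmap of the 2nd path group can be set to "0" to indicate that the 3rd storage path of the 2nd path group of the 9th item of the hash table of node 0 is a free path. For example, if the bitmap of the 2nd path group is 0xffdf (1111111111011111 in binary), the modified bitmap of the 2nd path group of the 9th item of the hash table of node 0 is 0xdfdf (1101111111011111 in binary).
At the same time, the 3rd cell of the fingerprint library is cleared (that is, bits 32 to 47 of the fingerprint library of the 2nd path group are set to "0"), which yields the updated fingerprint library of the 2nd path group of the 9th item of the hash table of node 0. Table 7 shows this updated fingerprint library.
Table 7 Updated fingerprint library of the 2nd path group of the 9th item of the hash table of node 0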
6664 | 6665 | 0000 | 6667 | 6668 | 6669 | 666a | 666b | 666c | 666d | 0000 | 666f | 6670 | 6671 | 6672 | 6673 |
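The bitmap and fingerprint update of step S1606 amounts to clearing one bit and one 16-bit cell. The sketch below covers only that update (deleting the stored hash value and the metadata is not shown); the function name `clear_slot` is hypothetical, and positions are counted from the left, as in the example above.

```python
CELLS = 16      # storage paths per path group
CELL_BITS = 16  # width of one fingerprint cell in bits


def clear_slot(bitmap: int, fingerprint: int, pos: int) -> tuple:
    """Mark the pos-th (1-based, from the left) storage path as free.

    Returns the updated (bitmap, fingerprint) pair.
    """
    bit = 1 << (CELLS - pos)                           # bitmap bit of this path
    cell_mask = 0xFFFF << ((CELLS - pos) * CELL_BITS)  # fingerprint cell of this path
    return bitmap & ~bit, fingerprint & ~cell_mask


TABLE_3 = "6664 6665 ff32 6667 6668 6669 666a 666b 666c 666d 0000 666f 6670 6671 6672 6673"
TABLE_7 = "6664 6665 0000 6667 6668 6669 666a 666b 666c 666d 0000 666f 6670 6671 6672 6673"

fingerprint = int(TABLE_3.replace(" ", ""), 16)
new_bitmap, new_fingerprint = clear_slot(0xFFDF, fingerprint, 3)
print(hex(new_bitmap))                                         # 0xdfdf
print(new_fingerprint == int(TABLE_7.replace(" ", ""), 16))    # True
```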
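In the present application, applying the SHA256 algorithm to the path to be retrieved to obtain its hash value improves the efficiency of processing the path to be retrieved. Using the two-level hash index module to analyze the to-be-retrieved hash value and determine that the path to be retrieved is located in the a-th storage item of the hash table of the i-th node reduces hash collisions. Comparing the hash value of the path to be retrieved with the fingerprint library of each path group in the 9th storage item of the hash table of node 0 to determine the corresponding storage address speeds up metadata retrieval and improves retrieval efficiency. Deleting the metadata in the storage address corresponding to the path to be retrieved, together with its storage path hash value, yields the path group after deletion, prevents unneeded metadata from occupying storage resources, and improves the utilization of storage resources. Modifying the bitmap and the fingerprint library of the path group corresponding to the path to be retrieved prevents errors in the stored information and ensures that subsequent metadata processing can obtain the desired metadata and its storage path quickly and accurately.
Figure 17 is a schematic flowchart of the method for adding metadata in the persistent memory file system provided by the present application. As shown in Figure 17, in an exemplary embodiment, the method for adding metadata in the persistent memory file system includes the following steps S1701 to S1707.
In step S1701, an add instruction for metadata is obtained.
The add instruction includes the path to be stored. The metadata stored under that path may be file metadata or directory metadata. For example, the path to be stored is "/test/fs_for_install/dpfs-2.0/fs/dpfs/scripts/local".
In step S1702, the path module 1010 applies the SHA256 algorithm to the path to be stored to obtain the hash value of the path to be stored.
The hash value of the path to be stored can be represented as a 256-bit binary number. For readability, the 256-bit binary number can be written in hexadecimal. Table 8 shows the structure of the hash value of the path to be stored. As shown in Table 8, the hash value consists of 16 cells, each representing 4 hexadecimal digits.
Table 8 Structure of the hash value of the path to be stored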
ff2f | ff30 | ff32 | ff33 | ff34 | ff35 | ff36 | ff37 | ff38 | ff39 | ff3a | ff3b | ff3c | ff3d | ff30 | 0009 |
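In step S1703, the two-level hash index module 1020 analyzes the to-be-stored hash value and determines that the path to be stored is located in the a-th storage item of the hash table of the i-th node.
For example, suppose the persistent memory file system includes 4 preset nodes, in other words, the number of preset nodes in the persistent memory file system is 4. The two-level hash index module 1020 first takes the leading 32 bits of the to-be-stored hash value (for example, 0xff2fff30), converts them to a decimal number (4281335600), and takes the remainder of that number (4281335600) divided by the number of preset nodes (4). The remainder i equals 0, so the path to be stored is located in node 0.
Further, the two-level hash index module 1020 takes the last 20 bits of the to-be-stored hash value (for example, 0x00009) and converts them to a decimal number a (for example, a equals 9), and thereby determines that the path to be stored is located in the 9th storage item of the hash table of node 0.
It should be noted that, in the prior art, as shown in Figure 1, metadata is stored with multiple nodes sharing one hash table. During metadata retrieval, because different sections of the one hash table correspond to different nodes, the computed to-be-retrieved hash value may correspond to multiple storage paths, and further searching those storage paths across different nodes is inefficient. In the present application, each node stores its own hash table, which avoids frequent hash collisions, and the two-level hash index module 1020 processes the to-be-stored hash value, which improves the retrieval efficiency for the path to be stored so that the corresponding metadata can be obtained efficiently.
In step S1704, whether a free path exists in the a-th storage item of the hash table of the i-th node is determined according to the bitmap information of each path group of the a-th storage item.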
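If a free path exists in the a-th storage item of the hash table of the i-th node, step S1706 is executed. If no free path exists in the a-th storage item of the hash table of the i-th node, step S1705 is executed.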
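Here, each path group of the a-th storage item has its own bitmap information, and the bitmap information differs from one path group to another.
For example, if the bitmap information of the 1st path group of the a-th storage item is 0xffff (1111111111111111 in binary), there is no free path in the 1st path group of the a-th storage item. If the bitmap information of the 2nd path group of the a-th storage item is 0xffdf (1111 1111 1101 1111 in binary), the 11th storage path in the 2nd path group of the a-th storage item is a free path, and step S1706 is executed.
As another example, if traversing the bitmap information of every path group of the a-th storage item finds no free path, step S1705 needs to be executed.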
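The bitmap scan of step S1704 can be sketched as below. The helper names `first_free_path` and `find_free` are hypothetical, and bits are counted from the left to match the 0xffdf example, in which the 11th storage path is free.

```python
from typing import List, Optional, Tuple

CELLS = 16  # storage paths per path group


def first_free_path(bitmap: int) -> Optional[int]:
    """Return the 1-based position of the first 0 bit, counting from the left."""
    for pos in range(1, CELLS + 1):
        if not bitmap & (1 << (CELLS - pos)):
            return pos
    return None


def find_free(bitmaps: List[int]) -> Optional[Tuple[int, int]]:
    """Return (group number, path number) of the first free path, or None."""
    for k, bitmap in enumerate(bitmaps, start=1):
        pos = first_free_path(bitmap)
        if pos is not None:
            return k, pos
    return None


print(find_free([0xFFFF, 0xFFDF]))  # (2, 11)
print(find_free([0xFFFF, 0xFFFF]))  # None
```

A result of `None` corresponds to the case in which a new path group has to be created (step S1705).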
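In step S1705, a new path group is created in the a-th storage item, the first storage path of the new path group is taken as the free path, and the number of path groups of the a-th storage item is increased by 1.
In step S1706, the bitmap information and the fingerprint library of the path group in the a-th storage item that contains the free path are updated.
For example, the bitmap information of the 2nd path group of the a-th storage item is updated from 0xffdf to 0xffff, that is, the 11th bit is set to "1". At the same time, the 11th cell of the fingerprint library of the 2nd path group of the a-th storage item is updated (that is, bits 160 to 175 of the to-be-stored hash value are written into bits 160 to 175 of the fingerprint library).
It should be noted that, in the prior art, the storage paths in a storage item of the same hash table are organized as a linked list. A linked list is a storage structure that is non-contiguous and non-sequential in physical storage units, and the logical order of the metadata stored in it is realized through the order in which the pointers in the list are linked. To find a given storage path, each entry of the linked list must be examined in turn from the head until the required storage path is found, so lookup efficiency is low.
In the present application, by contrast, multiple path groups are set up in a storage item, and each path group includes a bitmap and a fingerprint library. The bitmap quickly reveals the storage status of each storage path in the path group (for example, whether a storage path is a free path), and the fingerprint library records features of the storage path hash values in the path group, so that the to-be-retrieved hash value corresponding to the path to be retrieved can be found quickly, which effectively increases retrieval speed.
In step S1707, the metadata corresponding to the path to be stored is stored at the free storage address corresponding to the free path in the path group of the a-th storage item.
For example, the metadata corresponding to the path to be stored is stored at the free storage address corresponding to the 11th storage path of the 2nd path group of the 9th item of the hash table of node 0.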
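The bitmap and fingerprint update of step S1706 can be checked against Tables 3 and 8. This sketch covers only that update; writing the hash value and the metadata to the free storage address is not shown, and the function name `occupy_slot` is hypothetical.

```python
CELLS = 16      # storage paths per path group
CELL_BITS = 16  # width of one fingerprint cell in bits


def occupy_slot(bitmap: int, fingerprint: int, pos: int, new_hash: int) -> tuple:
    """Claim the pos-th (1-based, from the left) storage path for new_hash.

    Returns the updated (bitmap, fingerprint) pair.
    """
    cell_mask = 0xFFFF << ((CELLS - pos) * CELL_BITS)
    # Copy the pos-th cell of the hash into the pos-th fingerprint cell.
    fingerprint = (fingerprint & ~cell_mask) | (new_hash & cell_mask)
    bitmap |= 1 << (CELLS - pos)  # mark the path as occupied
    return bitmap, fingerprint


TABLE_3 = "6664 6665 ff32 6667 6668 6669 666a 666b 666c 666d 0000 666f 6670 6671 6672 6673"
TABLE_8 = "ff2f ff30 ff32 ff33 ff34 ff35 ff36 ff37 ff38 ff39 ff3a ff3b ff3c ff3d ff30 0009"

old_fingerprint = int(TABLE_3.replace(" ", ""), 16)
to_store = int(TABLE_8.replace(" ", ""), 16)
bitmap, fingerprint = occupy_slot(0xFFDF, old_fingerprint, 11, to_store)

print(hex(bitmap))                         # 0xffff
print(format(fingerprint, "064x")[40:44])  # ff3a
```

Hex digits 40 to 43 of the fingerprint are bits 160 to 175, so the 11th cell now holds `ff3a`, the 11th cell of the Table 8 hash value.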
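In the present application, determining from the bitmap information whether a free path exists in a given storage item of a given node's hash table makes it possible to locate the free path quickly and speeds up storage of the metadata to be stored. Further, storing the metadata to be stored at the free storage address corresponding to the free path, and updating the bitmap information and the fingerprint library of the path group in the a-th storage item that contains the free path, avoids the problem of the metadata being impossible to find after it has been stored in the persistent memory file system. The updated bitmap information and fingerprint library make it possible to locate the storage path of the stored metadata quickly, which speeds up its lookup.
It should be made clear that the present application is not limited to the specific configurations and processes described above and shown in the figures. For convenience and brevity of description, detailed descriptions of known methods are omitted here, and for the specific working processes of the systems, modules, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Figure 18 is a structural diagram of an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for retrieving metadata of a persistent memory file system provided by the present application.
As shown in Figure 18, the computing device 1800 includes an input device 1801, an input interface 1802, a central processing unit 1803, a memory 1804, an output interface 1805, and an output device 1806. The input interface 1802, the central processing unit 1803, the memory 1804, and the output interface 1805 are connected to one another through a bus 1807. The input device 1801 and the output device 1806 are connected to the bus 1807 through the input interface 1802 and the output interface 1805, respectively, and thereby to the other components of the computing device 1800.
Illustratively, the input device 1801 receives input information from outside and transmits it to the central processing unit 1803 through the input interface 1802. The central processing unit 1803 processes the input information based on computer-executable instructions stored in the memory 1804 to generate output information, stores the output information temporarily or permanently in the memory 1804, and then transmits it to the output device 1806 through the output interface 1805. The output device 1806 outputs the output information outside the computing device 1800 for use by a user.
In one embodiment, the computing device shown in Figure 18 may be implemented as an electronic device, which may include: a memory configured to store a program; and a processor configured to run the program stored in the memory to execute the method for retrieving metadata of a persistent memory file system described herein.
In one embodiment, the computing device shown in Figure 18 may be implemented as a system for retrieving metadata of a persistent memory file system, which may include: a memory configured to store a program; and a processor configured to run the program stored in the memory to execute the method for retrieving metadata of a persistent memory file system described herein.
The above are merely exemplary embodiments of the present application and are not intended to limit its scope of protection. In general, the various embodiments of the present application may be implemented in hardware or special-purpose circuits, software, logic, or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software executable by a controller, microprocessor, or other computing device, although the present application is not limited thereto.
Embodiments of the present application may be implemented by a data processor of a mobile device executing computer program instructions, for example in a processor entity, or by hardware, or by a combination of software and hardware. The computer program instructions may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages.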
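A block diagram of any logic flow in the drawings of the present application may represent program steps, or interconnected logic circuits, modules, and functions, or a combination of program steps and logic circuits, modules, and functions. The computer program may be stored in a memory. The memory may be of any type suitable for the local technical environment and may be implemented using any suitable data storage technology, such as, but not limited to, read-only memory (ROM), random access memory (RAM), and optical storage devices and systems (digital versatile discs (DVD) or compact discs (CD)). The computer-readable medium may include a non-transitory storage medium. The data processor may be of any type suitable for the local technical environment, such as, but not limited to, a general-purpose computer, a special-purpose computer, a microprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a programmable logic device (FPGA), and a processor based on a multi-core processor architecture.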
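A detailed description of exemplary embodiments of the present application has been provided above by way of exemplary and non-limiting examples. Considered in conjunction with the drawings and the claims, however, various modifications and adaptations of the above embodiments will be apparent to those skilled in the art without departing from the scope of the present application. Accordingly, the proper scope of the present application is to be determined by the claims.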
Claims (18)
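- A method for retrieving metadata of a persistent memory file system, comprising: determining a to-be-retrieved hash value based on a path to be retrieved, the path to be retrieved being configured to look up a storage path of metadata; determining, according to the to-be-retrieved hash value, a fingerprint library, and a storage path hash value, a storage address corresponding to the path to be retrieved, the fingerprint library being configured to record features of the storage path hash value, and the storage path hash value being the hash value corresponding to the storage path of the metadata; and obtaining the metadata from the storage address corresponding to the path to be retrieved.
- The method according to claim 1, wherein after determining the to-be-retrieved hash value based on the path to be retrieved, and before determining the storage address corresponding to the path to be retrieved according to the to-be-retrieved hash value, the fingerprint library, and the storage path hash value, the method further comprises: determining a node to be retrieved according to the to-be-retrieved hash value and the number of preset nodes, the node to be retrieved comprising a storage item, the storage item comprising at least one path group, and the path group comprising the fingerprint library; determining, according to the to-be-retrieved hash value, the storage item corresponding to the path to be retrieved; and obtaining the fingerprint library corresponding to each path group in the storage item.
- The method according to claim 2, wherein determining the node to be retrieved according to the to-be-retrieved hash value and the number of preset nodes comprises: obtaining a node information value based on the to-be-retrieved hash value; and taking the remainder of the node information value divided by the number of preset nodes to determine the node to be retrieved.
- The method according to claim 2, wherein determining the storage address corresponding to the path to be retrieved according to the to-be-retrieved hash value, the fingerprint library, and the storage path hash value comprises: determining a candidate storage path according to the to-be-retrieved hash value and the fingerprint library corresponding to a path group in the storage item; and determining the storage address corresponding to the path to be retrieved according to the hash value of the candidate storage path and the to-be-retrieved hash value.
- The method according to claim 4, wherein determining the candidate storage path according to the to-be-retrieved hash value and the fingerprint library corresponding to the path group in the storage item comprises: performing XOR processing on the to-be-retrieved hash value and the fingerprint library corresponding to the path group in the storage item to determine the candidate storage path.
- The method according to claim 4, wherein determining the storage address corresponding to the path to be retrieved according to the hash value of the candidate storage path and the to-be-retrieved hash value comprises: comparing the to-be-retrieved hash value with the hash value of the candidate storage path to determine the storage address corresponding to the path to be retrieved.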
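- The method according to claim 4, wherein the path group further comprises bitmap information, and after determining the candidate storage path according to the to-be-retrieved hash value and the fingerprint library corresponding to the path group in the storage item, the method further comprises: obtaining, according to the bitmap information, a free path among the candidate storage paths, the free path corresponding to a free storage address, and the storage state of the free storage address being a free state.
- The method according to claim 7, wherein obtaining the free path among the candidate storage paths according to the bitmap information comprises: obtaining the free path if it is determined that the free path exists among the candidate storage paths; and generating a new path group and obtaining the free path from the new path group if it is determined that no free path exists among the candidate storage paths.
- The method according to claim 7, wherein after obtaining the free path among the candidate storage paths according to the bitmap information, the method further comprises: obtaining metadata to be stored; and storing the metadata to be stored at the free storage address corresponding to the free path.
- The method according to claim 9, wherein the storage item further comprises a number of path groups, and after storing the metadata to be stored at the free storage address corresponding to the free path, the method further comprises: updating the number of path groups in the storage item, the bitmap information in the path group corresponding to the free storage address, and the fingerprint library in the path group corresponding to the free storage address.
- The method according to claim 7, wherein after obtaining the metadata from the storage address corresponding to the path to be retrieved, the method further comprises: deleting the metadata in the storage address corresponding to the path to be retrieved and its corresponding storage path hash value to obtain a path group after deletion; and updating the fingerprint library and the bitmap information corresponding to the path group after deletion.
- The method according to claim 7, wherein after obtaining the metadata from the storage address corresponding to the path to be retrieved, the method further comprises: obtaining information to be modified; and modifying, according to the information to be modified, the metadata in the storage address corresponding to the path to be retrieved.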
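- An apparatus for retrieving metadata of a persistent memory file system, comprising: a hash value determination module configured to determine a to-be-retrieved hash value based on a path to be retrieved, the path to be retrieved being configured to look up a storage path of metadata; an address determination module configured to determine, according to the to-be-retrieved hash value, a fingerprint library, and a storage path hash value, a storage address corresponding to the path to be retrieved, the fingerprint library being configured to record features of the storage path hash value, and the storage path hash value being the hash value corresponding to the storage path of the metadata; and an obtaining module configured to obtain the metadata from the storage address corresponding to the path to be retrieved.
- A storage structure for metadata, wherein the storage structure is applied to the method for retrieving metadata of a persistent memory file system according to any one of claims 1 to 12, and the storage structure comprises: a storage address data bit configured to record a storage address, the storage address being configured to store metadata and corresponding to a storage path; a storage path hash value data bit configured to record a storage path hash value, the storage path hash value being a hash value determined based on the storage path and being configured to record features of the storage path; and a fingerprint library data bit configured to record a fingerprint library, the fingerprint library being configured to record features of the storage path hash value.
- The storage structure according to claim 14, further comprising: a storage node, the storage node comprising a storage item, the storage item comprising at least one path group, and each path group comprising the fingerprint library data bit.
- The storage structure according to claim 15, wherein the path group further comprises: a bitmap information data bit configured to record bitmap information, the bitmap information being configured to identify whether a free path exists in the path group.
- An electronic device, comprising: one or more processors; and a memory storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for retrieving metadata of a persistent memory file system according to any one of claims 1 to 12.
- A readable storage medium, wherein the readable storage medium stores a computer program which, when executed by a processor, implements the method for retrieving metadata of a persistent memory file system according to any one of claims 1 to 12.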
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110310068.XA CN113010477B (zh) | 2021-03-23 | 2021-03-23 | 持久内存文件系统元数据的检索方法和装置、存储结构 |
CN202110310068.X | 2021-03-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022199400A1 true WO2022199400A1 (zh) | 2022-09-29 |
Family
ID=76405640
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/080367 WO2022199400A1 (zh) | 2021-03-23 | 2022-03-11 | 持久内存文件系统元数据的检索方法和装置、存储结构 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113010477B (zh) |
WO (1) | WO2022199400A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113010477B (zh) * | 2021-03-23 | 2023-09-12 | 中兴通讯股份有限公司 | 持久内存文件系统元数据的检索方法和装置、存储结构 |
CN113596098B (zh) * | 2021-07-01 | 2023-04-25 | 杭州迪普科技股份有限公司 | 会话检索方法、装置、设备及计算机可读存储介质 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013210698A (ja) * | 2012-03-30 | 2013-10-10 | Hitachi Solutions Ltd | ファイル検索システム及びプログラム |
CN107862064A (zh) * | 2017-11-16 | 2018-03-30 | 北京航空航天大学 | 一个基于nvm的高性能、可扩展的轻量级文件系统 |
CN109446160A (zh) * | 2018-11-06 | 2019-03-08 | 郑州云海信息技术有限公司 | 一种文件读取方法、系统、装置及计算机可读存储介质 |
CN111125049A (zh) * | 2019-12-24 | 2020-05-08 | 上海交通大学 | 基于rdma与非易失内存的分布式文件数据块读写方法及系统 |
CN111221776A (zh) * | 2019-12-30 | 2020-06-02 | 上海交通大学 | 面向非易失性内存的文件系统的实现方法、系统及介质 |
CN113010477A (zh) * | 2021-03-23 | 2021-06-22 | 中兴通讯股份有限公司 | 持久内存文件系统元数据的检索方法和装置、存储结构 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102682116B (zh) * | 2012-05-14 | 2014-06-11 | 中兴通讯股份有限公司 | 基于哈希表的表项处理方法及其装置 |
-
2021
- 2021-03-23 CN CN202110310068.XA patent/CN113010477B/zh active Active
-
2022
- 2022-03-11 WO PCT/CN2022/080367 patent/WO2022199400A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN113010477A (zh) | 2021-06-22 |
CN113010477B (zh) | 2023-09-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22774062 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 23/01/2024) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22774062 Country of ref document: EP Kind code of ref document: A1 |