CN111427862B - Metadata management method for distributed file system in power grid dispatching control system - Google Patents


Publication number
CN111427862B
Authority
CN
China
Prior art keywords
file
data
basic attribute
file name
class data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010196756.3A
Other languages
Chinese (zh)
Other versions
CN111427862A (en
Inventor
雷宝龙
张凯
郭海龙
葛以踊
陈鹏
万书鹏
管荑
彭晖
翟明玉
陆居福
孙卫芳
李慧聪
张强
耿玉杰
马强
刘彤
易强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Shandong Electric Power Co Ltd
NARI Group Corp
Nari Technology Co Ltd
NARI Nanjing Control System Co Ltd
State Grid Electric Power Research Institute
Original Assignee
State Grid Corp of China SGCC
State Grid Shandong Electric Power Co Ltd
NARI Group Corp
Nari Technology Co Ltd
NARI Nanjing Control System Co Ltd
State Grid Electric Power Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Shandong Electric Power Co Ltd, NARI Group Corp, Nari Technology Co Ltd, NARI Nanjing Control System Co Ltd, State Grid Electric Power Research Institute
Priority to CN202010196756.3A
Publication of CN111427862A
Application granted
Publication of CN111427862B
Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G06F16/152 File search processing using file content signatures, e.g. hash values

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a metadata management method for a distributed file system in a power grid dispatching control system. The method divides the metadata of the distributed file system into several classes of data, such as basic attributes and file names, uses a linked list to manage all files and subdirectories under a directory so that the hierarchical relationship of the file system is maintained, and establishes a mapping relationship between file names and file basic attributes so that the basic attribute data of a file can be located quickly from its complete path. When a file is accessed, the lookup completes in O(1) complexity, which improves file metadata access performance. The method thereby improves distributed file access performance in the power grid dispatching control system, reduces both the time consumed by data access and the memory occupied by metadata, meets the real-time file access requirements of the power grid dispatching control system, and is well suited to environments with limited memory resources.

Description

Metadata management method for distributed file system in power grid dispatching control system
Technical Field
The invention relates to the technical field of data processing in a power grid dispatching control system, in particular to a metadata flattening management method for a distributed file system in the power grid dispatching control system.
Background
The distributed file system is a shared file system which connects scattered storage nodes to form a capacity far exceeding that of a single storage node, allows file data and storage space stored on a plurality of nodes to be shared through a network, greatly improves storage capacity and file access throughput, and has high reliability and elastic expansion capacity.
The power grid dispatching control automation system uses a distributed file system to store graphic files, power grid section files, report files, operation mode data and the like. The distributed file system mainly comprises a metadata management part and a file data management part. The metadata management part manages basic attributes such as file name, creation time and file size, as well as information such as file storage locations and the file hierarchy. Metadata is the system data that describes the characteristics of a file, and it is the access entry point of the whole distributed file system.
The existing open-source distributed file system HDFS manages metadata in a hierarchical manner. Each file or directory occupies one metadata storage unit. Each file storage unit contains basic attributes such as file name, file size, creation time and modification time, with the file name stored in a fixed-length array. Each directory storage unit contains an ordered array that stores the metadata of all files and directories under that directory. With this approach, finding a file requires searching level by level from the root directory, using a binary search at each level to find the corresponding subdirectory until the target file or directory is found, as shown in fig. 1. Assuming the time complexity of the search at each level is O(log₂ n), the total time complexity for a file located m levels deep is m × O(log₂ n).
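The per-level cost described above can be illustrated with a minimal Python sketch. It is not HDFS code: the `Dir` class and the `lookup` function are hypothetical names, and each directory simply keeps its children in a name-sorted list so that one binary search is needed per path component.

```python
from bisect import bisect_left

class Dir:
    """Hypothetical directory node whose children are kept sorted by name."""
    def __init__(self, name):
        self.name = name
        self.names = []      # sorted child names
        self.children = []   # child nodes, parallel to self.names

    def add(self, node):
        i = bisect_left(self.names, node.name)
        self.names.insert(i, node.name)
        self.children.insert(i, node)
        return node

    def find(self, name):
        # One binary search per directory level: O(log2 n) for n entries.
        i = bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            return self.children[i]
        return None

def lookup(root, path):
    """Walk the path level by level; returns (node, levels_searched)."""
    node, levels = root, 0
    for part in path.strip("/").split("/"):
        node = node.find(part)
        levels += 1
        if node is None:
            return None, levels
    return node, levels

root = Dir("/")
d = root.add(Dir("dir"))
s = d.add(Dir("sub"))
f = s.add(Dir("file.dat"))   # leaf modelled as an empty Dir for brevity
node, levels = lookup(root, "/dir/sub/file.dat")
```

Here a file three levels deep needs three binary searches, which is the m × O(log₂ n) behaviour the text attributes to hierarchical lookup.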
The power grid dispatching control system places very high demands on real-time performance and stability. The existing open-source distributed file system HDFS, however, keeps all metadata of the whole file system in memory and was not designed with the power grid dispatching control scenario in mind. It has the following defects: first, searching for files level by level takes a long time, and when files are deeply nested and numerous, metadata lookup consumes a large amount of time, which reduces system concurrency and increases file access latency; second, because all metadata is kept in memory, a large number of files occupies a large amount of memory and affects the operation of other functional modules in the system.
Disclosure of Invention
The invention aims to provide a metadata flattening management method for a distributed file system in a power grid dispatching control system, which reduces the time consumption of data access and excessive occupation of metadata on a memory and improves the real-time performance and reliability of the power grid dispatching control system.
The technical scheme adopted by the invention is as follows: a metadata management method for a distributed file system in a power grid dispatching control system comprises the following steps:
dividing metadata into a plurality of types of data at least comprising a file name type and a basic attribute type, respectively storing each type of data in a plurality of data blocks, and establishing file mapping in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
responding to a received external metadata file creating request, acquiring file path information in the request, searching a linked list sequence number of a parent directory of a file in basic attribute data according to the acquired file path information, storing the basic attribute data of a metadata file to be created in the basic attribute data under the corresponding linked list sequence number, creating complete path data of the metadata file in file name data, and mapping and storing a storage position of the metadata file in the file name data and a storage position in the basic attribute data to a mapping relation linked list;
responding to the received external metadata file access request, and acquiring a file name in the external metadata file access request; and determining corresponding basic attribute information according to the mapping relation between the file name class data and the basic attribute class data corresponding to the file name in the mapping relation linked list, and further acquiring the corresponding basic attribute class data.
Optionally, creating the metadata file includes:
acquiring file path information in the request data, and determining a parent directory of a metadata file to be created;
calculating the file name ID1 of the parent directory from the length and content of the parent directory's file name, using a preset file name ID algorithm;
hashing the parent directory file name ID1 to obtain the hash position POS1 of ID1 in the file name class data;
searching the mapping relation linked list for the position POS2 in the basic attribute class data that corresponds to POS1;
acquiring the basic attribute information of the parent directory, and the corresponding linked list serial number H1, through POS2;
creating the complete path string of the file in the file name class data, at position POS3 under linked list serial number H2;
creating the basic attribute class data of the file in the basic attribute class data, at position POS4 under linked list serial number H1;
calculating the file name ID2 of the created file, and hashing it to obtain the hash position POS5;
and storing <ID2, POS5> in the mapping relation linked list as the mapping of the file between the file name class data and the basic attribute class data.
The preset file name ID algorithm may be a cyclic redundancy check algorithm or an MD5 algorithm, etc.
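As a rough illustration of how a file name ID and its hash position might be derived, the following Python sketch uses CRC-32 and MD5 from the standard library. The function names, the 4-byte length prefix, the 64-bit truncation of the MD5 digest and the table size of 1024 are all assumptions made for the example; the source only states that the ID is computed from the file name length and content.

```python
import hashlib
import zlib

def file_name_id_crc32(path):
    """File name ID from the name's length and content, using CRC-32."""
    data = path.encode("utf-8")
    # Prefix the length so that both length and content feed the checksum.
    return zlib.crc32(len(data).to_bytes(4, "big") + data)

def file_name_id_md5(path):
    """Alternative: a 64-bit ID taken from the first 8 bytes of an MD5 digest."""
    data = path.encode("utf-8")
    digest = hashlib.md5(len(data).to_bytes(4, "big") + data).digest()
    return int.from_bytes(digest[:8], "big")

def hash_position(file_name_id, table_size=1024):
    """Hash position (such as POS1 or POS5) of an ID; the table size is assumed."""
    return file_name_id % table_size

id1 = file_name_id_crc32("/dir")
pos1 = hash_position(id1)
```

CRC-32 is cheap but yields only 32 bits, so collisions become more likely as the number of files grows; a longer digest such as MD5 trades computation time for a lower collision rate.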
Optionally, the metadata is divided into four types of data, namely basic attribute, file name, data block and storage node, each type of data is divided into a plurality of data blocks according to a fixed size, and the data blocks are mapped into the memory from the disk file by adopting a file mapping mode to form a storage space;
the basic attribute class data comprises file length, creation time, modification time, type, data block ID, file name length and file name storage position information;
the data block type data comprises data block ID, length, check code and storage position information;
the storage node class data includes a storage node name, an IP address, rack information, resource hardware configuration information, and resource utilization information. The resource hardware comprises a CPU, a memory, a disk and the like.
Optionally, the resource manager performs resource allocation and recovery in a linked list pool manner for each type of data. Namely, the resource manager manages each storage unit in a linked list pool mode, acquires one unit from a linked list head during application, and puts the storage unit into the linked list during recovery. Therefore, different linked lists can be established, and operations such as data insertion, deletion, query and the like can be performed in each linked list.
Optionally, creating the metadata file further includes:
in data block type data resource management, calculating data blocks IDn-IDm to be distributed according to file length and data block length;
and in the storage node type data resource manager, distributing storage nodes for each data block according to the copy number of the data block.
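The two allocation steps above can be sketched in Python as follows. The 64 MB block length, the starting block ID and the round-robin placement are assumptions chosen for illustration; the source does not specify a block length or a placement rule at this point.

```python
def plan_blocks(file_length, block_length, first_id):
    """Return the consecutive data block IDs (IDn..IDm) needed for a file."""
    count = -(-file_length // block_length)   # ceiling division
    return list(range(first_id, first_id + count))

def assign_replicas(block_ids, nodes, replicas):
    """Assign `replicas` distinct storage nodes to each block, round-robin."""
    if replicas > len(nodes):
        raise ValueError("not enough storage nodes for the requested copies")
    placement = {}
    for k, block_id in enumerate(block_ids):
        placement[block_id] = [nodes[(k + r) % len(nodes)] for r in range(replicas)]
    return placement

MB = 1024 * 1024
blocks = plan_blocks(100 * MB, 64 * MB, first_id=7)   # block length is assumed
placement = assign_replicas(blocks, ["node-a", "node-b", "node-c"], replicas=2)
```

With these assumed values a 100 MB file needs two blocks, and each block is placed on two different nodes.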
Optionally, in the basic attribute class data, for a single directory, the basic attribute information of all files and the basic attribute information of the subdirectories are uniformly managed by using a linked list, and the head information of the linked list is stored in the file length information in the basic attribute of the directory.
Optionally, for the basic attribute class data stored as N data blocks, all data block IDs of the basic attribute class data are linked by a linked list, and the head information of the linked list is stored in the (N-1)th data block ID information. This supports data block ID management for large files.
Optionally, in the file name class data, the complete path information of each file or directory is stored in one of at least two storage slices of different sizes, chosen according to the length of the name. File names vary in length, so storing every name in a 256-byte space wastes part of that space. For example, if storage slices are provided in two lengths, 128 and 256 bytes, file names shorter than 128 bytes are stored in one slice and file names longer than 128 bytes in the other, which saves storage space.
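A minimal sketch of slice selection by name length is given below. The two slice sizes come from the example in the text; the function name, the UTF-8 encoding, and the choice to place a name of exactly 128 bytes in the smaller slice are assumptions, since the source does not cover that boundary case.

```python
SLICE_SIZES = (128, 256)   # slice lengths in bytes, taken from the example above

def choose_slice(path):
    """Return the smallest slice size that can hold the encoded path."""
    length = len(path.encode("utf-8"))
    for size in SLICE_SIZES:
        if length <= size:
            return size
    raise ValueError(f"path of {length} bytes exceeds the largest slice")

short_slice = choose_slice("/dir/file.dat")
long_slice = choose_slice("/" + "a" * 199)
```

Under this scheme a short path such as `/dir/file.dat` occupies a 128-byte unit instead of a 256-byte one, halving the space used for that name.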
On the other hand, the invention also provides a metadata management device of the distributed file system in the power grid dispatching control system, in the device configuration, metadata is divided into a plurality of types of data at least comprising a file name class and a basic attribute class, each type of data is respectively stored in a plurality of data blocks, and file mapping is established in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
the metadata management apparatus includes:
the metadata file creation module is configured for responding to a received external metadata file creation request, acquiring file path information therein, searching a linked list serial number of a parent directory of a file in the basic attribute class data according to the acquired file path information, storing the basic attribute class data of a metadata file to be created in the basic attribute class data under the corresponding linked list serial number, creating complete path data of the metadata file in the file name class data, and mapping and storing a storage position of the metadata file in the file name class data and a storage position in the basic attribute class data to a mapping relation linked list;
the metadata file access module is configured for responding to the received external metadata file access request and acquiring a file name in the external metadata file access request; and determining corresponding basic attribute information according to the mapping relation between the file name class data corresponding to the file name in the mapping relation linked list and the basic attribute class data, and further acquiring the corresponding basic attribute class data.
Advantageous effects
The distributed file system metadata is divided into four types of data including basic attributes, file names, data blocks and storage nodes, all files and subdirectories under each directory are managed by adopting a linked list, the hierarchical relationship of the file system is kept, the mapping relationship between the file names and the basic attributes of the files is established, the basic attribute data of the files are quickly positioned through the complete path of the file names, the distributed file access performance in the power grid dispatching control system can be improved, the requirement on real-time file access of the power grid dispatching control system is met, and the distributed file system metadata can better adapt to the use environment with limited memory resources.
Drawings
FIG. 1 is a diagram illustrating a metadata management architecture of a conventional distributed file system.
Fig. 2 is a schematic diagram of a metadata management architecture of a distributed file system in the power grid scheduling control system according to the present invention.
Detailed Description
The following further description is made in conjunction with the accompanying drawings and the specific embodiments.
Example 1
The embodiment is a metadata management method for a distributed file system in a power grid scheduling control system, and the method comprises the following steps:
dividing metadata into a plurality of types of data at least comprising a file name type and a basic attribute type, respectively storing each type of data in a plurality of data blocks, and establishing file mapping in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
responding to a received external metadata file creating request, acquiring file path information in the request, searching a linked list sequence number of a parent directory of a file in basic attribute data according to the acquired file path information, storing the basic attribute data of a metadata file to be created in the basic attribute data under the corresponding linked list sequence number, creating complete path data of the metadata file in file name data, and mapping and storing a storage position of the metadata file in the file name data and a storage position in the basic attribute data to a mapping relation linked list;
responding to the received external metadata file access request, and acquiring a file name in the external metadata file access request; and determining corresponding basic attribute information according to the mapping relation between the file name class data and the basic attribute class data corresponding to the file name in the mapping relation linked list, and further acquiring the corresponding basic attribute class data.
In the embodiment, metadata is divided into four types of data including basic attributes, file names, data blocks and storage nodes, each type of data is divided into a plurality of data blocks according to a fixed size, and the data blocks are mapped into a memory from a disk file by adopting a file mapping mode to form a storage space;
the basic attribute class data comprises file length, creation time, modification time, type, data block ID, file name length and file name storage position information;
the data block type data comprises data block ID, length, check code and storage position information;
the storage node class data includes a storage node name, an IP address, rack information, resource hardware configuration information, and resource utilization information.
And the resource manager respectively allocates and recovers the resources in a linked list pool mode. Namely, the resource manager manages each storage unit in a linked list pool mode, acquires one unit from a linked list head during application, and puts the storage unit into the linked list during recovery. Therefore, different linked lists can be established, and operations such as data insertion, deletion, query and the like can be performed in each linked list.
In the basic attribute class data, for a single directory, the basic attribute information of all files and the basic attribute information of subdirectories are uniformly managed by adopting a linked list, and the head information of the linked list is stored in the file length information in the basic attribute of the directory.
For the basic attribute class data stored as N data blocks, all data block IDs of the basic attribute class data are linked through a linked list, and the head information of the linked list is stored in the ID information of the (N-1) th data block. Support for large file data block ID management can be achieved.
In the file name class data, the complete path information of each file or directory is stored in one of at least two storage slices of different sizes, chosen according to the length of the name. File names vary in length, so storing every name in a 256-byte space wastes part of that space. For example, if storage slices are provided in two lengths, 128 and 256 bytes, file names shorter than 128 bytes are stored in one slice and file names longer than 128 bytes in the other, which saves storage space.
Specifically, creating the metadata file includes:
acquiring file path information in the request data, and determining a parent directory of a metadata file to be created;
calculating the file name ID1 of the parent directory from the length and content of the parent directory's file name, using a preset file name ID algorithm;
hashing the parent directory file name ID1 to obtain the hash position POS1 of ID1 in the file name class data;
searching the mapping relation linked list for the position POS2 in the basic attribute class data that corresponds to POS1;
acquiring the basic attribute information of the parent directory, and the corresponding linked list serial number H1, through POS2;
creating the complete path string of the file in the file name class data, at position POS3 under linked list serial number H2;
creating the basic attribute class data of the file in the basic attribute class data, at position POS4 under linked list serial number H1;
calculating the file name ID2 of the created file, and hashing it to obtain the hash position POS5;
and storing <ID2, POS5> in the mapping relation linked list as the mapping of the file between the file name class data and the basic attribute class data.
Creating the metadata file further comprises:
in data block type data resource management, calculating data blocks IDn-IDm to be distributed according to file length and data block length;
and in the storage node type data resource manager, distributing storage nodes for each data block according to the copy number of the data block.
The preset file name ID algorithm may be a cyclic redundancy check algorithm or an MD5 algorithm, etc.
Example 2
This embodiment specifically illustrates a method for managing metadata of a distributed file system in a power grid scheduling control system according to the present invention, which includes the following steps:
1. dividing the file meta information into four types of data, namely basic attribute, file name, data block and storage node, as shown in fig. 2:
the basic attribute class data comprises information such as file length, creation time, modification time, type, data block ID, file name length, file name storage position and the like;
storing the complete path of each file and directory in the file name data;
the data block type data comprises information such as data block ID, length, check code, storage position and the like;
the storage node data includes storage node name, IP address, rack information, resource hardware configuration information such as CPU, memory, and disk, and resource utilization information.
Among the various types of data divided:
establishing a storage space for each type of data by adopting a block file mapping mode; dividing each type of data into different data blocks according to a fixed size, and mapping the data blocks from a disk file to an internal memory by adopting a file mapping mode to form a storage space;
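The block file mapping described above can be approximated with Python's standard `mmap` module. This is only a sketch: the backing file name, the record layout (`UNIT`) and the use of the platform allocation granularity as the fixed block size are assumptions, not details from the source.

```python
import mmap
import os
import struct
import tempfile

BLOCK_SIZE = mmap.ALLOCATIONGRANULARITY   # fixed block size; mmap offsets must align to it
UNIT = struct.Struct("<QQ")               # hypothetical unit: (file length, creation time)

path = os.path.join(tempfile.mkdtemp(), "basic_attr.dat")
with open(path, "wb") as f:
    f.truncate(4 * BLOCK_SIZE)            # backing disk file holding four blocks

with open(path, "r+b") as f:
    # Map only block 2 of the disk file into memory.
    block = mmap.mmap(f.fileno(), BLOCK_SIZE, offset=2 * BLOCK_SIZE)
    UNIT.pack_into(block, 0, 100 * 1024 * 1024, 1584662400)
    block.flush()                         # write dirty data back to the disk file
    block.close()

with open(path, "rb") as f:
    f.seek(2 * BLOCK_SIZE)
    stored = UNIT.unpack(f.read(UNIT.size))
```

Because only the mapped blocks occupy memory, the number of blocks mapped at a time can be bounded, which is one plausible way to limit the memory used by each class of data.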
each type of data is managed by adopting a uniform resource manager and is responsible for allocating and recovering resources; the resource manager manages each storage unit in a linked list pool mode, acquires one unit from a linked list head during application, and puts the storage unit into the linked list during recovery;
managing each type of data in a linked list pool mode; the resource management mode adopted by the resource manager is as follows: for each type of data, different linked lists can be established through a resource manager, and data can be inserted, deleted and inquired in each linked list;
each type of data supports direct positioning according to the sequence number; the sequence number is the location information of the data unit, which is determined when allocating resources, and the data unit can be directly located through the information;
when each type of data is distributed with data, the data which is recently recycled is preferentially distributed; data refers to four types of data that make up metadata;
each type of data can dynamically adjust occupied memory resources according to configuration parameters. The configuration parameters are the configuration parameters read by the starting program from the configuration file; these parameters configure the minimum and maximum memory space that each type of data can occupy.
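The resource manager behaviour listed above (take a unit from the list head, return units to the list, reuse recently recycled units first, and locate a unit directly by its serial number) can be sketched as a free list threaded through an array. The class and method names are hypothetical, and the sketch assumes a fixed capacity of at least one unit.

```python
class LinkedListPool:
    """Fixed-capacity pool whose free units are chained through a `next` array."""
    NIL = -1

    def __init__(self, capacity):
        self.units = [None] * capacity
        # Initially every unit is free: 0 -> 1 -> ... -> capacity-1 -> NIL.
        self.next = list(range(1, capacity)) + [self.NIL]
        self.free_head = 0

    def allocate(self, value):
        """Take one unit from the head of the free list; return its serial number."""
        if self.free_head == self.NIL:
            raise MemoryError("pool exhausted")
        seq = self.free_head
        self.free_head = self.next[seq]
        self.next[seq] = self.NIL
        self.units[seq] = value
        return seq

    def recycle(self, seq):
        """Push the unit back onto the head, so it is the next one handed out."""
        self.units[seq] = None
        self.next[seq] = self.free_head
        self.free_head = seq

    def get(self, seq):
        """Direct positioning: the serial number is the unit's location."""
        return self.units[seq]

pool = LinkedListPool(4)
a = pool.allocate("attr-a")   # serial number 0
b = pool.allocate("attr-b")   # serial number 1
pool.recycle(a)
c = pool.allocate("attr-c")   # reuses the most recently recycled unit
```

Allocation and recycling are both constant-time pointer updates, and handing out the most recently recycled unit first tends to keep the working set in memory that was touched recently.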
2. Adding file name length, file name storage position information and data block ID information to the basic attribute class data. Because the file name class data is stored and managed separately, the basic attribute does not directly contain the file name; the file name length and file name storage position are therefore added to the basic attribute, and the actual storage location of the file name is obtained from this length and position. Similarly, the data block ID is used to obtain information such as the length and storage location of the data block.
Basic attribute class data:
the basic attributes of all files and the basic attributes of subdirectories under each directory are uniformly managed by a linked list;
for each directory, the file size item in its basic attribute information stores the head of the linked list of the basic attributes of its files and subdirectories;
by default, the basic attribute reserves N data block IDs. The basic attribute is a fixed-size data structure, whereas the number of data blocks grows with the file size, so for a large file the fixed structure cannot hold all of the data block IDs. In that case a linked list is used to manage the excess data block IDs, and the (N-1)th data block ID stores the head of that linked list.
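The overflow scheme for data block IDs can be sketched as below. Several details are assumptions made to keep the example unambiguous: N is set to 4, the last slot (0-based index N-1) is always reserved for the list head, unused slots hold -1, and the overflow nodes live in two hypothetical parallel arrays.

```python
N = 4      # reserved data block ID slots per basic attribute (value assumed)
NIL = -1

overflow_ids = []    # hypothetical overflow pool: block ID held by each node
overflow_next = []   # index of the next node for each node, or NIL

def store_block_ids(block_ids):
    """Fill the fixed slots; chain any excess IDs and keep the head in the last slot."""
    slots = list(block_ids[:N - 1])
    slots += [NIL] * (N - 1 - len(slots))
    head = NIL
    # Build the chain back to front so that each node points at its successor.
    for block_id in reversed(block_ids[N - 1:]):
        overflow_ids.append(block_id)
        overflow_next.append(head)
        head = len(overflow_ids) - 1
    return slots + [head]

def load_block_ids(slots):
    """Read the direct slots, then follow the chain starting at the last slot."""
    ids = [b for b in slots[:N - 1] if b != NIL]
    node = slots[N - 1]
    while node != NIL:
        ids.append(overflow_ids[node])
        node = overflow_next[node]
    return ids

small = store_block_ids([10, 11])
large = store_block_ids([20, 21, 22, 23, 24, 25])
```

The basic attribute stays the same size for both files; only the large file pays for extra chained nodes.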
3. Storing the whole file name absolute path, and establishing a mapping relation between the file name absolute path and the file basic attribute, wherein the mapping relation comprises the following contents:
dividing the file name storage into several storage slices according to the absolute path length of the file name; for example, with storage slices of two lengths, 128 and 256 bytes, file names shorter than 128 bytes are stored in one slice and file names longer than 128 bytes in the other, which saves a considerable amount of memory;
calculating a file name ID from the content and length of the file name, using an algorithm such as a cyclic redundancy check or MD5;
hashing the file name ID to locate the file index number, and managing colliding IDs with a linked list;
when accessing file metadata, locating it directly through the mapping relation between the file name and the file index number.
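The mapping from file name ID to file index number, with colliding IDs kept on a chain, might look like the following sketch. The `NameIndex` class, the table size of 8 (kept small so that collisions occur) and the decision to store the path in each entry to tell apart identical IDs are assumptions; a Python list stands in for each linked list.

```python
import zlib

TABLE_SIZE = 8   # deliberately small so that collisions occur; value assumed

class NameIndex:
    """Maps file name IDs to file index numbers; colliding IDs share a chain."""
    def __init__(self):
        self.buckets = [[] for _ in range(TABLE_SIZE)]

    @staticmethod
    def name_id(path):
        return zlib.crc32(path.encode("utf-8"))

    def insert(self, path, index_number):
        file_id = self.name_id(path)
        self.buckets[file_id % TABLE_SIZE].append((file_id, path, index_number))

    def lookup(self, path):
        file_id = self.name_id(path)
        # Only the chain at this hash position is scanned.
        for stored_id, stored_path, index_number in self.buckets[file_id % TABLE_SIZE]:
            if stored_id == file_id and stored_path == path:
                return index_number
        return None

index = NameIndex()
paths = [f"/dir/file{i}.dat" for i in range(20)]
for number, p in enumerate(paths):
    index.insert(p, number)
```

Lookup cost depends on the chain length at one hash position rather than on directory depth, so a table sized well above the number of files keeps chains short.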
4. The data block manager is responsible for allocating and recovering data block resources. The resource manager maintains a free resource linked list that holds all free resources: when a resource is allocated, one resource unit is taken from the free list, and when a resource is recovered, the freed unit is hung back on the list.
The data block information includes data version information; data block allocation and recovery are idempotent; and the data block manager can allocate a data block with a specified block ID.
5. The storage node manager supports multiple selection strategies when allocating storage nodes. When a data block is allocated, a corresponding storage node, that is, a physical node where the data block is stored, needs to be allocated. Each file corresponds to one or more data blocks, and each data block corresponds to a plurality of storage nodes.
The invention can support the comprehensive consideration of the utilization conditions of the resources such as the free disk space of the storage node, the CPU load, the memory resource, the network load and the like, and select the proper storage node.
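One way to combine these factors is a weighted score per storage node, sketched below. The source names the factors but not how they are combined, so the `StorageNode` fields, the weights and the scoring formula are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StorageNode:
    name: str
    free_disk: float   # fraction of disk space free, 0..1
    cpu_load: float    # fraction of CPU busy, 0..1
    mem_used: float    # fraction of memory in use, 0..1
    net_load: float    # fraction of network capacity in use, 0..1

# Hypothetical weights; they are not taken from the source.
WEIGHTS = {"disk": 0.4, "cpu": 0.2, "mem": 0.2, "net": 0.2}

def score(node):
    """Higher is better: more free disk, less CPU, memory and network load."""
    return (WEIGHTS["disk"] * node.free_disk
            + WEIGHTS["cpu"] * (1 - node.cpu_load)
            + WEIGHTS["mem"] * (1 - node.mem_used)
            + WEIGHTS["net"] * (1 - node.net_load))

def select_nodes(nodes, copies):
    """Pick the `copies` best-scoring nodes for one data block."""
    ranked = sorted(nodes, key=score, reverse=True)
    return [n.name for n in ranked[:copies]]

nodes = [
    StorageNode("node-a", 0.9, 0.2, 0.3, 0.1),
    StorageNode("node-b", 0.1, 0.9, 0.8, 0.7),
    StorageNode("node-c", 0.5, 0.5, 0.5, 0.5),
]
chosen = select_nodes(nodes, copies=2)
```

Other selection strategies, for example rack-aware placement using the rack information stored for each node, could replace `score` without changing the caller.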
Because each type of data is loaded into the memory in a file mapping mode, and all data units are managed through the linked list, data distribution and recovery can be completed quickly, direct positioning can be performed according to the sequence number, dirty data write-back can also be completed quickly, and quick persistence of metadata is realized. When the memory resources of the operation nodes are limited, the size of the metadata mapped to the memory can be dynamically adjusted according to the configuration, so that the method is suitable for more complex application scenes.
When a file is created, the file index number of its parent directory is looked up first to obtain the serial number of the parent directory's management linked list, and the new file index number is then created under that linked list, which maintains the hierarchical relationship between files. A file name storage slice is selected according to the file name length and, after the file name is stored, the file name storage index number is written into the file's basic attributes. Data blocks are allocated according to the file length, and data storage nodes are selected according to their running condition and the storage node selection strategy. This completes the file metadata creation process.
For example, to create a file /dir/file.dat with a size of 100 MB, the creation process is:
1) Looking up the management linked list serial number of the parent directory /dir;
a) Calculating the file directory ID1 of the parent directory /dir, and hashing ID1 through a hash function to obtain the hash position POS1 of ID1;
b) Finding the basic attribute position POS2 corresponding to ID1 in the mapping relation management linked list located at POS1;
c) In the basic attribute resource manager, obtaining the basic attributes of /dir through POS2, and thereby the management linked list serial number H1 of the parent directory;
2) Creating the metadata of the file /dir/file.dat;
a) Allocating from the file name manager a file name storage unit with linked list head H2 and position POS3, and storing the string '/dir/file.dat';
b) In the data block resource manager, calculating the data blocks IDn to IDm to be allocated according to the file length and the data block length;
c) In the storage node manager, allocating storage nodes for each data block according to the number of copies of the data block;
d) In the basic attribute manager, allocating a basic attribute storage unit at position POS4 in linked list H1, and storing the basic attributes of /dir/file.dat, including the file size, creation time and the like, the file name storage linked list head H2 and position POS3, the number of data blocks N, and data block IDs IDn to IDm.
3) Create a mapping relationship between file/dir/file
a) Calculate ID2 of the file name /dir/file.dat, and hash ID2 through a hash function to obtain the hash position POS5 of ID2;
b) Find the mapping relation management linked list located at POS5, and insert the data <ID2, POS5>.
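The three steps above can be sketched with a toy in-memory model. Everything named here is an assumption made for illustration: `MetaStore`, the 64 MB `BLOCK_SIZE`, the `BUCKETS` count, and the use of CRC32 as the file name ID algorithm. The sketch stores the basic attribute position alongside each ID in the mapping list, mirroring how step 1b reads a basic attribute position back out of that list, and it omits storage node assignment and size-based file name slices.

```python
import zlib

BUCKETS = 1024                 # assumed number of hash positions
BLOCK_SIZE = 64 * 1024 * 1024  # assumed data block length (64 MB)


def name_id(path):
    """File name ID computed from the name length and content (CRC32 assumed)."""
    data = path.encode("utf-8")
    return zlib.crc32(len(data).to_bytes(4, "little") + data)


class MetaStore:
    def __init__(self):
        self.mapping = [[] for _ in range(BUCKETS)]  # hash position -> [(ID, attr_pos)]
        self.attrs = []      # basic attribute units, indexed by position
        self.names = []      # file name units, indexed by position
        self.next_block = 0  # next unused data block ID
        self._add("/", None, True, 0)

    def _find(self, path):
        """Step 1: name ID -> hash position -> mapping list -> attribute position."""
        fid = name_id(path)
        for entry_id, attr_pos in self.mapping[fid % BUCKETS]:
            stored = self.names[self.attrs[attr_pos]["name_pos"]]
            if entry_id == fid and stored == path:
                return attr_pos
        return None

    def _add(self, path, parent_pos, is_dir, size):
        self.names.append(path)                 # step 2a: store the full path
        name_pos = len(self.names) - 1
        nblocks = 0 if is_dir else -(-size // BLOCK_SIZE)  # step 2b: ceiling division
        blocks = list(range(self.next_block, self.next_block + nblocks))
        self.next_block += nblocks
        self.attrs.append({"name_pos": name_pos, "size": size, "is_dir": is_dir,
                           "blocks": blocks, "children": [] if is_dir else None})
        attr_pos = len(self.attrs) - 1          # step 2d: basic attribute unit
        if parent_pos is not None:
            self.attrs[parent_pos]["children"].append(attr_pos)  # parent's list
        fid = name_id(path)                     # step 3: record the mapping
        self.mapping[fid % BUCKETS].append((fid, attr_pos))
        return attr_pos

    def create(self, path, size=0, is_dir=False):
        parent = path.rsplit("/", 1)[0] or "/"
        parent_pos = self._find(parent)
        if parent_pos is None or not self.attrs[parent_pos]["is_dir"]:
            raise FileNotFoundError(parent)
        if self._find(path) is not None:
            raise FileExistsError(path)
        return self._add(path, parent_pos, is_dir, size)
```

With the assumed 64 MB block length, creating `/dir/file.dat` at 100 MB allocates two data blocks, since 100 MB does not fit in one block.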
When a file is accessed, a file name ID is generated from the file name length and the complete file name content, and the mapping relation is then used to find the file index number, from which all attribute information of the file can be obtained. The lookup completes in O(1) complexity, which improves file metadata access performance. In particular, in application scenarios where file names follow a uniform naming convention, a suitable algorithm keeps the hash collision rate of file name IDs extremely low, giving very good query performance.
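The access path can be sketched as follows, with the ID algorithm pluggable between a cyclic redundancy check and MD5. The names `name_id_crc32`, `name_id_md5`, and `NameIndex` are hypothetical, as is the choice to prefix the name with its 4-byte length before hashing. For simplicity each entry keeps the full path so that colliding IDs can be told apart, whereas the described design keeps paths in the separate file name data.

```python
import hashlib
import zlib


def name_id_crc32(path):
    """32-bit ID from the name length and content, using CRC32."""
    data = path.encode("utf-8")
    return zlib.crc32(len(data).to_bytes(4, "little") + data)


def name_id_md5(path):
    """64-bit ID taken from the first 8 bytes of an MD5 digest."""
    data = path.encode("utf-8")
    digest = hashlib.md5(len(data).to_bytes(4, "little") + data).digest()
    return int.from_bytes(digest[:8], "little")


class NameIndex:
    """Hash positions, each holding a short chain of (ID, path, attr_pos) entries."""

    def __init__(self, buckets=4096, id_func=name_id_crc32):
        self.buckets = [[] for _ in range(buckets)]
        self.id_func = id_func

    def put(self, path, attr_pos):
        fid = self.id_func(path)
        self.buckets[fid % len(self.buckets)].append((fid, path, attr_pos))

    def get(self, path):
        """Average O(1): hash to one position, then scan its (usually tiny) chain."""
        fid = self.id_func(path)
        for entry_id, entry_path, attr_pos in self.buckets[fid % len(self.buckets)]:
            if entry_id == fid and entry_path == path:
                return attr_pos
        return None
```

The chain scan only matters when several names land on the same hash position, so a low collision rate keeps lookups close to a single probe.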
Embodiment 3
Based on the same inventive concept as Embodiments 1 and 2, this embodiment introduces a metadata management apparatus for a distributed file system in a power grid dispatching control system.
In the configuration of the device, metadata is divided into a plurality of types of data at least comprising a file name type and a basic attribute type, each type of data is respectively stored in a plurality of data blocks, and file mapping is established in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
the metadata management apparatus includes:
the metadata file creating module is configured to respond to a received external metadata file creating request, acquire file path information therein, search a linked list sequence number of a parent directory of a file in the basic attribute class data according to the acquired file path information, store the basic attribute class data of a metadata file to be created in the basic attribute class data under the corresponding linked list sequence number, create complete path data of the metadata file in the file name class data, and map and store a storage position of the metadata file in the file name class data and a storage position in the basic attribute class data to a mapping relation linked list;
the metadata file access module is configured for responding to the received external metadata file access request and acquiring a file name in the external metadata file access request; and determining corresponding basic attribute information according to the mapping relation between the file name class data and the basic attribute class data corresponding to the file name in the mapping relation linked list, and further acquiring the corresponding basic attribute class data.
For the functional implementation of the metadata file creation module and the metadata file access module described above, refer to the specific implementation of the corresponding functions in Embodiment 2.
In this embodiment, the metadata of the distributed file system is divided into four types of data: basic attributes, file names, data blocks, and storage nodes. A linked list manages all files and subdirectories under each directory, preserving the hierarchical relationship of the file system, and a mapping relationship is established between file names and the basic attributes of files, so that the basic attribute data of a file can be located quickly from its complete path. This improves the access performance of distributed files in the power grid dispatching control system, meets the system's requirement for real-time file access, and adapts better to environments with limited memory resources.
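The per-directory linked list that preserves the hierarchy can be pictured with the sketch below. It is an illustrative model only: `AttrTable`, the `head` and `next` fields, and the push-to-front insertion order are assumptions, not details taken from the patent.

```python
NIL = -1  # marks the end of a linked list


class AttrTable:
    """Basic attribute units in an array; each directory chains its children."""

    def __init__(self):
        self.units = []

    def add(self, name, is_dir, parent=None):
        pos = len(self.units)
        unit = {"name": name, "is_dir": is_dir, "head": NIL, "next": NIL}
        self.units.append(unit)
        if parent is not None:
            p = self.units[parent]
            if not p["is_dir"]:
                raise NotADirectoryError(p["name"])
            unit["next"] = p["head"]  # push onto the front of the parent's list
            p["head"] = pos
        return pos

    def children(self, dir_pos):
        """Walk the directory's linked list and yield each child's name."""
        pos = self.units[dir_pos]["head"]
        while pos != NIL:
            yield self.units[pos]["name"]
            pos = self.units[pos]["next"]
```

Listing a directory then touches only that directory's own chain, independent of how many files exist elsewhere in the file system.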
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A metadata management method for a distributed file system in a power grid dispatching control system is characterized by comprising the following steps:
dividing metadata into a plurality of types of data at least comprising a file name type and a basic attribute type, respectively storing each type of data in a plurality of data blocks, and establishing file mapping in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
responding to a received external metadata file creation request, acquiring file path information therein, searching a linked list serial number of a parent directory of a file in basic attribute class data according to the acquired file path information, storing the basic attribute class data of a metadata file to be created in the basic attribute class data under the corresponding linked list serial number, creating complete path data of the metadata file in file name class data, and mapping and storing a storage position of the metadata file in the file name class data and a storage position in the basic attribute class data to a mapping relation linked list;
in response to receiving an external metadata file access request, acquiring a file name in the external metadata file access request; determining corresponding basic attribute information according to the mapping relation between the file name class data corresponding to the file name in the mapping relation linked list and the basic attribute class data, and further acquiring corresponding basic attribute class data;
creating the metadata file includes:
acquiring file path information in the request data, and determining a parent directory of a metadata file to be created;
calculating the file name ID1 of the parent directory from the file name length and content of the parent directory using a preset file name ID algorithm;
hashing the parent directory file name ID1 to obtain a hash position POS1 of the file name ID1 in the file name data;
searching a hash position POS2 in the basic attribute class data corresponding to the POS1 from the mapping relation linked list;
acquiring the basic attribute information of the parent directory and the corresponding linked list serial number H1 through POS2;
creating, in the file name class data, a complete file path character string at position POS3 under linked list serial number H2;
creating, in the basic attribute class data, file basic attribute class data at position POS4 under linked list serial number H1;
calculating a file name ID2 corresponding to the created file, and hashing to obtain a hash position POS5;
and storing < ID2, POS5> as the mapping relation of the file in the file name class data and the basic attribute class data into a mapping relation linked list.
2. The method of claim 1, wherein the predetermined file name ID algorithm is a cyclic redundancy check algorithm or an MD5 algorithm.
3. The method as claimed in claim 1, wherein the metadata is divided into four types of data including basic attribute, file name, data block and storage node, each type of data is divided into a plurality of data blocks according to fixed size, and the data blocks are mapped into the memory from the disk file by adopting a file mapping mode to form a storage space;
the basic attribute class data comprises file length, creation time, modification time, type, data block ID, file name length and file name storage position information;
the data block type data comprises data block ID, length, check code and storage position information;
the storage node class data includes a storage node name, an IP address, rack information, resource hardware configuration information, and resource utilization information.
4. The method as claimed in claim 3, wherein the resource manager allocates and recycles the resources in a linked list pool manner for each type of data.
5. The method of claim 4, wherein creating a metadata file further comprises:
in data block type data resource management, calculating data blocks IDn-IDm to be distributed according to file length and data block length;
and in the storage node type data resource manager, distributing storage nodes for each data block according to the copy number of the data block.
6. The method according to claim 1 or 3, wherein in the basic attribute class data, for a single directory, the basic attribute information of all files and the basic attribute information of subdirectories are uniformly managed by a linked list, and the head information of the linked list is stored in the file length information in the basic attribute of the directory.
7. A method as claimed in claim 1 or 3, wherein for the basic attribute class data stored as N data blocks, all data block IDs of the basic attribute class data are linked by a linked list, and the linked list header information is stored in the N-1 th data block ID information.
8. A method according to claim 1 or 3, wherein complete path information for each file or directory in the file name class data is stored in at least two different size storage slices, according to name size.
9. A metadata management device of a distributed file system in a power grid dispatching control system is characterized in that:
the metadata is divided into a plurality of types of data at least comprising a file name class and a basic attribute class, each type of data is respectively stored in a plurality of data blocks, and file mapping is established in a memory; the file name class data comprises complete path information of each file or directory, and the basic attribute class data comprises basic attribute information of each file or directory;
for various types of data, managing the data blocks by adopting a linked list pool mode respectively; the file name class data information and the basic attribute class data information of a single file or a single directory are mutually corresponding in a mapping relation linked list;
the metadata management apparatus includes:
the metadata file creating module is configured to respond to a received external metadata file creating request, acquire file path information therein, search a linked list sequence number of a parent directory of a file in the basic attribute class data according to the acquired file path information, store the basic attribute class data of a metadata file to be created in the basic attribute class data under the corresponding linked list sequence number, create complete path data of the metadata file in the file name class data, and map and store a storage position of the metadata file in the file name class data and a storage position in the basic attribute class data to a mapping relation linked list;
the metadata file access module is configured for responding to the received external metadata file access request and acquiring the file name of the external metadata file access request; determining corresponding basic attribute information according to the mapping relation between the file name class data and the basic attribute class data corresponding to the file name in the mapping relation linked list, and further acquiring corresponding basic attribute class data;
creating the metadata file includes:
acquiring file path information in the request data, and determining a parent directory of a metadata file to be created;
calculating a file name ID1 of the parent directory according to the file name length and the content of the parent directory and a preset file name ID algorithm;
hashing the parent directory file name ID1 to obtain a hash position POS1 of the file name ID1 in the file name data;
searching a hash position POS2 in the basic attribute class data corresponding to the POS1 from the mapping relation linked list;
acquiring the basic attribute information of the parent directory and the corresponding linked list serial number H1 through POS2;
creating, in the file name class data, a complete file path character string at position POS3 under linked list serial number H2;
creating, in the basic attribute class data, file basic attribute class data at position POS4 under linked list serial number H1;
calculating a file name ID2 corresponding to the created file, and hashing to obtain a hash position POS5;
and storing < ID2, POS5> as the mapping relation of the file in the file name class data and the basic attribute class data into a mapping relation linked list.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010196756.3A CN111427862B (en) 2020-03-19 2020-03-19 Metadata management method for distributed file system in power grid dispatching control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010196756.3A CN111427862B (en) 2020-03-19 2020-03-19 Metadata management method for distributed file system in power grid dispatching control system

Publications (2)

Publication Number Publication Date
CN111427862A CN111427862A (en) 2020-07-17
CN111427862B true CN111427862B (en) 2022-11-04

Family

ID=71553569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010196756.3A Active CN111427862B (en) 2020-03-19 2020-03-19 Metadata management method for distributed file system in power grid dispatching control system

Country Status (1)

Country Link
CN (1) CN111427862B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103282899A (en) * 2011-12-23 2013-09-04 华为技术有限公司 File system data storage method and access method and device therefor
CN104123359A (en) * 2014-07-17 2014-10-29 江苏省邮电规划设计院有限责任公司 Resource management method of distributed object storage system
CN106960011A (en) * 2017-02-28 2017-07-18 无锡紫光存储系统有限公司 Metadata of distributed type file system management system and method

Also Published As

Publication number Publication date
CN111427862A (en) 2020-07-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant