WO2017206754A1 - Storage method and storage device for distributed file system - Google Patents

Storage method and storage device for distributed file system

Info

Publication number
WO2017206754A1
Authority
WO
WIPO (PCT)
Prior art keywords
distributed file
file system
data
distributed
file data
Application number
PCT/CN2017/085338
Other languages
French (fr)
Chinese (zh)
Inventor
郑跃杰
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2017206754A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/184 Distributed file systems implemented as replicated file system
    • G06F16/1844 Management specifically adapted to replicated file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database

Definitions

  • the present disclosure relates to the field of communication technologies, for example, to a storage method and storage device of a distributed file system.
  • Network storage devices are used in products such as CRBT and Wireless Application Protocol (WAP) gateways.
  • WAP Wireless Application Protocol
  • the price of network storage devices has risen exponentially, and these devices often account for more than 50% of the cost of the entire system.
  • the Google File System (GFS) is a file storage technology developed by Google itself, and GFS has substantially reduced system cost.
  • a distributed file system is a general-purpose storage software platform that runs on general-purpose hardware and provides storage platform support for products that need storage services. It provides storage, query, retrieval, and management services for the massive data generated by those products (such as multimedia content storage and service data storage). Building a distributed file system on an inexpensive general-purpose hardware platform has become an inevitable trend in the development of various storage-type services.
  • the distributed file system in the distributed file system platform adopts a distributed asymmetric software architecture.
  • the distributed file system separates functions from features and handles data storage and retrieval, access, and management.
  • the distributed file system platform has clear modules and clear interfaces, making it easy to develop or use other products and projects based on this platform.
  • IOPS Input/Output Operation Per Second
  • a storage method and storage device for a distributed file system can uniformly manage a distributed file system, fully utilize the storage performance of the distributed file system, improve storage utilization, and improve user experience.
  • a storage method for a distributed file system, applied to a server, comprising:
  • aggregating configuration information of all distributed file systems, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server;
  • the aggregating configuration information of all distributed file systems includes:
  • when a distributed file system is detected, configuring the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file systems include existing distributed file systems and newly added distributed file systems; and
  • after the configuration of the distributed file systems is completed, aggregating the configuration information of all distributed file systems.
  • the method further includes:
  • when a read request for file data is received, the interface of the distributed file system corresponding to the file data is invoked according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, and the file data in the corresponding distributed file system is read through the interface.
  • the method further includes: when receiving the write request of the file data, encrypting the file data that needs to be written;
  • the method also includes decrypting the encrypted file data that needs to be read when a read request for the file data is received.
  • the method further includes:
  • the interface of the other distributed file system is invoked to remotely back up file data written to the corresponding distributed file system to the other distributed file system.
  • the method further includes:
  • when a delete request for file data is received, the interface of the distributed file system corresponding to the file data is invoked according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, and the file data is deleted from the corresponding distributed file system;
  • if the file data has a remote backup in another distributed file system, the file data is also deleted from the distributed file system holding the remote backup.
  • a storage device for a distributed file system, deployed on a server, comprising:
  • a system extensibility module configured to aggregate configuration information of all distributed file systems, where the configuration information includes a policy attribute for storing file data, a mount point path of the distributed file system, and an interface for the distributed file system to communicate with the server;
  • a policy selector configured to: when receiving a write request for file data, find a policy attribute that matches the file data according to a preset storage policy;
  • a data distribution module configured to invoke the interface of the distributed file system corresponding to the policy attribute matched by the file data, write the file data into the corresponding distributed file system through the interface, and record the mapping relationship between the path of the file data and the mount point path of the distributed file system.
  • the system extensibility module is configured to:
  • when a distributed file system is detected, configure the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file systems include existing distributed file systems and newly added distributed file systems; and
  • after the configuration of the distributed file systems is completed, aggregate the configuration information of all distributed file systems.
  • the data distribution module is further configured to:
  • the device further includes:
  • a data encryption and decryption module configured to encrypt the file data to be written when a write request for file data is received, and to decrypt the encrypted file data to be read when a read request for file data is received.
  • the data distribution module is further configured to:
  • the interface of the other distributed file system is invoked to remotely back up file data written to the corresponding distributed file system to the other distributed file system.
  • the data distribution module is further configured to:
  • when a delete request for file data is received, the interface of the distributed file system corresponding to the file data is invoked according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, and the file data is deleted from the corresponding distributed file system;
  • if the file data has a remote backup in another distributed file system, the file data is also deleted from the distributed file system holding the remote backup.
  • a computer readable storage medium storing computer executable instructions arranged to perform the above method.
  • a server that includes:
  • at least one processor
  • the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to cause the at least one processor to perform the method described above.
  • the distributed file system storage method and storage device aggregate the configuration information of the distributed file systems, so that different distributed file systems are operated in a unified way for writing, reading, and deleting file data;
  • pre-configuring a storage policy and writing file data according to that policy improves the storage utilization of the server; file data can be encrypted, so that the file data stored in the distributed file system is invisible to the underlying system, ensuring the security of the file data; file data can be backed up remotely for disaster recovery, ensuring the reliability of important file data. The distributed file system storage method and storage device therefore make full use of the read and write performance of the distributed file systems and improve the user experience.
  • FIG. 1 is a schematic structural diagram of a storage device of a distributed file system in an embodiment
  • FIG. 2 is a schematic flowchart of a method for storing a distributed file system in an embodiment
  • FIG. 3 is a schematic diagram showing the steps of a distributed file system storage method in an embodiment
  • FIG. 4 is a schematic diagram showing the hardware structure of a server in an embodiment.
  • FIG. 1 is a schematic structural diagram of a storage device of a distributed file system in the embodiment.
  • the device can be deployed on a server.
  • the file storage is implemented by uniformly scheduling at least two distributed file systems, at least two storage domains, and at least two namespace file systems, so as to achieve high scalability and high availability between at least two distributed file systems.
  • the storage device of the distributed file system of this embodiment may include: a system extensibility module, a policy selector, and a data distribution module.
  • the system extensibility module is configured to aggregate configuration information of all distributed file systems, the configuration information including policy attributes for storing file data, a mount point path of the distributed file system, and an interface for the distributed file system to communicate with the server.
  • the policy attributes for storing file data may include storage strategies such as the distribution of large files and small files, the distribution of cold data and hot data, and the number of copies.
  • the large file may be a file with a file capacity greater than 1 MB, and the small file may be a file with a file capacity less than or equal to 1 MB; the hot data may be data frequently used by the user, and the cold data may be data with a low frequency of use by the user.
  • the system extensibility module is configured to, when a distributed file system is detected, configure the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file systems include existing distributed file systems and newly added distributed file systems; and, after the configuration is completed, aggregate the configuration information of all distributed file systems and notify the data distribution module, so that the distributed file systems are uniformly scheduled by the data distribution module.
  • the policy selector is configured to, when receiving a write request of the file data, find a policy attribute that matches the file data according to a preset storage policy.
  • the preset storage policy may be selecting a distributed file system according to the size of the file data or the type of the file data.
  • small file data is stored in a distributed file system with high IOPS
  • large file data is stored in a distributed file system with high bandwidth, thereby fully utilizing the storage performance of the storage system.
  • temporary file data can be stored in a single-copy distributed file system, reducing the impact of temporary file data on file system storage performance; important file data can be stored in a multi-copy distributed file system, or backed up remotely for disaster recovery through the data distribution module, ensuring the reliability of important file data.
  • the data distribution module is configured to invoke the interface of the distributed file system corresponding to the policy attribute matched by the file data, write the file data into the corresponding distributed file system through the interface, and record the mapping relationship between the path information of the file data and the mount point path information of the distributed file system, thereby facilitating subsequent file data distribution.
  • the interface that communicates with the data distribution module follows a unified Portable Operating System Interface (POSIX), which defines an interface standard provided by the operating system for the application.
  • POSIX Portable Operating System Interface
  • the data distribution module can call the interfaces of at least two distributed file systems to read and write file data, and record the mapping between the path information of the file data and the mount point path information of the distributed file system, to facilitate subsequent file data distribution.
  • the data distribution module can also back up important file data to other distributed file systems synchronously or asynchronously, thereby realizing remote backup of important file data.
  • the data distribution module may be further configured to: when a delete request for file data is received, delete the file data from the corresponding distributed file system according to the mapping relationship between the path of the file data and the mount point path of the distributed file system; and, if the file data has a remote backup in another distributed file system, delete the file data from the distributed file system holding the remote backup.
  • the storage device of the distributed file system includes a data encryption and decryption module, configured to encrypt the file data to be written when a write request for file data is received, and to decrypt the encrypted file data to be read when a read request for file data is received.
  • the data encryption and decryption module provides an encryption and decryption algorithm. It encrypts the file data to be written when a write request is received and decrypts the encrypted file data to be read when a read request is received, so that the file data stored in the distributed file system is invisible to the underlying system, ensuring the security of the file data.
  • the basic configuration of the hardware may be as follows: a central processing unit (CPU) and a solid state drive (SSD).
  • the CPU is an Intel E5-2620 with 6 cores and 12 threads, clocked at 2500 MHz or higher.
  • the SSD records the mapping relationship between the path information of the file data and the mount point path information of the distributed file system; the memory of the SSD is greater than 96 gigabytes (GB).
  • this embodiment provides a storage method of a distributed file system.
  • step 210 configuration information of all distributed file systems is aggregated, the configuration information including policy attributes for storing file data, a mount point path of the distributed file system, and an interface for the distributed file system to communicate with the server.
  • step 220 when a write request for file data is received, a policy attribute matching the file data is searched according to a preset storage policy.
  • step 230 the interface of the distributed file system corresponding to the policy attribute matched with the received file data is invoked, the received file data is written into the corresponding distributed file system through the interface, and the mapping relationship between the path of the file data and the mount point path of the distributed file system is recorded.
  • this embodiment provides a distributed file system storage method.
  • the system extensibility module aggregates configuration information of all distributed file systems, the configuration information including policy attributes for storing file data, a mount point path of the distributed file system, and an interface for the distributed file system to communicate with the server.
  • the system extensibility module can configure each distributed file system, including configuring the mount point path, policy attributes, and interfaces for each distributed file system.
  • the system extensibility module aggregates configuration information of all distributed file systems. When a new distributed file system is found, its configuration information can be aggregated as well. The distributed file systems can be configured with multiple replica redundancy policies, and the aggregated distributed file systems can satisfy the POSIX semantic specification.
  • the data distribution module is notified after the configuration is completed, so that it is uniformly scheduled by the data distribution module.
  • step 320 when a write request for file data is received, the file data to be written is encrypted.
  • step 320 when the user process receives the write request for the file data, the data encryption and decryption module determines whether the file data needs to be encrypted; for example, if the file data is private data, the data encryption and decryption module can encrypt it.
  • step 330 a policy attribute matching the file data is searched according to a preset storage policy.
  • the pre-set storage policy may be to select a distributed file system according to file data size or file data type or the like.
  • the policy selector can select a corresponding distributed file system for the file data according to a preset storage policy.
  • step 340 the interface of the distributed file system corresponding to the file data is called, and the file data is read and written through the interface.
  • the data distribution module may call the interface of the distributed file system corresponding to the policy attribute matched by the file data, and write the file data into the corresponding distributed file system, for example, write the file data to distributed file system 1.
  • step 350 the file data is remotely backed up to other distributed file systems.
  • as needed, for example when the file data is an important file, the data distribution module can asynchronously distribute the file data to other distributed file systems, such as remotely backing it up to distributed file system 2, to ensure the reliability of important file data.
  • File data is backed up remotely to other distributed file systems.
  • step 360 a mapping relationship between the path of the file data and the mount point path of the distributed file system is recorded.
  • the data distribution module may also record the mapping relationship between the path of the file data and the mount point path of the distributed file system to facilitate subsequent file data distribution.
  • step 370 when receiving the read request of the file data, according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data is invoked, and File data is read from the corresponding distributed file system through the interface.
  • step 370 when the user process receives the read request of the file data, the data distribution module searches for the mapping relationship between the path of the previously recorded file data and the mount point path of the distributed file system, and determines the corresponding distributed file system. And reading file data from the corresponding distributed file system.
  • step 380 the file data that needs to be read is decrypted.
  • step 380 if the file data to be read is encrypted data, the data encryption and decryption module decrypts the file data.
  • step 390 the file data is read.
  • step 390 the file data that needs to be read is returned to the user process.
  • the distributed file system storage method may further include:
  • the data distribution module searches the previously recorded mapping relationship between the path of the file data and the mount point path of the distributed file system, determines the corresponding distributed file system, calls the interface of the distributed file system corresponding to the file data, and deletes the file data from the corresponding distributed file system;
  • the data distribution module determines, according to the aggregated configuration information of all distributed file systems, the distributed file systems other than the one the file data was written to, calls the interfaces of those other distributed file systems, and deletes the remotely backed-up file data from them.
  • the storage method and the storage device of the distributed file system of the embodiment aggregate the configuration information of the distributed file systems, so that at least two distributed file systems are operated in a unified way for writing, reading, and deleting file data;
  • pre-configuring the storage policy and writing the file data according to the storage policy improves the storage utilization of the server;
  • the file data can be encrypted, so that the file data stored in the distributed file system is invisible to the underlying system, ensuring the security of the file data; the file data can be backed up remotely for disaster recovery, ensuring the reliability of important file data.
  • the storage method and the storage device of the distributed file system of the embodiment fully utilize the read and write performance of the distributed file system, thereby improving the user experience.
  • the present embodiment provides a computer readable storage medium storing computer executable instructions arranged to perform the method of any of the above embodiments.
  • the server includes:
  • at least one processor 40 (one processor 40 is shown as an example in FIG. 4) and a memory 41; the server may further include a communication interface 42 and a bus 43.
  • the processor 40, the memory 41, and the communication interface 42 can complete communication with each other through the bus 43.
  • Communication interface 42 can transmit information.
  • Processor 40 may invoke logic instructions in memory 41 to perform the methods of the above-described embodiments.
  • logic instructions in the memory 41 described above may be implemented in the form of a software functional unit and sold or used as a stand-alone product, and may be stored in a computer readable storage medium.
  • the memory 41 is a computer readable storage medium and can be used to store a software program, a computer executable program, such as a program instruction or a module corresponding to the method in the above embodiment.
  • the processor 40 executes the functional application and the data processing by executing a software program, an instruction or a module stored in the memory 41, that is, implementing the method in the above method embodiment.
  • the memory 41 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function; the storage data area may store data created according to use of the terminal device, and the like.
  • the memory 41 may include a high speed random access memory, and may also include non-volatile memory.
  • the above embodiments can be implemented by means of software plus a general hardware platform.
  • the above technical solution can be embodied in the form of a software product, which can be stored in a storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, and which comprises one or more instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the preceding embodiments.
  • ROM read-only memory
  • RAM random access memory
  • the distributed file system storage method and storage device can uniformly manage the distributed file system, improve storage utilization, and improve user experience.
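
The policy selector described in the list above (small files to a high-IOPS file system, large files to a high-bandwidth one, temporary data to a single-copy system, important data to a multi-copy system) could be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the names `PolicyAttribute` and `select_policy` are hypothetical, and the replica counts other than the single copy for temporary data are assumptions. Only the 1 MB threshold comes from the text.

```python
from dataclasses import dataclass

ONE_MB = 1024 * 1024  # the text treats files larger than 1 MB as "large"


@dataclass(frozen=True)
class PolicyAttribute:
    """Hypothetical policy attribute that one distributed file system advertises."""
    file_class: str  # "small" (high-IOPS backend) or "large" (high-bandwidth backend)
    replicas: int    # number of copies the backend keeps


def select_policy(size_bytes: int, temporary: bool = False,
                  important: bool = False) -> PolicyAttribute:
    """Return the policy attribute a file should be matched against."""
    file_class = "large" if size_bytes > ONE_MB else "small"
    if temporary:
        replicas = 1   # temporary data goes to a single-copy file system
    elif important:
        replicas = 3   # assumption: "multiple copies" is taken to mean 3 here
    else:
        replicas = 2   # assumption: default replica count, not given in the source
    return PolicyAttribute(file_class, replicas)


if __name__ == "__main__":
    print(select_policy(4 * 1024))                        # small file, default copies
    print(select_policy(50 * ONE_MB, important=True))     # large, multi-copy
```

A file of exactly 1 MB is classified as small, matching the "less than or equal to 1 MB" wording in the text.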

Abstract

A storage method and storage device for a distributed file system. The method comprises: aggregating configuration information about all the distributed file systems, the configuration information comprising a policy attribute for storing file data, a mounting point path of the distributed file system and an interface through which the distributed file system communicates with a server; when a write request for the file data is received, searching for the policy attribute matching the file data according to a pre-set storage policy; and calling the interface of the distributed file system corresponding to the policy attribute matching the file data, writing the file data into the corresponding distributed file system through the interface, and recording a mapping relationship between the path of the file data and the mounting point path of the distributed file system.

Description

Storage method and storage device for a distributed file system
Technical field
The present disclosure relates to the field of communication technologies, for example, to a storage method and a storage device for a distributed file system.
Background
Network storage devices are used in products such as color ring back tone (CRBT) services and Wireless Application Protocol (WAP) gateways. To meet the requirements of large capacity, high throughput, and high reliability, the price of network storage devices has risen exponentially, and these devices often account for more than 50% of the cost of the entire system. The Google File System (GFS) is a file storage technology developed by Google itself, and GFS has substantially reduced system cost.
A distributed file system is a general-purpose storage software platform that runs on general-purpose hardware and provides storage platform support for products that need storage services. It provides storage, query, retrieval, and management services for the massive data generated by those products (such as multimedia content storage and service data storage). Building a distributed file system on an inexpensive general-purpose hardware platform has become an inevitable trend in the development of various storage-type services. The distributed file system in the distributed file system platform adopts a distributed asymmetric software architecture. It separates functions from features and handles data storage and retrieval, access, and management. The platform has clearly divided modules and well-defined interfaces, which makes it easy to develop or deploy other products and projects on top of it.
Distributed file systems are widely used in fields such as video and mail, but a number of complex scenarios still constrain their use. For example, high data reliability is what users care about most, and a distributed file system inevitably runs into data consistency problems in use. In addition, different distributed file systems use different replica storage strategies and emphasize different aspects of storage performance for files of different sizes: some offer high input/output operations per second (IOPS) for small-file storage but poor large-file performance, while others offer high bandwidth for large-file transfer but low IOPS for small-file storage. Storage domains inside one distributed file system and storage domains across multiple distributed file systems use different storage namespaces, which makes sharing file data difficult. Without a unified storage policy for files of different sizes, storage space is wasted. When users replace an existing distributed file system, they often face a series of technical difficulties such as network replacement, server communication replacement, software replacement, and operation and maintenance replacement.
Summary
A storage method and a storage device for a distributed file system can manage distributed file systems in a unified way, make full use of their storage performance, improve storage utilization, and improve the user experience.
A storage method for a distributed file system, applied to a server, comprises:
aggregating configuration information of all distributed file systems, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server;
when a write request for file data is received, searching for a policy attribute that matches the file data according to a preset storage policy; and
invoking the interface of the distributed file system corresponding to the policy attribute matched by the file data, writing the file data into the corresponding distributed file system through the interface, and recording the mapping relationship between the path of the file data and the mount point path of the distributed file system.
Optionally, aggregating the configuration information of all distributed file systems includes:
when a distributed file system is detected, configuring the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file system may be an existing distributed file system or a newly added one; and
after the distributed file system has been configured, aggregating the configuration information of all distributed file systems.
Optionally, the method further includes:
when a read request for file data is received, calling, according to the mapping between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and reading the file data from the corresponding distributed file system through the interface.
Optionally, the method further includes: when a write request for file data is received, encrypting the file data to be written; and
the method further includes: when a read request for file data is received, decrypting the encrypted file data to be read.
Optionally, the method further includes:
determining, according to the aggregated configuration information of all distributed file systems, the other distributed file systems, that is, those other than the distributed file system into which the file data was written; and
calling the interfaces of the other distributed file systems, and remotely backing up the file data written into the corresponding distributed file system to the other distributed file systems.
Optionally, the method further includes:
when a deletion request for file data is received, calling, according to the mapping between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and deleting the file data from the corresponding distributed file system; and
if the file data has a remote backup in the other distributed file systems, deleting the file data from the other distributed file systems that hold the remote backup.
A storage device for a distributed file system, deployed on a server, includes:
a system extensibility module, configured to aggregate configuration information of all distributed file systems, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server;
a policy selector, configured to search, when a write request for file data is received and according to a preset storage policy, for the policy attribute that matches the file data; and
a data distribution module, configured to call the interface of the distributed file system corresponding to the policy attribute that matches the file data, write the file data into the corresponding distributed file system through the interface, and record a mapping between the path of the file data and the mount point path of the distributed file system.
Optionally, the system extensibility module is configured to:
when a distributed file system is detected, configure the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file system may be an existing distributed file system or a newly added one; and
after the distributed file system has been configured, aggregate the configuration information of all distributed file systems.
Optionally, the data distribution module is further configured to:
when a read request for file data is received, call, according to the mapping between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and read the file data from the corresponding distributed file system through the interface.
Optionally, the device further includes:
a data encryption and decryption module, configured to encrypt the file data to be written when a write request for file data is received, and to decrypt the encrypted file data to be read when a read request for file data is received.
Optionally, the data distribution module is further configured to:
determine, according to the aggregated configuration information of all distributed file systems, the other distributed file systems, that is, those other than the distributed file system into which the file data was written; and
call the interfaces of the other distributed file systems, and remotely back up the file data written into the corresponding distributed file system to the other distributed file systems.
Optionally, the data distribution module is further configured to:
when a deletion request for file data is received, call, according to the mapping between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and delete the file data from the corresponding distributed file system; and
if the file data has a remote backup in the other distributed file systems, delete the file data from the other distributed file systems that hold the remote backup.
A computer-readable storage medium stores computer-executable instructions configured to perform the above method.
A server includes:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the above method.
By aggregating the configuration information of the distributed file systems, the storage method and storage device schedule different distributed file systems in a unified way to write, read, and delete file data. Storage policies are configured in advance and file data is written according to them, which improves the storage utilization of the server. File data can be encrypted so that the data stored in a distributed file system is not visible to the underlying system, which protects the file data. File data can also be backed up remotely for disaster recovery, which safeguards the reliability of important file data. The storage method and storage device therefore make full use of the read and write performance of the distributed file systems and improve the user experience.
Brief description of the drawings
The drawings are provided to aid understanding of the technical solution and form part of the specification. Together with the following embodiments they explain the technical solution and do not limit it.
FIG. 1 is a schematic structural diagram of a storage device for a distributed file system in an embodiment;
FIG. 2 is a schematic flowchart of a storage method for a distributed file system in an embodiment;
FIG. 3 is a schematic diagram of the steps of a storage method for a distributed file system in an embodiment; and
FIG. 4 is a schematic diagram of the hardware structure of a server in an embodiment.
Detailed description
To make the technical solution clearer, the embodiments are described in detail below with reference to the drawings. Where no conflict arises, the following embodiments and the features in them may be combined with one another in any way.
The steps shown in the flowcharts of the drawings may be executed in a computer system, such as one running a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that of the following embodiments.
FIG. 1 is a schematic structural diagram of the storage device for a distributed file system in this embodiment. As shown in FIG. 1, the device may be deployed on a server. By scheduling, in a unified way, at least two distributed file systems, at least two storage domains, and file systems in at least two namespaces, the device stores files and provides functions such as high scalability and high availability across the at least two distributed file systems, remote backup, policy generation, and data encryption and decryption.
The storage device for a distributed file system in this embodiment may include a system extensibility module, a policy selector, and a data distribution module.
The system extensibility module is configured to aggregate configuration information of all distributed file systems, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server.
The policy attribute for storing file data may cover storage strategies such as the distribution of large and small files, the distribution of cold and hot data, and the number of replicas. A large file may be a file larger than 1 MB, and a small file may be a file of 1 MB or less; hot data may be data that users access frequently, and cold data may be data that users access infrequently.
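As a minimal sketch of how such policy attributes might be represented, the following Python fragment classifies file data by the 1 MB boundary given above. The names `PolicyAttributes`, `size_class`, and `temperature`, and the access-count threshold used to separate hot from cold data, are illustrative assumptions; the description fixes only the 1 MB boundary.

```python
from dataclasses import dataclass

# 1 MB boundary between small and large files, as stated in the description.
SMALL_FILE_LIMIT = 1024 * 1024


@dataclass(frozen=True)
class PolicyAttributes:
    """Hypothetical record of the policy attributes of one distributed file system."""
    size_class: str    # "small" or "large"
    temperature: str   # "hot" or "cold"
    replicas: int      # number of copies the file system keeps


def size_class(size_bytes: int) -> str:
    """Files larger than 1 MB are large; files of 1 MB or less are small."""
    return "large" if size_bytes > SMALL_FILE_LIMIT else "small"


def temperature(accesses_per_day: float, hot_threshold: float = 10.0) -> str:
    """Assumed rule: data accessed at least hot_threshold times a day is hot."""
    return "hot" if accesses_per_day >= hot_threshold else "cold"


# A 4 KB file that is read 50 times a day, kept on a three-replica system.
example = PolicyAttributes(size_class(4096), temperature(50), replicas=3)
```

A file of exactly 1 MB falls into the small class, matching the "less than or equal to 1 MB" wording.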
Optionally, the system extensibility module is configured to: when a distributed file system is detected, configure the policy attribute, the mount point path, and the interface of the distributed file system, where the distributed file system may be an existing distributed file system or a newly added one; and, after the distributed file system has been configured, aggregate the configuration information of all distributed file systems and notify the data distribution module, so that the data distribution module can schedule them in a unified way.
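The aggregation and notification behaviour described here could be sketched as follows. The class and field names (`FsConfig`, `ExtensibilityModule`, `register`, `aggregate`), the callback mechanism, and the mount paths are assumptions made for illustration, not details given in the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class FsConfig:
    """Configuration information of one distributed file system (hypothetical layout)."""
    name: str
    mount_point: str          # mount point path on the server
    policy: str               # policy attribute, simplified to a label here
    interface: str = "posix"  # interface used to talk to the file system


class ExtensibilityModule:
    """Collects per-file-system configuration and exposes the aggregated view."""

    def __init__(self, notify: Optional[Callable[[Dict[str, FsConfig]], None]] = None):
        self._configs: Dict[str, FsConfig] = {}
        self._notify = notify  # stands in for "notify the data distribution module"

    def register(self, config: FsConfig) -> None:
        # Called when an existing or newly added file system is detected.
        self._configs[config.name] = config
        if self._notify is not None:
            self._notify(self.aggregate())

    def aggregate(self) -> Dict[str, FsConfig]:
        # Return a copy so callers cannot mutate the registry.
        return dict(self._configs)


seen = []
module = ExtensibilityModule(notify=lambda view: seen.append(sorted(view)))
module.register(FsConfig("fs1", "/mnt/fs1", "small-files"))
module.register(FsConfig("fs2", "/mnt/fs2", "large-files"))
```

Each registration triggers one notification carrying the full aggregated view, so a newly added file system becomes schedulable without reconfiguring the ones already present.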
The policy selector is configured to search, when a write request for file data is received and according to a preset storage policy, for the policy attribute that matches the file data.
Optionally, the preset storage policy may select a distributed file system according to the size or the type of the file data.
For example, for file data of different sizes, small files are stored in a distributed file system with high IOPS and large files in a distributed file system with high bandwidth, so that the storage performance of the storage systems is fully used.
For example, for file data of different types, temporary file data may be stored in a single-replica distributed file system, which reduces the impact of temporary file data on file system storage performance; important file data may be stored in a multi-replica distributed file system, or backed up remotely through the data distribution module for disaster recovery, which safeguards the reliability of important file data.
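The two examples above can be sketched as a small selection function. The file system names, the attribute layout in `FILE_SYSTEMS`, and the `kind` labels are hypothetical; only the rules themselves (small files to high IOPS, large files to high bandwidth, temporary files to a single replica, important files to multiple replicas) come from the description.

```python
from typing import Dict, List, Optional

MB = 1024 * 1024

# Hypothetical aggregated policy attributes of three distributed file systems.
FILE_SYSTEMS: List[Dict] = [
    {"name": "fs_iops", "size_class": "small", "replicas": 3},       # high IOPS
    {"name": "fs_bandwidth", "size_class": "large", "replicas": 3},  # high bandwidth
    {"name": "fs_scratch", "size_class": "any", "replicas": 1},      # single replica
]


def select_file_system(size_bytes: int, kind: str = "normal",
                       file_systems: List[Dict] = FILE_SYSTEMS) -> Optional[str]:
    """Return the name of the first file system whose attributes match the file."""
    if kind == "temporary":
        # Temporary data goes to a single-replica system.
        candidates = [fs for fs in file_systems if fs["replicas"] == 1]
    else:
        wanted = "large" if size_bytes > MB else "small"
        candidates = [fs for fs in file_systems if fs["size_class"] == wanted]
        if kind == "important":
            # Important data must land on a multi-replica system.
            candidates = [fs for fs in candidates if fs["replicas"] >= 2]
    return candidates[0]["name"] if candidates else None
```

Returning `None` when nothing matches leaves the fallback decision to the caller; the description does not say what should happen in that case.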
The data distribution module is configured to call the interface of the distributed file system corresponding to the policy attribute that matches the file data, write the file data into the corresponding distributed file system through the interface, and record the mapping between the path information of the file data and the mount point path information of the distributed file system, which facilitates subsequent distribution of file data.
Optionally, the interfaces that communicate with the data distribution module follow the unified Portable Operating System Interface (POSIX); the POSIX standard defines the interfaces that an operating system provides to applications.
When a client calls an input/output (IO) interface, the data distribution module may call the interfaces of at least two distributed file systems to read and write file data, and record the mapping between the path information of the file data and the mount point path information of the distributed file system, which facilitates subsequent distribution of file data.
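Because the attached file systems are reached through POSIX-style interfaces, the write path and the recorded mapping can be sketched with ordinary file operations on a mount point. In this sketch a temporary directory stands in for a real mount point, and the `DataDistributor` class and its in-memory dictionary are assumptions; the embodiment keeps the mapping on an SSD.

```python
import os
import tempfile


class DataDistributor:
    """Writes files under a chosen mount point and remembers where each one went."""

    def __init__(self) -> None:
        # Mapping: path of the file data -> mount point path of the file system.
        self.path_to_mount = {}

    @staticmethod
    def _target(file_path: str, mount_point: str) -> str:
        return os.path.join(mount_point, file_path.lstrip("/"))

    def write(self, file_path: str, data: bytes, mount_point: str) -> None:
        target = self._target(file_path, mount_point)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(data)
        self.path_to_mount[file_path] = mount_point  # record the mapping

    def read(self, file_path: str) -> bytes:
        mount_point = self.path_to_mount[file_path]  # look up the mapping
        with open(self._target(file_path, mount_point), "rb") as handle:
            return handle.read()


# A temporary directory plays the role of the mount point of file system 1.
mount_fs1 = tempfile.mkdtemp(prefix="fs1_")
distributor = DataDistributor()
distributor.write("/video/clip.bin", b"frame-data", mount_fs1)
restored = distributor.read("/video/clip.bin")
```

The caller of `read` never names a file system; the recorded mapping alone decides which mount point is consulted.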
As required, for example according to the importance of the file data, the data distribution module may also back up important file data, synchronously or asynchronously, to other distributed file systems, thereby providing a remote backup of important file data. The data distribution module may further be configured to: when a deletion request for file data is received, delete the file data from the corresponding distributed file system according to the mapping between the path of the file data and the mount point path of the distributed file system; and, if the file data has a remote backup in the other distributed file systems, delete the file data from the distributed file systems that hold the remote backup.
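A minimal sketch of remote backup and deletion follows, again using temporary directories in place of real mount points. The helper names `located`, `backup`, and `delete` are hypothetical, and the backup shown is synchronous for simplicity, whereas the description allows either synchronous or asynchronous backup.

```python
import os
import shutil
import tempfile


def located(mount_point: str, file_path: str) -> str:
    """Physical location of a file under a given mount point."""
    return os.path.join(mount_point, file_path.lstrip("/"))


def backup(file_path: str, primary: str, all_mounts: list) -> list:
    """Copy the file from its primary file system to every other one."""
    others = [m for m in all_mounts if m != primary]
    for mount_point in others:
        target = located(mount_point, file_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(located(primary, file_path), target)
    return others


def delete(file_path: str, primary: str, backups: list) -> None:
    """Delete the file from its primary file system and from any remote backups."""
    for mount_point in [primary, *backups]:
        target = located(mount_point, file_path)
        if os.path.exists(target):
            os.remove(target)


fs1 = tempfile.mkdtemp(prefix="fs1_")
fs2 = tempfile.mkdtemp(prefix="fs2_")
with open(located(fs1, "/report.txt"), "wb") as handle:
    handle.write(b"important")

copies = backup("/report.txt", fs1, [fs1, fs2])
backed_up = os.path.exists(located(fs2, "/report.txt"))
delete("/report.txt", fs1, copies)
remaining = [m for m in (fs1, fs2) if os.path.exists(located(m, "/report.txt"))]
```

Deleting from the backup locations as well keeps the two file systems consistent, so a removed file cannot later be served from its copy.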
Optionally, the storage device for a distributed file system includes a data encryption and decryption module, configured to encrypt the file data to be written when a write request for file data is received, and to decrypt the encrypted file data to be read when a read request for file data is received.
Optionally, the data encryption and decryption module provides the encryption and decryption algorithms. Because file data is encrypted on write and decrypted on read, the file data stored in a distributed file system is not visible to the underlying system, which protects the file data.
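The description does not name an encryption algorithm. Purely to illustrate where encryption and decryption sit in the write and read paths, the sketch below uses a stand-in stream cipher built from SHA-256; it is not the algorithm of the embodiment and is not suitable for production use (it has no nonce and no integrity protection). The function names are hypothetical.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(data: bytes, key: bytes) -> bytes:
    """Applied on the write path, before the data reaches the file system."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def decrypt(data: bytes, key: bytes) -> bytes:
    """Applied on the read path; XOR with the same keystream restores the data."""
    return encrypt(data, key)


secret = b"private data"
stored = encrypt(secret, b"demo-key")    # what the underlying file system sees
recovered = decrypt(stored, b"demo-key")  # what the user process gets back
```

In a real deployment a vetted authenticated cipher from an established cryptography library would take the place of these two functions.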
In this embodiment, the basic hardware configuration may be as follows: a central processing unit (CPU) and a solid state drive (SSD). The CPU is an Intel E5-2620 with 6 cores and 12 threads, clocked at 2500 MHz or higher. The SSD records the mapping between the path information of the file data and the mount point path information of the distributed file system, and has a storage capacity greater than 96 GB.
Based on the device shown in FIG. 1, and as shown in FIG. 2, this embodiment provides a storage method for a distributed file system.
In step 210, the configuration information of all distributed file systems is aggregated, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server.
In step 220, when a write request for file data is received, the policy attribute that matches the file data is searched for according to a preset storage policy.
In step 230, the interface of the distributed file system corresponding to the policy attribute that matches the received file data is called, the received file data is written into the corresponding distributed file system through the interface, and the mapping between the path of the file data and the mount point path of the distributed file system is recorded.
Based on the method shown in FIG. 2, and as shown in FIG. 3, this embodiment provides a storage method for a distributed file system.
In step 310, the system extensibility module aggregates the configuration information of all distributed file systems, the configuration information including a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server.
In step 310, the system extensibility module may configure each distributed file system, including its mount point path, policy attribute, and interface.
The system extensibility module aggregates the configuration information of all distributed file systems. When a newly added distributed file system is found, its configuration information can be added to the aggregate. The distributed file systems may use different replica redundancy policies; an attached distributed file system only needs to satisfy POSIX semantics.
After configuration is complete, the data distribution module is notified so that it can schedule the file systems in a unified way.
In step 320, when a write request for file data is received, the file data to be written is encrypted.
In step 320, when the user process receives a write request for file data, the data encryption and decryption module determines whether the file data needs to be encrypted; for example, if the file data is private data, the data encryption and decryption module may encrypt it.
In step 330, the policy attribute that matches the file data is searched for according to a preset storage policy.
In step 330, the preset storage policy may select a distributed file system according to, for example, the size or the type of the file data. The policy selector may select the corresponding distributed file system for the file data according to the preset storage policy.
In step 340, the interface of the distributed file system corresponding to the file data is called, and the file data is read or written through the interface.
In step 340, the data distribution module may call the interface of the distributed file system corresponding to the policy attribute that matches the file data, and write the file data into the corresponding distributed file system, for example into distributed file system 1.
In step 350, the file data is remotely backed up to other distributed file systems.
In step 350, as required, for example when the file data is an important file, the data distribution module may distribute the file data asynchronously to other distributed file systems, for example by remotely backing it up to distributed file system 2, which safeguards the reliability of important file data.
Optionally, the other distributed file systems, that is, those other than the distributed file system into which the file data was written, are determined according to the aggregated configuration information of all distributed file systems; the interfaces of the other distributed file systems are called; and the file data is remotely backed up to the other distributed file systems.
In step 360, the mapping between the path of the file data and the mount point path of the distributed file system is recorded.
In step 360, the data distribution module may also record the mapping between the path of the file data and the mount point path of the distributed file system, which facilitates subsequent distribution of file data.
In step 370, when a read request for file data is received, the interface of the distributed file system corresponding to the file data is called according to the mapping between the path of the file data and the mount point path of the distributed file system, and the file data is read from the corresponding distributed file system through the interface.
In step 370, when the user process receives a read request for file data, the data distribution module looks up the previously recorded mapping between the path of the file data and the mount point path of the distributed file system, determines the corresponding distributed file system, and reads the file data from it.
In step 380, the file data to be read is decrypted.
In step 380, if the file data to be read is encrypted, the data encryption and decryption module decrypts it.
In step 390, the file data is read out.
In step 390, the file data to be read is returned to the user process.
The storage method for a distributed file system may further include:
when the user process receives a deletion request for file data, the data distribution module looks up the previously recorded mapping between the path of the file data and the mount point path of the distributed file system, determines the corresponding distributed file system, calls the interface of that distributed file system, and deletes the file data from it; and
if the file data has been remotely backed up to other distributed file systems, the data distribution module also determines, according to the aggregated configuration information of all distributed file systems, the other distributed file systems, that is, those other than the distributed file system into which the file data was written, calls their interfaces, and deletes the file data from the distributed file systems that hold the remote backup.
By aggregating the configuration information of the distributed file systems, the storage method and storage device of this embodiment schedule at least two distributed file systems in a unified way to write, read, and delete file data. Storage policies are configured in advance and file data is written according to them, which improves the storage utilization of the server. File data can be encrypted so that the data stored in a distributed file system is not visible to the underlying system, which protects the file data. File data can also be backed up remotely for disaster recovery, which safeguards the reliability of important file data.
The storage method and storage device of this embodiment therefore make full use of the read and write performance of the distributed file systems and improve the user experience.
This embodiment provides a computer-readable storage medium storing computer-executable instructions configured to perform the method of any of the above embodiments.
This embodiment provides a schematic diagram of the hardware structure of a server. Referring to FIG. 4, the server includes:
at least one processor 40 (FIG. 4 shows one processor 40 as an example) and a memory 41; the server may further include a communications interface 42 and a bus 43. The processor 40, the memory 41, and the communications interface 42 may communicate with one another through the bus 43. The communications interface 42 may be used to transmit information. The processor 40 may call logic instructions in the memory 41 to perform the methods of the above embodiments.
In addition, the logic instructions in the memory 41 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
As a computer-readable storage medium, the memory 41 may be used to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods in the above embodiments. By running the software programs, instructions, or modules stored in the memory 41, the processor 40 executes functional applications and data processing, that is, implements the methods in the above method embodiments.
The memory 41 may include a program storage area and a data storage area. The program storage area may store an operating system and an application required for at least one function; the data storage area may store data created through use of the terminal device, and the like. In addition, the memory 41 may include high-speed random access memory and may also include non-volatile memory.
上述实施例可借助软件加通用硬件平台的方式来实现。以上技术方案本质上可以以软件产品的形式体现出来,该计算机软件产品可以存储在存储介质中,如只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟、光盘等,包括一个或多个指令用以使得一台计算机设备(可以是个人计算机,服务器通信,或者网络设备等)执行上述实施方式或者实施方式的部分所述的方法。The above embodiments can be implemented by means of software plus a general hardware platform. The above technical solution can be embodied in the form of a software product, which can be stored in a storage medium, such as a read-only memory (ROM), a random access memory (RAM), A disk, an optical disk, etc., comprising one or more instructions for causing a computer device (which may be a personal computer, server communication, or network device, etc.) to perform the methods described in the preceding embodiments or embodiments.
Industrial applicability
The storage method and storage device for a distributed file system can manage distributed file systems in a unified manner, which improves storage utilization and the user experience.

Claims (13)

  1. A storage method for a distributed file system, applied to a server, comprising:
    aggregating configuration information of all distributed file systems, the configuration information comprising a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server;
    when a write request for file data is received, searching, according to a preset storage policy, for a policy attribute that matches the file data; and
    invoking the interface of the distributed file system corresponding to the policy attribute that matches the file data, writing the file data into the corresponding distributed file system through the interface, and recording a mapping relationship between a path of the file data and the mount point path of the distributed file system.
  2. The method according to claim 1, wherein aggregating the configuration information of all distributed file systems comprises:
    when a distributed file system is detected, configuring the policy attribute of the distributed file system, the mount point path of the distributed file system, and the interface of the distributed file system, the distributed file system comprising an existing distributed file system and a newly added distributed file system; and
    after configuration of the distributed file system is completed, aggregating the configuration information of all distributed systems.
  3. The method according to claim 1, further comprising:
    when a read request for file data is received, invoking, according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and reading the file data from the corresponding distributed file system through the interface.
  4. The method according to claim 3, further comprising: when a write request for file data is received, encrypting the file data to be written; and
    when a read request for file data is received, decrypting the encrypted file data to be read.
  5. The method according to claim 1, further comprising:
    determining, according to the aggregated configuration information of all distributed file systems, another distributed file system other than the corresponding distributed file system into which the file data is written; and
    invoking an interface of the other distributed file system to remotely back up, to the other distributed file system, the file data written into the corresponding distributed file system.
  6. The method according to claim 5, further comprising:
    when a deletion request for file data is received, invoking, according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and deleting the file data from the corresponding distributed file system; and
    if the file data has a remote backup in the other distributed system, deleting the file data from the other distributed file system that holds the remote backup.
  7. A storage device for a distributed file system, deployed on a server, comprising:
    a system extensibility module, configured to aggregate configuration information of all distributed file systems, the configuration information comprising a policy attribute for storing file data, a mount point path of the distributed file system, and an interface through which the distributed file system communicates with the server;
    a policy selector, configured to search, according to a preset storage policy, for a policy attribute that matches file data when a write request for the file data is received; and
    a data distribution module, configured to invoke the interface of the distributed file system corresponding to the policy attribute that matches the file data, write the file data into the corresponding distributed file system through the interface, and record a mapping relationship between a path of the file data and the mount point path of the distributed file system.
  8. The device according to claim 7, wherein the system extensibility module is configured to:
    when a distributed file system is detected, configure the policy attribute of the distributed file system, the mount point path of the distributed file system, and the interface of the distributed file system, the distributed file system comprising an existing distributed file system and a newly added distributed file system; and
    after configuration of the distributed file system is completed, aggregate the configuration information of all distributed systems.
  9. The device according to claim 7, wherein the data distribution module is further configured to:
    when a read request for file data is received, invoke, according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and read the file data from the corresponding distributed file system through the interface.
  10. The device according to claim 9, further comprising:
    a data encryption and decryption module, configured to encrypt the file data to be written when a write request for file data is received, and to decrypt the encrypted file data to be read when a read request for file data is received.
  11. The device according to claim 7, wherein the data distribution module is further configured to:
    determine, according to the aggregated configuration information of all distributed file systems, another distributed file system other than the corresponding distributed file system into which the file data is written; and
    invoke an interface of the other distributed file system to remotely back up, to the other distributed file system, the file data written into the corresponding distributed file system.
  12. The device according to claim 11, wherein the data distribution module is further configured to:
    when a deletion request for file data is received, invoke, according to the mapping relationship between the path of the file data and the mount point path of the distributed file system, the interface of the distributed file system corresponding to the file data, and delete the file data from the corresponding distributed file system; and
    if the file data has a remote backup in the other distributed system, delete the file data from the other distributed file system that holds the remote backup.
  13. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to perform the method according to any one of claims 1 to 6.
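
For illustration only, the write, read, remote backup, and delete flows recited in claims 1, 3, 5, and 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described in the application: the names `InMemoryDFS`, `DFSConfig`, and `StorageServer`, the size-based storage policy with its `"small"` and `"large"` policy attributes and 1024-byte threshold, and the choice of the first other registered file system as the backup target are all hypothetical. The encryption step of claim 4 is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class InMemoryDFS:
    """Hypothetical stand-in for the interface of one distributed file system."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]

    def delete(self, path: str) -> None:
        self.files.pop(path, None)


@dataclass
class DFSConfig:
    """Configuration information of one distributed file system."""
    policy_attribute: str   # assumed values: "small" or "large"
    mount_point: str        # mount point path of the file system
    interface: InMemoryDFS  # interface the server uses to reach the file system


@dataclass
class StorageServer:
    configs: List[DFSConfig] = field(default_factory=list)       # aggregated configuration
    path_to_mount: Dict[str, str] = field(default_factory=dict)  # file path -> mount point
    backups: Dict[str, str] = field(default_factory=dict)        # file path -> backup mount point

    def register(self, cfg: DFSConfig) -> None:
        """Aggregate the configuration of an existing or newly added file system."""
        self.configs.append(cfg)

    def _policy_attribute_for(self, data: bytes) -> str:
        # Assumed preset storage policy: route by file size.
        return "small" if len(data) < 1024 else "large"

    def _by_mount(self, mount_point: str) -> DFSConfig:
        return next(c for c in self.configs if c.mount_point == mount_point)

    def write(self, path: str, data: bytes, backup: bool = False) -> str:
        # Find the policy attribute matching the file data, then the file system that has it.
        attr = self._policy_attribute_for(data)
        cfg = next(c for c in self.configs if c.policy_attribute == attr)
        cfg.interface.write(path, data)
        # Record the mapping between the file path and the mount point path.
        self.path_to_mount[path] = cfg.mount_point
        if backup:
            # Assumed choice: back up to the first other registered file system.
            other = next((c for c in self.configs if c is not cfg), None)
            if other is not None:
                other.interface.write(path, data)
                self.backups[path] = other.mount_point
        return cfg.mount_point

    def read(self, path: str) -> bytes:
        # Resolve the file system through the recorded mapping.
        return self._by_mount(self.path_to_mount[path]).interface.read(path)

    def delete(self, path: str) -> None:
        # Delete the primary copy, then the remote backup if one was recorded.
        self._by_mount(self.path_to_mount.pop(path)).interface.delete(path)
        backup_mount = self.backups.pop(path, None)
        if backup_mount is not None:
            self._by_mount(backup_mount).interface.delete(path)
```

In this sketch, a caller would register one `DFSConfig` per file system, after which `write` routes each file by the assumed size policy and `read` and `delete` rely only on the recorded path-to-mount-point mapping.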
PCT/CN2017/085338 2016-05-30 2017-05-22 Storage method and storage device for distributed file system WO2017206754A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610371376.2 2016-05-30
CN201610371376.2A CN107451138A (en) 2016-05-30 2016-05-30 A kind of distributed file system storage method and system

Publications (1)

Publication Number Publication Date
WO2017206754A1 (en) 2017-12-07

Family

ID=60478475

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085338 WO2017206754A1 (en) 2016-05-30 2017-05-22 Storage method and storage device for distributed file system

Country Status (2)

Country Link
CN (1) CN107451138A (en)
WO (1) WO2017206754A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408270A (en) * 2018-10-18 2019-03-01 郑州云海信息技术有限公司 A kind of processing method and processing device of read-write operation
CN111444157A (en) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 Distributed file system and data access method
CN112506434A (en) * 2020-12-11 2021-03-16 杭州安恒信息技术股份有限公司 Method and related device for reading and writing data in web micro-service cluster

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875035B (en) * 2018-06-25 2022-02-18 郑州云海信息技术有限公司 Data storage method of distributed file system and related equipment
CN109033869A (en) * 2018-07-04 2018-12-18 深圳虚觅者科技有限公司 Encrypted file system hanging method and device
CN110795386B (en) * 2018-07-31 2022-07-01 杭州海康威视系统技术有限公司 Data writing method and server
CN111142777A (en) * 2018-11-03 2020-05-12 广州市明领信息科技有限公司 Big data storage system
CN110221990B (en) * 2019-04-26 2021-10-08 奇安信科技集团股份有限公司 Data storage method and device, storage medium and computer equipment
CN112733189A (en) * 2021-01-14 2021-04-30 浪潮云信息技术股份公司 System and method for realizing file storage server side encryption

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101800655A (en) * 2009-02-05 2010-08-11 李冰 Peer-to-peer service system establishing method for contributing resources to application of large-scale internet
CN102546664A (en) * 2012-02-27 2012-07-04 中国科学院计算技术研究所 User and authority management method and system for distributed file system
US20140304299A1 (en) * 2013-03-15 2014-10-09 Emc Corporation Data management in a multi-tenant distributive environment
CN104113597A (en) * 2014-07-18 2014-10-22 西安交通大学 Multi- data-centre hadoop distributed file system (HDFS) data read-write system and method
CN104281506A (en) * 2014-07-10 2015-01-14 中国科学院计算技术研究所 Data maintenance method and system for file system
CN105095294A (en) * 2014-05-15 2015-11-25 中兴通讯股份有限公司 Method and device for managing heterogeneous copy in distributed storage system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6883029B2 (en) * 2001-02-14 2005-04-19 Hewlett-Packard Development Company, L.P. Separate read and write servers in a distributed file system
US7222231B2 (en) * 2001-04-19 2007-05-22 Hewlett-Packard Development Company, L.P. Data security for distributed file systems
US7280956B2 (en) * 2003-10-24 2007-10-09 Microsoft Corporation System, method, and computer program product for file encryption, decryption and transfer
US8667273B1 (en) * 2006-05-30 2014-03-04 Leif Olov Billstrom Intelligent file encryption and secure backup system
US20090222509A1 (en) * 2008-02-29 2009-09-03 Chao King System and Method for Sharing Storage Devices over a Network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408270A (en) * 2018-10-18 2019-03-01 郑州云海信息技术有限公司 A kind of processing method and processing device of read-write operation
CN109408270B (en) * 2018-10-18 2021-12-03 郑州云海信息技术有限公司 Read-write operation processing method and device
CN111444157A (en) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 Distributed file system and data access method
CN111444157B (en) * 2019-01-16 2023-06-20 阿里巴巴集团控股有限公司 Distributed file system and data access method
CN112506434A (en) * 2020-12-11 2021-03-16 杭州安恒信息技术股份有限公司 Method and related device for reading and writing data in web micro-service cluster

Also Published As

Publication number Publication date
CN107451138A (en) 2017-12-08

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17805699; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 17805699; Country of ref document: EP; Kind code of ref document: A1