CN116955278A - Aggregation access method and device for distributed file system snapshot and computer equipment

Info

Publication number: CN116955278A
Application number: CN202310974299.XA
Authority: CN
Inventors: 吴军疆; 刘卫乾; 姚娜; 何召展; 夏瑞虎; 邢迪
Original assignee: Orca Data Technology Xian Co Ltd
Current assignee: Orca Data Technology Xian Co Ltd
Priority date: 2023-08-03
Filing date: 2023-08-03
Publication date: 2023-10-27

Abstract

The application discloses an aggregate access method, device, computer device and storage medium for distributed file system snapshots. The method first uses the same service on a management server, a metadata server and a chunk server to simultaneously manage, in the back-end disk array of each server, the data set corresponding to the source data and the data sets corresponding to a plurality of internal snapshots; it then organizes the plurality of internal snapshots as directory trees within a global unified namespace; finally, it enters the internal snapshot unified management directory in the global unified namespace and, under that management directory, obtains a snapshot view of the subdirectory corresponding to each target snapshot to be accessed, thereby completing access to the data sets corresponding to all internal snapshots in the global unified namespace. With this method and device, loading snapshot data does not require starting a new instance, software complexity is reduced, and source files and snapshot files can be accessed through one unified namespace, which makes access more convenient.

Description

Aggregation access method and device for distributed file system snapshot and computer equipment

Technical Field

The present application relates to the field of distributed storage technologies, and in particular to an aggregate access method and device for distributed file system snapshots, and a computer device.

Background

In a distributed file system, snapshots allow the system to preserve the state of data at a particular point in time, which is important for protecting critical data. If data corruption, accidental deletion, or other problems occur, a snapshot can be used to revert to a previous state. Snapshots also serve as an efficient backup strategy: only the data changed since the last snapshot needs to be saved, rather than a copy of the whole file system, which saves storage space and bandwidth.

A distributed file system consists of a management server, a metadata server and a chunk server. The metadata server stores the metadata of all files in the file system, and the chunk server stores the data blocks of all files. The back end of each server is connected to one or more disk arrays. In such a system, internal snapshots can be created at the same moment for all disk arrays used by the management server, the metadata server and the chunk server; each server creates the snapshot of its own disk arrays, and the snapshots taken at the same moment on all servers together form the snapshot of the whole distributed file system.

In existing distributed file systems, a directory tree maintains the data of the whole file system, and the current-version data on all servers forms the directory tree of the current version. To access a snapshot, service instances for the different versions are started on each server, the snapshot data of the corresponding version is loaded, and a separate directory tree is formed for each version; snapshot data of a specific version can be accessed once it is mounted. However, because snapshot data of different versions corresponds to different directory trees, loading snapshot data requires starting a new service instance, and loading complexity and resource consumption grow as the number of servers and snapshots increases.

Disclosure of Invention

Based on the above, it is necessary to provide an aggregate access method, an aggregate access device and a computer device for distributed file system snapshots, so that the snapshot data is loaded without starting a new instance, the software complexity is reduced, and the source file and the snapshot file can be accessed through a unified naming space, thereby facilitating access.

In a first aspect, the present application provides an aggregate access method for distributed file system snapshots, where the method includes:

using the same service on the management server, the metadata server and the chunk server to simultaneously manage the data set corresponding to the source data and the data sets corresponding to a plurality of internal snapshots in the back-end disk array of each server;

aggregating the plurality of internal snapshots in a directory tree organization into a global unified namespace;

and entering an internal snapshot unified management directory in the global unified naming space, and under the management directory, obtaining a snapshot view of the subdirectory corresponding to each access target snapshot to finish the access of all data sets corresponding to the internal snapshots in the global unified naming space.

In one embodiment, aggregating the plurality of internal snapshots organized in a directory tree fashion in a global unified namespace includes:

adding a snapshot preset total entry in the global unified name space;

creating a corresponding snapshot entry for each internal snapshot within a snapshot preset total entry;

constructing a directory tree of each internal snapshot;

and connecting the snapshot preset total entry node with the root node of the directory tree corresponding to the internal snapshot.

In one embodiment, obtaining a snapshot view of the subdirectory corresponding to each access target snapshot includes:

accessing a subdirectory corresponding to the target snapshot under the management directory;

acquiring a target snapshot identification of a subdirectory corresponding to the access target snapshot;

the server receives a data operation request of the client for the target snapshot identifier;

the server reads, from back-end storage, the data information of the subdirectory corresponding to the target snapshot and the data information of the target snapshot identifier, and returns the data information of the target snapshot identifier to the client;

and traversing all files under the sub-directory corresponding to the access target snapshot by the client to generate an access target snapshot view.

In one embodiment, the snapshot identifier consists of a snapshot mark and a snapshot name.

In a second aspect, the present application also provides an aggregate access device for distributed file system snapshots, where the device includes:

the management module is used for simultaneously managing the data sets corresponding to the source data in the disk arrays at the rear end of each server and the data sets corresponding to a plurality of internal snapshots by using the same service on the management server, the metadata server and the chunk server;

the aggregation module is used for organizing and aggregating a plurality of internal snapshots in a directory tree form into a global unified naming space;

and the access module is used for entering an internal snapshot unified management directory in the global unified naming space, acquiring snapshot views of subdirectories corresponding to each access target snapshot under the management directory, and completing access of all data sets corresponding to the internal snapshots in the global unified naming space.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:

The application has the beneficial effects that:

(1) According to the application, the same service simultaneously manages the different versions of data in the back-end disk arrays of each server, so a server does not need to start multiple service instances to load the different versions. This reduces the difficulty of implementing and maintaining the snapshot function and improves reliability.

(2) According to the application, the internal snapshots of all versions are aggregated in the global unified name space, so that the complexity of accessing the snapshots of different versions is reduced, and the availability of snapshot functions is enhanced.

Drawings

FIG. 1 is a schematic flow diagram of an aggregate access method for distributed file system snapshots according to an embodiment of the present application;

FIG. 2 is a flowchart illustrating another method for aggregate access of distributed file system snapshots according to an embodiment of the application;

FIG. 3 is a flowchart illustrating another method for aggregate access of distributed file system snapshots according to an embodiment of the application;

FIG. 4 is a schematic diagram of a framework for managing multiple versions of data for a single service provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a total directory tree structure provided by an embodiment of the present application;

FIG. 6 is a diagram of data maintenance in a globally uniform namespace in accordance with an embodiment of the present application;

FIG. 7 is a diagram of the internal architecture of a computing device in one embodiment of the application.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The embodiment of the application provides an aggregation access method, an aggregation access device, computer equipment and a storage medium for distributed file system snapshots, aiming at reducing snapshot access complexity. The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail by examples and with reference to the accompanying drawings. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.

It should be noted that, in the method for aggregating and accessing distributed file system snapshots provided by the embodiment of the present application, the execution body may be an aggregate access device of the distributed file system snapshots, where the aggregate access device of the distributed file system snapshots may be implemented by software, hardware, or a combination of software and hardware into part or all of a computer device, and the computer device may be a server (or a terminal) for aggregating and accessing the distributed file system snapshots. In the following method embodiments, the execution subject is a computer device. It can be understood that the aggregate access method of the distributed file system snapshot provided in the method embodiment described below may also be applied to a system including a terminal and a server, and implemented through interaction between the terminal and the server.

In one embodiment, as shown in fig. 1, fig. 1 is one of the flow diagrams of the method for aggregate access of distributed file system snapshots according to the embodiment of the present application, where the method is applied to a computer device, and includes the following steps:

S101, using the same service on the management server, the metadata server and the chunk server to simultaneously manage the data set corresponding to the source data and the data sets corresponding to a plurality of internal snapshots in the back-end disk array of each server.

In this step, on each node server of the distributed storage cluster, a single service simultaneously manages multiple versions of data on the back-end disk array, which creates the conditions for building a global unified namespace. A single service instance loads the data of multiple snapshot versions and maps the snapshot data into the original directory space, so that source files and snapshot files can be accessed through one unified namespace. The data set corresponding to the source data and the data sets corresponding to the plurality of internal snapshots are data sets of different versions.

S102, organizing and aggregating a plurality of internal snapshots in a directory tree form into a global unified name space. All versions of the snapshot are in a global unified namespace, reducing the complexity of accessing different versions of the snapshot.

S103, entering an internal snapshot unified management directory in the global unified naming space, and under the management directory, obtaining snapshot views of subdirectories corresponding to each access target snapshot to finish access of data sets corresponding to all internal snapshots in the global unified naming space.

The data in each snapshot corresponds to the directory tree of the corresponding snapshot version, all files and data of the snapshot are maintained through the directory tree, and file data of the specified version is found by addressing and locating the directory tree. The internal snapshot unified management directory in the global unified name space is used for managing the data set corresponding to the source data and the data sets corresponding to the plurality of internal snapshots.
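The single-service idea of step S101 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class `VersionedStore`, its methods, and the in-memory dictionaries standing in for a back-end disk array are hypothetical and are not taken from the application. The sketch only shows one service instance serving both the source data and several snapshot versions.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Hypothetical single service holding every data version of one disk array.

    "original" stands for the source data; other keys are snapshot names.
    """
    datasets: dict = field(default_factory=lambda: {"original": {}})

    def write(self, path: str, data: bytes) -> None:
        # Writes always go to the current (source) version.
        self.datasets["original"][path] = data

    def create_snapshot(self, name: str) -> None:
        # Freeze a copy of the current version under the snapshot name.
        # A full copy is used here only to keep the sketch short.
        self.datasets[name] = dict(self.datasets["original"])

    def read(self, path: str, version: str = "original") -> bytes:
        # The same instance serves any version; no extra instance is started.
        return self.datasets[version][path]


store = VersionedStore()
store.write("/dir-1/a.txt", b"v1")
store.create_snapshot("snapshot-1")
store.write("/dir-1/a.txt", b"v2")
```

In this sketch, reading `/dir-1/a.txt` with `version="snapshot-1"` still yields the content captured at snapshot time, while the default read yields the current content.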

In one embodiment, as shown in fig. 2, this embodiment relates to how a plurality of internal snapshots are organized in a directory tree form into a global unified namespace, where step S102 includes:

S201, adding a snapshot preset total entry in the global unified name space.

S202, creating a corresponding snapshot entry for each internal snapshot in a snapshot preset total entry. Each internal snapshot corresponds to a snapshot entry with a snapshot tag.

S203, constructing a directory tree of each internal snapshot.

S204, connecting the snapshot preset total entry node with the root node of the directory tree corresponding to the internal snapshot.

Specifically, the data of a single snapshot entry only corresponds to the root node of the snapshot version directory tree, and all the data of the snapshot version can be accessed from the root node of the corresponding version after entering. By connecting the preset total entry node of the snapshot with the root node of the directory tree corresponding to the snapshot version data, all the directory trees of the snapshot are linked together under the current directory tree. By adopting the method of the embodiment, all the version snapshot files can be accessed in one unified naming space, the access is more convenient, and the complexity of snapshot management and use is reduced.
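Steps S201 to S204 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `Node` class, the `aggregate` and `resolve` helpers, and the entry name `snapshots` used as a default are hypothetical. The sketch links the root node of each snapshot's directory tree under one preset total entry of the current directory tree, so that a single path lookup reaches any version.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical directory-tree node: a name plus named children."""
    name: str
    children: dict = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        self.children[child.name] = child
        return child


def aggregate(root: Node, snapshot_trees: dict, entry_name: str = "snapshots") -> Node:
    """Link every snapshot directory tree under one preset total entry."""
    total_entry = root.add(Node(entry_name))        # S201: preset total entry
    for snap_name, snap_root in snapshot_trees.items():
        # S202 + S204: the snapshot entry refers directly to the root node
        # of that snapshot's directory tree (the tree itself is built in S203).
        total_entry.children[snap_name] = snap_root
    return total_entry


def resolve(root: Node, path: str) -> Node:
    """Address a node by walking the tree one path component at a time."""
    node = root
    for part in (p for p in path.split("/") if p):
        node = node.children[part]
    return node


# Current-version tree and two snapshot trees (S203).
root = Node("/")
root.add(Node("dir-1"))
snap1 = Node("/")
snap1.add(Node("dir-1")).add(Node("a.txt"))
snap2 = Node("/")
snap2.add(Node("dir-1"))

aggregate(root, {"snapshot-1": snap1, "snapshot-2": snap2})
```

After aggregation, `resolve(root, "/snapshots/snapshot-1/dir-1/a.txt")` and `resolve(root, "/dir-1")` both start from the same root, which is the point of the unified namespace.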

In one embodiment, as shown in fig. 3, this embodiment relates to how to obtain a snapshot view of a subdirectory corresponding to each access target snapshot, where, based on the above embodiment, step S103 includes:

S301, accessing the subdirectory corresponding to the target snapshot under the management directory. The management directory contains one subdirectory for each snapshot, named after that snapshot.

S302, obtaining the target snapshot identifier of the subdirectory corresponding to the target snapshot. The identifier indicates that the data to be accessed comes from the target snapshot.

Specifically, the snapshot identifier consists of a snapshot mark and a snapshot name.

S303, the server receives a data operation request of the client for the target snapshot identifier.

S304, the server side reads the sub-directory data information corresponding to the target snapshot and the data information of the target snapshot identification from the back end storage, and returns the data information of the target snapshot identification to the client side.

S305, the client traverses all files under the sub-directory corresponding to the access target snapshot to generate an access target snapshot view.

Specifically, snapshot access is initiated by the client. The client selects the directory named after the snapshot to be accessed and calculates the corresponding unique snapshot identifier from the selected directory name. This identifier is attached as an additional mark to every subsequent file operation request for that snapshot, and the marked request is sent to the metadata server. On receiving the request, the metadata server converts the attached snapshot mark into the location of the snapshot data in the disk array, reads the file data belonging to that snapshot from the disk array, and returns it.

The snapshot access method in the embodiment is completed under one directory tree, and the snapshot view creating process is simple.
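The client and metadata server interaction described above can be sketched in Python. The sketch rests on several assumptions that are not in the application: the names `FileRequest`, `build_request` and `MetadataServer` are hypothetical, the snapshot name itself is used as the unique identifier, and dictionaries stand in for data areas of the disk array.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileRequest:
    op: str
    path: str                        # path relative to the root of the chosen version
    snapshot: Optional[str] = None   # attached snapshot identifier; None means source data


def build_request(op: str, namespace_path: str, entry_name: str = "snapshots") -> FileRequest:
    """Client side: derive the snapshot identifier from the selected directory name."""
    parts = [p for p in namespace_path.split("/") if p]
    if len(parts) >= 2 and parts[0] == entry_name:
        # Path lies under <entry>/<snapshot name>/...: attach the snapshot identifier.
        return FileRequest(op, "/" + "/".join(parts[2:]), snapshot=parts[1])
    return FileRequest(op, "/" + "/".join(parts))


class MetadataServer:
    """Server side: convert the attached mark into a data area and read from it."""

    def __init__(self, regions: dict):
        # regions maps a version name to its (simulated) area in the disk array.
        self.regions = regions

    def handle(self, req: FileRequest) -> bytes:
        region = self.regions[req.snapshot or "original"]
        return region[req.path]


server = MetadataServer({
    "original": {"/dir-1/a.txt": b"current"},
    "snapshot-1": {"/dir-1/a.txt": b"frozen"},
})
req = build_request("read", "/snapshots/snapshot-1/dir-1/a.txt")
```

Here `server.handle(req)` reads from the `snapshot-1` area, while a request built from `/dir-1/a.txt` carries no snapshot identifier and is served from the source data.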

In a specific embodiment, internal snapshots are created at the same time on the disk arrays used by the management server, the metadata server and the chunk server. The data versions comprise original, snapshot-1 and snapshot-2; snapshot-1 and snapshot-2 are each created on the disk arrays used by every server. The set of all snapshot-1 instances forms snapshot-1 of the distributed file system, and snapshot-2 of the distributed file system is formed in the same way. The aggregate access method for distributed file system snapshots is described below using these snapshots as an example. The method of this embodiment specifically comprises the following steps:

(1) FIG. 4 is a schematic diagram of a framework in which a single service manages multiple versions of data according to an embodiment of the present application. As shown in FIG. 4, the same service on the management server, the metadata server and the chunk server simultaneously manages the data set corresponding to original and the data sets corresponding to snapshot-1 and snapshot-2 in the back-end disk array of each server.

(2) A snapshot preset total entry is added to the global unified namespace.

(3) Snapshot entries, namely snapshot-1 and snapshot-2, are created within the snapshot preset total entry.

(4) A directory tree is built for each internal snapshot.

(5) The snapshots node is connected to the root nodes of the directory trees of the snapshot entries snapshot-1 and snapshot-2 to form a total directory tree, which is shown in FIG. 5. For convenience of presentation, snapshot-1 and snapshot-2 are written as snap-1 and snap-2 in FIGS. 4 and 5 of this embodiment.

In this embodiment, the data in the global unified namespace is maintained through the total directory tree, as shown in FIG. 6, which is a diagram of data maintenance in the global unified namespace according to an embodiment of the present application. Snapshot marks are used to distinguish snapshot files from normal files and to distinguish snapshot files of different versions. Each file carries two snapshot-related attributes: a snapshot mark and a snapshot name.

The snapshot mark takes one of the values NONE/SNAP_ROOT/SNAP_TOP/SNAP_COMM and indicates whether the file is a snapshot file. NONE: a non-snapshot file; SNAP_ROOT: the entry for all snapshots, used only for the snapshots directory; SNAP_TOP: the root node of a snapshot directory tree; SNAP_COMM: a common snapshot file.

The snapshot name, such as snapshot-1 or snapshot-2, identifies the snapshot to which the file belongs; the service uses the name to locate the data in the server's disk array. The snapshot mark and the snapshot name together form the snapshot identifier. For example, for dir-1 with snapshot mark SNAP_COMM and snapshot name snapshot-1, the file is a common snapshot file whose metadata and data reside in snapshot-1, and the metadata server looks up and returns the data of dir-1 in the snapshot-1 area.
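The two snapshot-related attributes can be modelled as a small Python data structure. The four mark values come from the text above; the `identify` function and the rule it uses (deriving the mark from the depth of a path below a `snapshots` entry) are assumptions made only to give a runnable illustration, since the application does not state how the mark is assigned.

```python
from enum import Enum
from typing import NamedTuple, Optional


class SnapshotMark(Enum):
    NONE = "NONE"            # non-snapshot file
    SNAP_ROOT = "SNAP_ROOT"  # entry for all snapshots (the snapshots directory)
    SNAP_TOP = "SNAP_TOP"    # root node of one snapshot directory tree
    SNAP_COMM = "SNAP_COMM"  # common snapshot file


class SnapshotId(NamedTuple):
    mark: SnapshotMark
    name: Optional[str]      # snapshot name, e.g. "snapshot-1"; None outside snapshots


def identify(path: str, entry_name: str = "snapshots") -> SnapshotId:
    """Hypothetical rule: derive mark and name from the position in the total tree."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != entry_name:
        return SnapshotId(SnapshotMark.NONE, None)
    if len(parts) == 1:
        return SnapshotId(SnapshotMark.SNAP_ROOT, None)
    if len(parts) == 2:
        return SnapshotId(SnapshotMark.SNAP_TOP, parts[1])
    return SnapshotId(SnapshotMark.SNAP_COMM, parts[1])
```

Under this assumed rule, `/snapshots/snapshot-1/dir-1` maps to the mark SNAP_COMM with the name snapshot-1, matching the dir-1 example in the text, while `/dir-1` in the current version maps to NONE.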

(6) The snapshots management directory is entered and, under this management directory, the snapshot views of the subdirectories corresponding to snapshot-1 and snapshot-2 are obtained, thereby completing access to the data sets corresponding to all internal snapshots in the global unified namespace.

Specifically, this embodiment only illustrates obtaining the snapshot view of the subdirectory corresponding to snapshot-1; the process for snapshot-2 is the same and is not repeated here. Obtaining the snapshot view of the subdirectory corresponding to snapshot-1 includes:

(6-1) entering the subdirectory corresponding to snapshot-1 under the snapshots management directory.

(6-2) obtaining a target snapshot identifier dir-1 of the subdirectory corresponding to the snapshot-1.

(6-3) The server receives a data operation request from the client for the target snapshot identifier dir-1 and determines where to read the data according to the snapshot mark attached to the request: if no snapshot mark is attached, the information of dir-1 is read from the non-snapshot location; if a snapshot mark is attached, the information of dir-1 corresponding to the specified snapshot identifier is read from the snapshot data area.

(6-4) After the service reads the data information of dir-1 in snapshot-1 from back-end storage, the dir-1 data is returned to the client.

(6-5) The client goes on to traverse all files under snapshot-1 once, that is, it builds all files belonging to snapshot-1 in the global namespace, which completes the generation of the snapshot view of snapshot-1.
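Steps (6-3) to (6-5) can be sketched as a recursive traversal. The layout of `BACKEND`, the convention that a trailing `/` denotes a subdirectory, and the function names `list_dir` and `snapshot_view` are all hypothetical and chosen only for this illustration; the sketch shows the server choosing the read location from the attached snapshot mark and the client walking the snapshot once to build its view.

```python
from typing import Optional

# Hypothetical back-end layout: version name -> directory path -> child names.
# A child name ending in "/" denotes a subdirectory.
BACKEND = {
    "original": {"/": ["dir-1/"], "/dir-1/": ["a.txt", "b.txt"]},
    "snapshot-1": {"/": ["dir-1/"], "/dir-1/": ["a.txt"]},
}


def list_dir(path: str, snapshot: Optional[str] = None) -> list:
    """Server side (6-3, 6-4): without a snapshot mark read the non-snapshot
    location, otherwise read from the data area of the named snapshot."""
    area = BACKEND["original"] if snapshot is None else BACKEND[snapshot]
    return area[path]


def snapshot_view(snapshot: str, path: str = "/") -> list:
    """Client side (6-5): traverse everything under the snapshot once and
    collect the full paths that make up its view."""
    view = []
    for name in list_dir(path, snapshot):
        child = path + name
        view.append(child)
        if name.endswith("/"):
            # Descend into the subdirectory, still carrying the snapshot mark.
            view.extend(snapshot_view(snapshot, child))
    return view
```

In this sketch the view of snapshot-1 contains only `a.txt` under `dir-1`, even though the current version also holds `b.txt`, because every listing request carries the snapshot name.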

According to the aggregation access method of the distributed file system snapshots, all versions of internal snapshots are aggregated in the global unified naming space, and complexity of accessing snapshots of different versions is reduced.

Based on the same inventive concept, the embodiment of the application also provides an aggregation access device for realizing the aggregation access of the related distributed file system snapshots. The implementation scheme of the solution provided by the device is similar to the implementation scheme described in the above method, so the specific limitation in the embodiment of the aggregate access device for the distributed file system snapshot provided below may be referred to the limitation of aggregate access for the distributed file system snapshot hereinabove, and will not be described herein.

In one embodiment, an aggregate access device for distributed file system snapshots is provided, the device comprising:

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing aggregated access data for distributed file system snapshots. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements an aggregate access method for distributed file system snapshots.

It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements are applied, and that a particular computer device may include more or fewer components than shown in fig. 7, or may combine certain components, or have a different arrangement of components.

In an alternative embodiment, a computer device is provided. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.

The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of the application should be assessed as that of the appended claims.

Claims

1. An aggregate access method for distributed file system snapshots, the method comprising:

and entering an internal snapshot unified management directory in the global unified naming space, and under the management directory, obtaining a snapshot view of the subdirectory corresponding to each access target snapshot to finish the access of the data sets corresponding to all the internal snapshots in the global unified naming space.

2. The method of aggregate access of distributed file system snapshots of claim 1, wherein aggregating a plurality of internal snapshots organized in a directory tree into a global unified namespace comprises:

adding a snapshot preset total entry in the global unified name space;

constructing a directory tree of each internal snapshot;

3. The method of aggregate access of distributed file system snapshots of claim 2, wherein obtaining a snapshot view of a subdirectory to which each access target snapshot corresponds comprises:

the server reads, from back-end storage, the data information of the subdirectory corresponding to the target snapshot and the data information of the target snapshot identifier, and returns the data information of the target snapshot identifier to the client;

4. The method of aggregate access to distributed file system snapshots of claim 3, wherein the snapshot identifier consists of a snapshot mark and a snapshot name.

5. An aggregate access device for distributed file system snapshots, the device comprising:

and the access module is used for entering an internal snapshot unified management directory in the global unified naming space, and under the management directory, obtaining a snapshot view of the subdirectory corresponding to each access target snapshot to finish the access of the data sets corresponding to all the internal snapshots in the global unified naming space.

6. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor when executing the computer program performs the steps of: