CN110209351B - Distributed storage data processing method and device

Info

Publication number: CN110209351B
Application number: CN201910390851.4A
Authority: CN
Inventors: 肖永玲; 王豪迈; 胥昕
Original assignee: Xsky Beijing Data Technology Corp ltd
Current assignee: Beijing Xingchen Tianhe Technology Co ltd
Priority date: 2019-05-10
Filing date: 2019-05-10
Publication date: 2021-02-19
Anticipated expiration: 2039-05-10
Also published as: CN110209351A

Abstract

The invention discloses a distributed storage data processing method and device. Wherein, the method comprises the following steps: under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and returning the data operation result to the target client under the condition of successful operation. The invention solves the technical problem that the data processing method of the distributed storage system in the prior art has larger storage space occupied by the snapshot.

Description

Distributed storage data processing method and device

Technical Field

The invention relates to the field of data processing, in particular to a distributed storage data processing method and device.

Background

At present, snapshots in the distributed storage system Ceph are implemented as copy-on-write (COW): after a point-in-time snapshot is taken, every IO write first copies the corresponding 4 MB object data into a new object, and only then writes the new data to the corresponding position.

However, although the current distributed storage system Ceph implements the snapshot function, two problems remain unresolved: 1) each IO write involves three IO operations (reading the original object data once, writing the new object once, and writing the new IO once), which severely degrades volume performance, by more than half according to actual tests; 2) each time a new IO is written, a 4 MB data object must be copied, so even a small update occupies 4 MB of space, and the snapshot therefore occupies a large amount of storage space, much of which is wasted.
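For illustration only, the two costs named in the Background can be put into numbers. The sketch below uses the 4 MB object size and the three IO operations per write stated above; the 4 KB write size and all function and variable names are assumptions chosen for the example and do not come from any real storage system's API.

```python
# Illustrative arithmetic only. The 4 MB object size and the three IO
# operations per write are the figures stated in the Background.
OBJECT_SIZE = 4 * 1024 * 1024  # bytes in one 4 MB object


def cow_write_cost(write_size: int) -> dict:
    """Return the cost of one IO write under the copy-on-write scheme."""
    return {
        # read original object, write the copied object, write the new IO
        "io_operations": 3,
        # bytes read from the original + bytes copied + bytes of new data
        "bytes_moved": OBJECT_SIZE + OBJECT_SIZE + write_size,
        # space consumed by the copied object
        "snapshot_bytes": OBJECT_SIZE,
    }


small_write = 4 * 1024  # hypothetical 4 KB update
cost = cow_write_cost(small_write)
# How many times larger the occupied snapshot space is than the update itself.
space_ratio = cost["snapshot_bytes"] // small_write
print(cost["io_operations"], space_ratio)  # prints "3 1024"
```

Under these assumptions a 4 KB update occupies 1024 times its own size in snapshot space, which is the waste the second problem describes.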

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a distributed storage data processing method and a distributed storage data processing device, which are used for at least solving the technical problem that a data processing method of a distributed storage system in the prior art occupies a large storage space due to snapshots.

According to an aspect of an embodiment of the present invention, there is provided a distributed storage data processing method, including: under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and returning the data operation result to the target client under the condition of successful operation.

Further, detecting whether a snapshot module is configured in the distributed storage system includes: positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data operation request, wherein the target storage module is used for maintaining a metadata base of the distributed storage system; and detecting whether the target storage module is provided with the snapshot module or not to obtain the detection result.

Further, determining an operation mode corresponding to the data operation request according to the detection result includes: if the detection result is that the snapshot module is configured in the distributed storage system, writing a snapshot object and a new object in the distributed storage system into a metadata base, and writing data to be written into the new object based on a target distribution granularity, wherein the new object is a newly-built object for storing the data to be written; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein the first identification code of the snapshot object is the same as the second identification code of the original storage object.

Further, before writing the snapshot object and the new object in the distributed storage system into the metadata base, the method further includes: renaming the original storage object in the distributed storage system as the snapshot object; and determining a new object name of the new object according to the original name of the original storage object, wherein the metadata base stores a first mapping relation between the first identification code and the new object name and a second mapping relation between the second identification code and the new object name.

Further, when the data operation request is a data write request, returning a data operation result to the target client includes: and returning the first identification code of the snapshot object and the second identification code of the newly-built object to the target client.

Further, the target allocation granularity ranges at least from 4 KB to 4 MB.

Further, in a case that the data operation request is a data read request, determining an operation manner corresponding to the data operation request according to a detection result, including: if the detection result indicates that the snapshot module is configured in the distributed storage system, locating a target disk location in a metadata base according to identification code information carried in the data reading request, and reading target reading data corresponding to the data reading request from the target disk location, wherein the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, positioning the target disk position according to a third identification code of an original storage object in the distributed storage system, and reading the target read data from the target disk position.

Further, when the data operation request is a data read request, returning a data operation result to the target client includes: returning the target read data that has been read to the target client.

According to another aspect of the embodiments of the present invention, there is also provided a distributed storage data processing apparatus, including: a detection module, configured to detect whether a snapshot module is configured in a distributed storage system under a condition that a data operation request from a target client is received, where the data operation request at least includes: a data write request and a data read request; the determining module is used for determining an operation mode corresponding to the data operation request according to the detection result; and the return module is used for returning the data operation result to the target client under the condition of successful operation.

According to another aspect of the embodiments of the present invention, there is also provided a storage medium, where the storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute any one of the above-mentioned distributed storage data processing methods.

According to another aspect of the embodiments of the present invention, there is also provided a processor, configured to execute a program, where the program executes any one of the above methods for processing distributed storage data.

In the embodiment of the present invention, it is detected whether a snapshot module is configured in a distributed storage system by receiving a data operation request from a target client, where the data operation request at least includes: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and under the condition of successful operation, returning a data operation result to the target client, so as to achieve the purpose of reducing the storage space occupied by the snapshot, thereby realizing the technical effect of improving the utilization rate of the storage space, and further solving the technical problem that the data processing method of the distributed storage system in the prior art has larger storage space occupied by the snapshot.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flow diagram of a distributed storage data processing method according to an embodiment of the invention;

FIG. 2 is a flow diagram of an alternative distributed storage data processing method according to an embodiment of the invention;

FIG. 3 is a flow diagram of an alternative distributed storage data processing method according to an embodiment of the invention; and

FIG. 4 is a schematic structural diagram of a distributed storage data processing apparatus according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

First, in order to facilitate understanding of the embodiments of the present invention, some terms or nouns referred to in the present invention will be explained as follows:

Copy-on-Write (COW): when a block of a protected entry is to be overwritten, the block is first copied elsewhere (i.e., to a location specified by the snapshot system) and then overwritten in its original location (i.e., the protected entry's storage location).

Redirect-on-Write (ROW): a ROW snapshot keeps pointers to all blocks of a protected entry; if a block is to be overwritten, the storage system redirects the pointer for that block to a new location and then writes the new data to the new location.

Snapshot: a fully usable copy of a given data set, which includes an image of the corresponding data at some point in time (the point in time at which the copy begins). The snapshot may be a duplicate of the data it represents or a replica of that data. By analogy, the way snapshot data is stored is equivalent to a camera, the data at the point in time is equivalent to a film negative, and the snapshot view, which presents the state of the data at that time, is equivalent to a developed photograph.
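The difference between the two definitions above can be sketched with two toy in-memory volumes. This is a minimal illustration under stated assumptions, not the implementation of any real storage system: the class names, the `io_count` counter, and the rule that each block read or write counts as one IO are all hypothetical, and the toy COW volume copies a block only on the first overwrite after the snapshot.

```python
class CowVolume:
    """Toy copy-on-write volume: old data is copied away before an overwrite."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, overwritten in place
        self.snapshot = None         # block index -> preserved old data
        self.io_count = 0

    def take_snapshot(self):
        self.snapshot = {}           # nothing is copied until a write occurs

    def write(self, index, data):
        if self.snapshot is not None and index not in self.snapshot:
            old = self.blocks[index]
            self.io_count += 1       # IO 1: read the original block
            self.snapshot[index] = old
            self.io_count += 1       # IO 2: write the copy elsewhere
        self.blocks[index] = data
        self.io_count += 1           # IO 3: overwrite in the original location

    def read_snapshot(self, index):
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been taken")
        return self.snapshot.get(index, self.blocks[index])


class RowVolume:
    """Toy redirect-on-write volume: new data goes to a new location."""

    def __init__(self, blocks):
        self.store = list(blocks)             # physical locations, append-only
        self.live = list(range(len(blocks)))  # one pointer per logical block
        self.snapshot = None
        self.io_count = 0

    def take_snapshot(self):
        self.snapshot = list(self.live)       # freeze pointers, copy no data

    def write(self, index, data):
        self.store.append(data)               # single IO: write to new location
        self.io_count += 1
        self.live[index] = len(self.store) - 1  # redirect the pointer

    def read(self, index):
        return self.store[self.live[index]]

    def read_snapshot(self, index):
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been taken")
        return self.store[self.snapshot[index]]


cow = CowVolume([b"a", b"b"])
cow.take_snapshot()
cow.write(0, b"x")

row = RowVolume([b"a", b"b"])
row.take_snapshot()
row.write(0, b"x")

print(cow.io_count, row.io_count)  # prints "3 1"
```

In both toys the snapshot still reads `b"a"` for block 0 after the overwrite; only the number of IO operations needed to preserve it differs.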

Example 1

In accordance with an embodiment of the present invention, there is provided an embodiment of a distributed storage data processing method, it should be noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer executable instructions, and that while a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than that herein.

Fig. 1 is a flowchart of a distributed storage data processing method according to an embodiment of the present invention, as shown in fig. 1, the method includes the following steps:

step S102, detecting whether a snapshot module is configured in the distributed storage system when a data operation request from a target client is received, where the data operation request at least includes: a data write request and a data read request;

step S104, determining an operation mode corresponding to the data operation request according to the detection result;

and step S106, returning the data operation result to the target client under the condition that the operation is successful.

Optionally, the data operation request at least includes: data write requests and data read requests.

Through steps S102 to S106, the embodiment of the present application can implement a redirect-on-write (ROW) snapshot scheme. The performance loss during data processing is very small, with an average loss of less than 5%, which resolves the sharp performance drop caused by copying before every write. Moreover, the storage space occupied by the snapshot is greatly reduced: the smaller the amount of changed data, the smaller the storage space the snapshot occupies.
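As a non-limiting sketch, steps S102 to S106 can be read as a small dispatch routine. Everything named below (the `DataOperationRequest` fields, the mode labels, and the `detect_snapshot` and `execute` callables) is a hypothetical placeholder introduced for the example; the patent does not prescribe these interfaces.

```python
from dataclasses import dataclass


@dataclass
class DataOperationRequest:
    kind: str          # "write" or "read"
    object_name: str   # hypothetical field naming the target object


def determine_mode(kind: str, snapshot_configured: bool) -> str:
    """S104: choose the operation mode from the request type and detection result."""
    if kind not in ("write", "read"):
        raise ValueError(f"unsupported request type: {kind}")
    if kind == "write":
        return "write_new_object" if snapshot_configured else "write_original_object"
    return "read_via_metadata_base" if snapshot_configured else "read_original_object"


def process(request, detect_snapshot, execute):
    """Run S102, S104 and S106 for one request.

    detect_snapshot(request) -> bool and execute(mode, request) -> (ok, result)
    are caller-supplied stand-ins for the storage system.
    """
    detected = detect_snapshot(request)             # S102: detect snapshot module
    mode = determine_mode(request.kind, detected)   # S104: determine operation mode
    ok, result = execute(mode, request)
    return result if ok else None                   # S106: return only on success


request = DataOperationRequest(kind="write", object_name="obj1")
result = process(request, lambda r: True, lambda mode, r: (True, mode))
print(result)  # prints "write_new_object"
```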

In an alternative embodiment, detecting whether a snapshot module is configured in the distributed storage system includes:

step S202, positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data operation request, wherein the target storage module is used for maintaining a metadata base of the distributed storage system;

step S204, detecting whether the target storage module is configured with the snapshot module, and obtaining the detection result.

Optionally, the target data distribution algorithm at least includes the CRUSH algorithm, and the target storage module is used for maintaining a metadata base of the distributed storage system.

In an optional embodiment, after the type of the data operation request is determined, a data distribution algorithm CRUSH may be used to perform calculation, locate a target storage module OSD in the distributed storage system, detect whether the snapshot module is configured in the target storage module, and determine an operation mode corresponding to the data operation request according to a detection result of whether the snapshot module is configured in the target storage module.
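The locate-then-detect sequence can be sketched as follows. Note that the placement function below is not the CRUSH algorithm: it is a plain hash-modulo stand-in used only to show that placement is computed deterministically from the object identifier. The function names and the `snapshot_flags` table are assumptions made for the example.

```python
import hashlib
from typing import Dict, Sequence


def locate_osd(object_id: str, osd_ids: Sequence[int]) -> int:
    """Map an object identifier to one storage module (OSD).

    Stand-in for the target data distribution algorithm; real CRUSH also
    accounts for cluster topology and weights, which this sketch ignores.
    """
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return osd_ids[int.from_bytes(digest[:8], "big") % len(osd_ids)]


def detect_snapshot_module(
    object_id: str, osd_ids: Sequence[int], snapshot_flags: Dict[int, bool]
) -> bool:
    """S202 + S204: locate the target OSD, then check its snapshot flag."""
    osd = locate_osd(object_id, osd_ids)
    return snapshot_flags.get(osd, False)


osds = [0, 1, 2]
flags = {0: True, 1: True, 2: True}  # hypothetical: every OSD has the module
target = locate_osd("volume1.obj42", osds)
detected = detect_snapshot_module("volume1.obj42", osds, flags)
```

Because the placement depends only on the identifier and the OSD list, the same identifier always resolves to the same storage module, which is what allows objects sharing an identification code to land on the same OSD.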

In an optional embodiment, determining, according to the detection result, an operation manner corresponding to the data operation request includes:

step S302, if the detection result indicates that the snapshot module is configured in the distributed storage system, writing a snapshot object and a new object in the distributed storage system into a metadata base, and writing data to be written into the new object based on a target distribution granularity, where the new object is a newly created object for storing the data to be written;

step S304, if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, where a first identification code of the snapshot object is the same as a second identification code of the original storage object.

In an alternative embodiment, the target allocation granularity ranges at least from 4 KB to 4 MB.
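The effect of the allocation granularity on occupied space can be checked with simple arithmetic. The sketch below assumes, for illustration, that space is allocated in whole units of the granularity; the 6 KB write size and the function name are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def allocated_bytes(write_size: int, granularity: int) -> int:
    """Space occupied when a write is rounded up to whole allocation units."""
    units = -(-write_size // granularity)  # ceiling division
    return units * granularity


write_size = 6 * KB  # hypothetical small update
fine = allocated_bytes(write_size, 4 * KB)     # 4 KB granularity
coarse = allocated_bytes(write_size, 4 * MB)   # 4 MB granularity
print(fine // KB, coarse // KB)  # prints "8 4096"
```

With a 4 KB granularity the 6 KB update occupies 8 KB, whereas a fixed 4 MB granularity would occupy 4096 KB for the same data.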

In an optional embodiment, before writing the snapshot object and the new object in the distributed storage system into the metadata base, the method further includes:

step S402, renaming the original storage object in the distributed storage system as the snapshot object;

step S404, determining a new object name of the new object according to the original name of the original storage object, wherein the metadata base stores a first mapping relationship between the first identification code and the new object name, and a second mapping relationship between the second identification code and the new object name.

In the embodiment of the application, the original storage object holding the original data is turned into the snapshot object by renaming, so the data does not need to be copied into a new object. The new object name is determined according to the original name of the original storage object; that is, the newly created object takes over the original storage object's name. The first identification code of the snapshot object is the same as the second identification code of the original storage object; that is, every snapshot object and its original storage object share the same object ID and can therefore be mapped to the same storage module (OSD) in the distributed storage system. Because the target allocation granularity starts from a minimum storage space of 4 KB, space does not have to be allocated in 4 MB units, which greatly reduces the storage space occupied by the snapshot.

In this embodiment of the present application, the snapshot object and the original storage object are both managed in the metadata base. Since the metadata base stores a first mapping relationship between the first identification code and the new object name and a second mapping relationship between the second identification code and the new object name, the object corresponding to an accessed IO, and the disk location where that object resides, can be found according to the first identification code (ID) of the snapshot object and the second identification code (ID) of the original storage object.
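The rename-based write path of steps S302, S304, S402 and S404 can be sketched with a dictionary standing in for the metadata base. This is a minimal sketch under stated assumptions: the `Osd` and `StoredObject` classes, the `"@snap"` name suffix, and the use of the object name as its identification code are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str                                # identification code
    extents: dict = field(default_factory=dict)   # offset -> written bytes


class Osd:
    """Toy storage module whose dict plays the role of the metadata base."""

    def __init__(self):
        self.metadata = {}  # object name -> StoredObject

    def write(self, name, offset, data, snapshot_configured):
        if not snapshot_configured:
            # S304: write straight into the original storage object.
            obj = self.metadata.setdefault(name, StoredObject(object_id=name))
            obj.extents[offset] = data
            return (obj.object_id,)

        snap_name = name + "@snap"  # hypothetical naming scheme
        if snap_name not in self.metadata:
            # S402: rename the original object as the snapshot object.
            # Only the metadata entry moves; no data is copied.
            original = self.metadata.pop(name, StoredObject(object_id=name))
            self.metadata[snap_name] = original
            # S404: the new object takes over the original name and
            # shares the original object's identification code.
            self.metadata[name] = StoredObject(object_id=original.object_id)

        # S302: write the new data into the new object only.
        new_obj = self.metadata[name]
        new_obj.extents[offset] = data
        return (self.metadata[snap_name].object_id, new_obj.object_id)


osd = Osd()
osd.write("obj1", 0, b"old", snapshot_configured=False)
ids = osd.write("obj1", 0, b"new", snapshot_configured=True)
print(ids)  # prints "('obj1', 'obj1')"
```

After the second write, the entry `obj1@snap` still holds `b"old"` and the entry `obj1` holds only `b"new"`, so the snapshot costs one metadata rename plus the size of the changed data.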

In an optional embodiment, in a case that the data operation request is a data write request, returning a data operation result to the target client includes: and returning the first identification code of the snapshot object and the second identification code of the newly-built object to the target client.

In the above optional embodiment, returning the first identification code of the snapshot object and the second identification code of the new object to the target client makes it convenient for the target client to access the corresponding object when it subsequently writes or reads IO.

In an optional embodiment, in a case that the data operation request is a data read request, determining an operation manner corresponding to the data operation request according to a detection result includes:

step S502, if the detection result indicates that the snapshot module is configured in the distributed storage system, locating a target disk location in a metadata base according to identification code information carried in the data reading request, and reading target reading data corresponding to the data reading request from the target disk location, where the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object;

step S504, if the detection result indicates that the snapshot module is not configured in the distributed storage system, positioning the target disk position according to the third identifier of the original storage object in the distributed storage system, and reading the target read data from the target disk position.

In an optional embodiment, in a case that the data operation request is a data read request, returning a data operation result to the target client includes: returning the target read data that has been read to the target client.

In the above optional embodiment, a data distribution algorithm CRUSH is used for calculating, locating a target storage module OSD in the distributed storage system, detecting whether the target storage module is configured with the snapshot module, and determining an operation mode corresponding to the data operation request according to a detection result of whether the target storage module is configured with the snapshot module.

Specifically, in the above optional embodiment, if the detection result indicates that the snapshot module is configured in the distributed storage system, the target disk location in the metadata base is located according to the identification code information carried in the data reading request, and target read data corresponding to the data reading request is read from the target disk location, and if the detection result indicates that the snapshot module is not configured in the distributed storage system, the target disk location is located according to a third identification code of an original storage object in the distributed storage system, and the target read data is read from the target disk location.
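The two read branches (steps S502 and S504) can be sketched as two table lookups. The tables, the numeric disk locations, and in particular the `role` key used to tell the snapshot object from the new object are assumptions made for the example; the description only states that the identification codes carried in the request are resolved to a disk location through the metadata base.

```python
# Hypothetical metadata base: (identification code, role) -> disk location.
metadata_base = {
    ("obj1", "snapshot"): 100,
    ("obj1", "new"): 200,
}
# Hypothetical table for objects stored without a snapshot module:
# identification code of the original storage object -> disk location.
original_locations = {"obj2": 300}
# Hypothetical disk contents keyed by location.
disk = {100: b"old", 200: b"new", 300: b"plain"}


def read_data(object_id: str, role: str, snapshot_configured: bool) -> bytes:
    """Locate the target disk location, then read the target data from it."""
    if snapshot_configured:
        # S502: resolve the location through the metadata base.
        location = metadata_base[(object_id, role)]
    else:
        # S504: resolve the location from the original object's code.
        location = original_locations[object_id]
    return disk[location]


print(read_data("obj1", "snapshot", True))  # prints "b'old'"
print(read_data("obj1", "new", True))       # prints "b'new'"
print(read_data("obj2", "", False))         # prints "b'plain'"
```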

Fig. 2 is a flowchart of another alternative distributed storage data processing method according to an embodiment of the present invention, as shown in fig. 2, the method includes the following steps:

step S602, under the condition of receiving a data writing request from a target client, positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data writing request;

step S604, detecting whether a snapshot module is configured in the target storage module to obtain a detection result;

step S606, if the detection result is that the snapshot module is configured in the distributed storage system, writing the snapshot object and the newly-built object in the distributed storage system into a metadata base, and writing the data to be written into the newly-built object based on the target distribution granularity;

step S608, if the detection result indicates that the distributed storage system is not configured with the snapshot module, writing the data to be written into the original storage object corresponding to the data to be written, where a first identification code of the snapshot object is the same as a second identification code of the original storage object;

step S610, returning the first identification code of the snapshot object and the second identification code of the new object to the target client.

Fig. 3 is a flowchart of another alternative distributed storage data processing method according to an embodiment of the present invention, as shown in fig. 3, the method includes the following steps:

step S702, under the condition of receiving a data reading request from a target client, positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data reading request;

step S704, detecting whether a snapshot module is configured in the target storage module to obtain a detection result;

step S706, if the detection result is that the snapshot module is configured in the distributed storage system, locating the target disk position in the metadata base according to the identification code information carried in the data reading request, and reading target reading data corresponding to the data reading request from the target disk position, where the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object;

step S708, if the detection result is that the snapshot module is not configured in the distributed storage system, positioning the target disk position according to the third identification code of the original storage object in the distributed storage system, and reading target reading data from the target disk position;

step S710, returning the read target read data to the target client.

Example 2

According to an embodiment of the present invention, an embodiment of an apparatus for implementing the distributed storage data processing method is further provided, and fig. 4 is a schematic structural diagram of a distributed storage data processing apparatus according to an embodiment of the present invention, as shown in fig. 4, the distributed storage data processing apparatus includes: a detection module 40, a determination module 42, and a return module 44, wherein:

a detection module 40, configured to detect whether a snapshot module is configured in the distributed storage system when a data operation request from a target client is received, where the data operation request at least includes: a data write request and a data read request; a determining module 42, configured to determine, according to the detection result, an operation mode corresponding to the data operation request; and the returning module 44 is configured to return a data operation result to the target client if the operation is successful.

It should be noted that the above modules may be implemented by software or hardware, for example, for the latter, the following may be implemented: the modules can be located in the same processor; alternatively, the modules may be located in different processors in any combination.

It should be noted here that the detection module 40, the determination module 42 and the return module 44 correspond to steps S102 to S106 in embodiment 1. The examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the modules described above may be implemented in a computer terminal as part of an apparatus.

It should be noted that, reference may be made to the relevant description in embodiment 1 for alternative or preferred embodiments of this embodiment, and details are not described here again.

The above-mentioned distributed storage data processing apparatus may further include a processor and a memory, and the above-mentioned detection module 40, determination module 42, and return module 44, etc. are all stored in the memory as program units, and the processor executes the above-mentioned program units stored in the memory to implement the corresponding functions.

The processor includes one or more cores, and a core calls the corresponding program unit from the memory. The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.

According to the embodiment of the application, the embodiment of the storage medium is also provided. Optionally, in this embodiment, the storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute any one of the distributed storage data processing methods.

Optionally, in this embodiment, the storage medium may be located in any one of a group of computer terminals in a computer network, or in any one of a group of mobile terminals, and the storage medium includes a stored program.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and returning the data operation result to the target client under the condition of successful operation.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data operation request, wherein the target storage module is used for maintaining a metadata base of the distributed storage system; and detecting whether the target storage module is provided with the snapshot module or not to obtain the detection result.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: if the detection result is that the snapshot module is configured in the distributed storage system, writing a snapshot object and a new object in the distributed storage system into a metadata base, and writing data to be written into the new object based on a target distribution granularity, wherein the new object is a newly-built object for storing the data to be written; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein the first identification code of the snapshot object is the same as the second identification code of the original storage object.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: renaming the original storage object in the distributed storage system as the snapshot object; and determining a new object name of the new object according to the original name of the original storage object, wherein the metadata base stores a first mapping relation between the first identification code and the new object name and a second mapping relation between the second identification code and the new object name.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: and returning the first identification code of the snapshot object and the second identification code of the newly-built object to the target client.

Optionally, the program controls the device on which the storage medium is located to perform the following functions when running: if the detection result indicates that the snapshot module is configured in the distributed storage system, locating a target disk location in a metadata base according to identification code information carried in the data reading request, and reading target reading data corresponding to the data reading request from the target disk location, wherein the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, positioning the target disk position according to a third identification code of an original storage object in the distributed storage system, and reading the target read data from the target disk position.

Optionally, when running, the program controls the device on which the storage medium is located to perform the following function: returning the target read data that has been read to the target client.

According to an embodiment of the present application, an embodiment of a processor is also provided. Optionally, in this embodiment, the processor is configured to run a program, wherein the program, when run, executes any one of the distributed storage data processing methods described above.

An embodiment of the present application provides a device comprising a processor, a memory, and a program that is stored in the memory and can run on the processor, wherein the processor implements the following steps when executing the program: under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and returning the data operation result to the target client under the condition of successful operation.

Optionally, when the processor executes a program, it may further locate a target storage module in the distributed storage system according to a target data distribution algorithm and the data operation request, where the target storage module is used to maintain a metadata base of the distributed storage system; and detecting whether the target storage module is provided with the snapshot module or not to obtain the detection result.
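The locating and detecting steps above can be sketched as follows. This is only an illustration under stated assumptions: the application does not name the target data distribution algorithm, so a simple hash-modulo placement stands in for it, and the names StorageModule, locate_target_module, detect_snapshot_module, and the snapshot_configured flag are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StorageModule:
    """Hypothetical storage module that maintains part of the metadata base."""
    name: str
    snapshot_configured: bool = False


def locate_target_module(object_name: str, modules: list) -> StorageModule:
    """Pick the module responsible for an object.

    Assumption: hash-modulo placement replaces the unspecified
    target data distribution algorithm.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(modules)
    return modules[index]


def detect_snapshot_module(object_name: str, modules: list) -> bool:
    """Return the detection result for the module that owns the object."""
    return locate_target_module(object_name, modules).snapshot_configured
```

Because the placement depends only on the object name and the module list, repeated requests for the same object reach the same module, so the detection result is stable for as long as the configuration is unchanged.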

Optionally, when the processor executes a program, if the detection result indicates that the snapshot module is configured in the distributed storage system, writing a snapshot object and a new object in the distributed storage system into a metadata base, and writing data to be written into the new object based on a target allocation granularity, where the new object is a new object created for storing the data to be written; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein the first identification code of the snapshot object is the same as the second identification code of the original storage object.
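The two write branches above can be sketched as follows. This is a hedged illustration, not the implementation of the application: SparseObject, handle_write, the dictionary used as the metadata base, and the 4 KB value chosen for the allocation granularity are all assumptions. The sketch also assumes that later writes reuse the new object already recorded in the metadata base, which the text does not state.

```python
from dataclasses import dataclass, field

GRANULARITY = 4 * 1024  # assumed target allocation granularity (4 KB)


@dataclass
class SparseObject:
    """Hypothetical object that allocates space one granule at a time."""
    blocks: dict = field(default_factory=dict)  # granule index -> bytes

    def write(self, offset: int, data: bytes) -> None:
        """Write data at offset, allocating only the granules it touches."""
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, GRANULARITY)
            chunk = data[pos:pos + GRANULARITY - start]
            granule = bytearray(self.blocks.get(index, bytes(GRANULARITY)))
            granule[start:start + len(chunk)] = chunk
            self.blocks[index] = bytes(granule)
            pos += len(chunk)

    def allocated_bytes(self) -> int:
        return len(self.blocks) * GRANULARITY


def handle_write(snapshot_configured, original, metadata, offset, data):
    """Return the object that received the data."""
    if not snapshot_configured:
        # No snapshot module: write in place into the original storage object.
        original.write(offset, data)
        return original
    # Snapshot module configured: record both objects in the metadata base,
    # then write only into the new object, leaving the snapshot untouched.
    metadata["snapshot_object"] = original
    new_object = metadata.setdefault("new_object", SparseObject())
    new_object.write(offset, data)
    return new_object
```

With these assumed values, a 3-byte update after a snapshot allocates a single 4 KB granule in the new object, rather than the full 4 MB object copy that the Background attributes to copy-on-write.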

Optionally, when the processor executes a program, it may rename an original storage object in the distributed storage system to the snapshot object; and determining a new object name of the new object according to the original name of the original storage object, wherein the metadata base stores a first mapping relation between the first identification code and the new object name and a second mapping relation between the second identification code and the new object name.

Optionally, when the processor executes a program, the processor may return the first identification code of the snapshot object and the second identification code of the new object to the target client.

Optionally, when the processor executes a program, if the detection result indicates that the snapshot module is configured in the distributed storage system, the processor may further locate a target disk location in a metadata base according to identification code information carried in the data read request, and read target read data corresponding to the data read request from the target disk location, where the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, locate the target disk location according to a third identification code of an original storage object in the distributed storage system, and read the target read data from the target disk location.
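The read branches above can be sketched as follows. The text does not spell out how the two identification codes are combined to locate a disk position, so this sketch assumes one plausible reading: the new object is consulted first, and the snapshot object serves any block that has not been rewritten. The function read_block and the shapes of the metadata and disk dictionaries are hypothetical.

```python
def read_block(snapshot_configured, metadata, disk, block,
               snapshot_id=None, new_id=None, original_id=None):
    """Return the bytes stored for one block.

    metadata: identification code -> {block index -> disk location}
    disk:     disk location -> bytes
    """
    if snapshot_configured:
        # Assumed lookup order: new object first, then the snapshot object.
        for object_id in (new_id, snapshot_id):
            location = metadata.get(object_id, {}).get(block)
            if location is not None:
                return disk[location]
        raise KeyError(f"block {block} not found for the given identification codes")
    # No snapshot module: locate the block via the original object's code.
    return disk[metadata[original_id][block]]
```

For example, if the metadata base maps code 1 (snapshot object) to blocks 0 and 1 and code 2 (new object) to block 1 only, a read of block 0 is served from the snapshot object and a read of block 1 from the new object.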

Optionally, when the processor executes a program, the read target read data may be returned to the target client.

The present application further provides a computer program product which, when executed on a data processing device, is adapted to run a program initialized with the following method steps: under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request; determining an operation mode corresponding to the data operation request according to the detection result; and returning the data operation result to the target client under the condition of successful operation.

Optionally, when the computer program product executes a program, a target storage module in the distributed storage system may be located according to a target data distribution algorithm and the data operation request, where the target storage module is used to maintain a metadata base of the distributed storage system; and detecting whether the target storage module is provided with the snapshot module or not to obtain the detection result.

Optionally, when the computer program product executes a program, if the detection result indicates that the snapshot module is configured in the distributed storage system, writing a snapshot object and a new object in the distributed storage system into a metadata base, and writing data to be written into the new object based on a target allocation granularity, where the new object is a newly created object for storing the data to be written; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein the first identification code of the snapshot object is the same as the second identification code of the original storage object.

Optionally, when the computer program product executes a program, the original storage object in the distributed storage system may be renamed to the snapshot object; and determining a new object name of the new object according to the original name of the original storage object, wherein the metadata base stores a first mapping relation between the first identification code and the new object name and a second mapping relation between the second identification code and the new object name.

Optionally, when the computer program product executes a program, the first identification code of the snapshot object and the second identification code of the new object may be returned to the target client.

Optionally, when the computer program product executes a program, if the detection result indicates that the snapshot module is configured in the distributed storage system, the computer program product may further locate a target disk location in a metadata base according to identification code information carried in the data read request, and read target read data corresponding to the data read request from the target disk location, where the identification code information at least includes: the first identification code of the snapshot object and the second identification code of the newly-built object; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, locate the target disk location according to a third identification code of an original storage object in the distributed storage system, and read the target read data from the target disk location.

Optionally, when the computer program product executes a program, the read target read data may be returned to the target client.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units may be a logical functional division, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims

1. A distributed storage data processing method, comprising:

under the condition of receiving a data operation request from a target client, detecting whether a snapshot module is configured in a distributed storage system, wherein the data operation request at least comprises: a data write request and a data read request;

determining an operation mode corresponding to the data operation request according to the detection result;

returning a data operation result to the target client under the condition of successful operation;

wherein, when the data operation request is a data write request, determining an operation mode corresponding to the data operation request according to a detection result, including:

if the detection result indicates that the snapshot module is configured in the distributed storage system, writing a snapshot object and a newly-built object in the distributed storage system into a metadata base, and writing data to be written into the newly-built object based on a target allocation granularity, wherein the newly-built object is an object newly created to store the data to be written;

and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein a first identification code of the snapshot object is the same as a second identification code of the original storage object.

2. The method of claim 1, wherein detecting whether a snapshot module is configured in the distributed storage system comprises:

positioning a target storage module in the distributed storage system according to a target data distribution algorithm and the data operation request, wherein the target storage module is used for maintaining a metadata base of the distributed storage system;

and detecting whether the target storage module is configured with the snapshot module or not to obtain the detection result.

3. The method of claim 1, wherein prior to writing the snapshot object and the newly-built object in the distributed storage system into the metadata base, the method further comprises:

renaming an original storage object in the distributed storage system as the snapshot object;

and determining a new object name of the newly-built object according to the original name of the original storage object, wherein a first mapping relation between the first identification code and the new object name and a second mapping relation between the second identification code and the new object name are stored in the metadata base.

4. The method of claim 1, wherein in the case that the data operation request is a data write request, returning a data operation result to the target client comprises: returning the first identification code of the snapshot object and the second identification code of the newly-built object to the target client.

5. The method of claim 1, wherein the target allocation granularity ranges from 4 KB to 4 MB.

6. The method of claim 1, wherein determining an operation mode corresponding to the data operation request according to the detection result when the data operation request is a data read request comprises:

if the detection result indicates that the snapshot module is configured in the distributed storage system, locating a target disk location in a metadata base according to identification code information carried in the data read request, and reading target read data corresponding to the data read request from the target disk location, wherein the identification code information at least comprises: the first identification code of the snapshot object and the second identification code of the newly-built object;

and if the detection result indicates that the snapshot module is not configured in the distributed storage system, locating the target disk location according to a third identification code of an original storage object in the distributed storage system, and reading the target read data from the target disk location.

7. The method of claim 6, wherein in the case that the data operation request is a data read request, returning a data operation result to the target client comprises: returning the target read data that has been read to the target client.

8. A distributed storage data processing apparatus, comprising:

a detection module, configured to detect whether a snapshot module is configured in a distributed storage system under a condition that a data operation request from a target client is received, where the data operation request at least includes: a data write request and a data read request;

the determining module is used for determining an operation mode corresponding to the data operation request according to the detection result;

the return module is used for returning a data operation result to the target client under the condition of successful operation;

when the data operation request is a data write request, the determining module is further configured to, if the detection result is that the snapshot module is configured in the distributed storage system, write a snapshot object and a new object in the distributed storage system into a metadata base, and write data to be written into the new object based on a target allocation granularity, where the new object is a newly-built object for storing the data to be written; and if the detection result indicates that the snapshot module is not configured in the distributed storage system, writing the data to be written into an original storage object corresponding to the data to be written, wherein a first identification code of the snapshot object is the same as a second identification code of the original storage object.

9. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the distributed storage data processing method according to any one of claims 1 to 7.

10. A processor, characterized in that the processor is configured to execute a program, wherein the program executes the distributed storage data processing method according to any one of claims 1 to 7.