CN111078653B - Data storage method, system and equipment - Google Patents

Data storage method, system and equipment

Info

Publication number
CN111078653B
Authority
CN
China
Prior art keywords
file
target
storage
data
fragment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911038492.2A
Other languages
Chinese (zh)
Other versions
CN111078653A (en)
Inventor
刘太良
孙细妹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Wangsu Co Ltd
Original Assignee
Xiamen Wangsu Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Wangsu Co Ltd filed Critical Xiamen Wangsu Co Ltd
Priority to CN201911038492.2A priority Critical patent/CN111078653B/en
Publication of CN111078653A publication Critical patent/CN111078653A/en
Application granted granted Critical
Publication of CN111078653B publication Critical patent/CN111078653B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/176Support for shared access to files; File sharing support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/13File access structures, e.g. distributed indices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata
    • G06F16/148File search processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/1734Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data storage method, a system and equipment, wherein the method comprises the following steps: generating a suffix of each fragment data according to the uploading identifier of the target object and the fragment number of each fragment data; according to a target file directory and the target object, a file storage path of the target object is constructed, a self-defined storage directory is created under the file storage path, and all the fragment data are written into the self-defined storage directory according to respective fragment names; after uploading of the fragment data is completed, a virtual complete file is created under the file storage path, and file attributes are configured for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragment rule of the target object and the self-defined storage directory. According to the technical scheme, the object uploaded by the object storage service can be accessed by the file storage system.

Description

Data storage method, system and equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data storage method, system, and device.
Background
With the continuous development of cloud storage technology, more and more traditional enterprises want data in a file system to be accessible through an S3 (Simple Storage Service) interface, or want data uploaded through the S3 interface to be accessible through a traditional file system.
Therefore, a method is needed for sharing data between a file storage system and an object storage system. One prerequisite for such a method is that an object uploaded through the object storage service can be accessed by the file storage system. However, the object storage service uploads data in fragments, whereas the file storage system accesses a file as a whole, and this mismatch makes data sharing difficult. In view of the above, a different data storage method is needed so that an object uploaded through the object storage service can be accessed by the file storage system.
Disclosure of Invention
The application aims to provide a data storage method, a data storage system and data storage equipment, which can enable an object uploaded by an object storage service to be accessed by a file storage system.
In order to achieve the above object, an aspect of the present application provides a data storage method, where a target object to be uploaded is assigned to a bucket, and the bucket and a target file directory are mapped to each other, where the method includes: generating a suffix for each piece of fragment data according to the upload identifier of the target object and the fragment number of each piece of fragment data contained in the target object; wherein the object name of the target object and the suffix of the fragment data form the fragment name of the fragment data; constructing a file storage path of the target object according to the target file directory and the target object, creating a customized storage directory under the file storage path, and writing each piece of fragment data into the customized storage directory according to its fragment name; wherein the customized storage directory is imperceptible to a visitor; and after uploading of the fragment data is completed, creating a virtual complete file under the file storage path and configuring file attributes for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragmentation rule of the target object and the customized storage directory.
In order to achieve the above object, another aspect of the present application further provides a data storage system, where a target object to be uploaded is assigned to a bucket, and the bucket and a target file directory are mapped to each other, where the system includes: a suffix generation unit, configured to generate a suffix for each piece of fragment data according to the upload identifier of the target object and the fragment number of each piece of fragment data contained in the target object; wherein the object name of the target object and the suffix of the fragment data form the fragment name of the fragment data; a data writing unit, configured to construct a file storage path of the target object according to the target file directory and the target object, create a customized storage directory under the file storage path, and write each piece of fragment data into the customized storage directory according to its fragment name; wherein the customized storage directory is imperceptible to a visitor; and a virtual complete file configuration unit, configured to create a virtual complete file under the file storage path after the uploading of the fragment data is completed, and configure file attributes for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragmentation rule of the target object and the customized storage directory.
To achieve the above object, another aspect of the present application further provides a data storage device, which includes a processor and a memory, wherein the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the above data storage method.
As can be seen from the above, according to the technical solutions provided in one or more embodiments of the present application, when a target object is uploaded, each piece of fragment data in the target object may be stored in a customized storage directory under its own fragment name, and a virtual complete file may be created so that the fragment data of the target object can be accessed by a file storage client. In this way, data sharing between an object storage service and a file storage service may be implemented.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a system architecture diagram illustrating data sharing according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating steps of data sharing according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating steps of metadata writing according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a step of uploading fragmented data in an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the steps of file access in an embodiment of the present invention;
FIG. 6 is a flow chart of file access in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, the technical solutions of the present application will be clearly and completely described below with reference to the detailed description of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application are within the scope of protection of the present application.
One embodiment of the present application may provide a data storage method, which may be applied to the system architecture shown in fig. 1. Specifically, in order to implement data sharing between the file storage system and the object storage system, both the object storage client and the file storage client may be connected to the system architecture shown in fig. 1 at the same time, and a user may upload and download data through either client. Of course, as technology advances, the object storage client and the file storage client may be combined into a single client that supports the functions of both object storage and file storage.
In practical applications, the object storage system may manage the uploaded objects through buckets. Generally speaking, the objects in a bucket are managed in a flat manner: each object has only a name and data, and the concept of a directory does not exist. In a file storage system, by contrast, files are usually stored in various directories. To implement data sharing between the two systems, in one embodiment of the present application, the uploaded data may be processed according to the steps shown in fig. 2.
S11: Creating a bucket and establishing a mapping relationship between the bucket and the target file directory.
In this embodiment, after a bucket is created in the object storage system, the bucket may be mapped to a target file directory in the file storage system. Specifically, the bucket name of the bucket may be globally unique in the object storage system, the target file directory may also be globally unique in the file storage system, and after the mapping relationship between the bucket and the target file directory is established, the mapping relationship may be maintained in the system architecture shown in fig. 1. In this way, the metadata of the bucket can be kept synchronized with the metadata under the mapped directory. If the metadata in the bucket changes, the metadata in the target file directory is changed accordingly. Similarly, if the metadata in the target file directory changes, the metadata contained in the bucket is updated synchronously. For example, a bucket BucketA is created in the object storage system and may be mapped to the directory /NAS_DIR/BucketADir in the file storage system. Subsequently, all data in bucket BucketA can be accessed through the directory /NAS_DIR/BucketADir.
It should be noted that data uploaded through either the object storage system or the file storage system may ultimately be stored in the back-end data storage cluster shown in fig. 1, and two copies of the data are not stored for the two systems. However, the metadata may be stored in the object storage system and the file storage system respectively, so that each system can read the corresponding data from the back-end data storage cluster according to its own metadata.
S13: When a target object is uploaded to the bucket through an object storage client, constructing a file storage path of the target object according to the target file directory and the target object, and writing the target object under the file storage path.
In this embodiment, after the mapping relationship between the bucket and the target file directory is established, if the target object is uploaded to the bucket through the object storage client, the target object may be placed into the bucket in the conventional manner of the object storage system. Meanwhile, in order to keep the bucket synchronized with the metadata of the target file directory, a file storage path of the target object in the file storage system may be constructed according to the target file directory and the target object. Specifically, it may be identified whether the object name of the target object includes a prefix directory, and if so, a subdirectory corresponding to the prefix directory may be created in the target file directory. For example, the object name of the target object is dir2/fileA, and the object name includes the prefix directory dir2. In this case, a subdirectory dir2 may be created under the target file directory /NAS_DIR/BucketADir, and a storage path /NAS_DIR/BucketADir/dir2 including the subdirectory is obtained. The storage path containing the subdirectory can be used as the file storage path of the target object. If the object name does not contain a prefix directory, the target file directory can be used directly as the file storage path of the target object.
In this embodiment, after the file storage path of the target object is constructed, the target object can be written under the file storage path. For example, the final location of the target object dir2/fileA, whose object name contains a prefix directory, may be /NAS_DIR/BucketADir/dir2/fileA.
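As a non-limiting illustration of step S13, the path construction can be sketched in Python as follows. The function names `build_file_storage_path` and `final_location` are hypothetical and do not appear in the application; the sketch assumes POSIX-style paths and treats everything before the last "/" in the object name as the prefix directory.

```python
import posixpath


def build_file_storage_path(target_dir: str, object_name: str) -> str:
    """Return the directory under which the target object is written.

    If the object name carries a prefix directory (e.g. "dir2/fileA"),
    that prefix becomes a subdirectory of the mapped target file directory.
    """
    prefix = posixpath.dirname(object_name)  # "" when there is no prefix directory
    return posixpath.join(target_dir, prefix) if prefix else target_dir


def final_location(target_dir: str, object_name: str) -> str:
    """Return the full path of the object inside the file storage system."""
    storage_path = build_file_storage_path(target_dir, object_name)
    return posixpath.join(storage_path, posixpath.basename(object_name))
```

With the values used in this embodiment, `build_file_storage_path("/NAS_DIR/BucketADir", "dir2/fileA")` yields `/NAS_DIR/BucketADir/dir2`, and an object name without a prefix directory maps directly to the target file directory.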
S15: When a target file is written into the target file directory through a file storage client, assigning the target file to the bucket, and constructing a query statement for querying the target file in the bucket according to the bucket and the target file.
In this embodiment, the user may also write the target file into the target file directory through the file storage client. In this case, after the target file is written under the target file directory in the manner defined by the file storage client, the target file may be assigned to the bucket to which the target file directory is mapped, in order to keep the bucket synchronized with the metadata of the target file directory. Specifically, the file storage system may write the file dir2/fileB under the target file directory by the file write command vim /NAS_DIR/BucketADir/dir2/fileB. Then, the file dir2/fileB may be assigned to the mapped bucket BucketA, and according to the bucket BucketA and the target file dir2/fileB, a query statement for querying the target file dir2/fileB in the bucket BucketA may be constructed: BucketA/dir2/fileB. Through this query statement, the target file dir2/fileB can be queried normally in the bucket BucketA.
In one embodiment, the bucket and the target file directory that are mapped to each other may also remain synchronized when data is deleted. Specifically, if a previously uploaded target object is deleted through the object storage client, the target object may be synchronously deleted from the file storage path. If a previously uploaded target file is deleted through the file storage client, the target file can also be synchronously removed from the mapped bucket, and the query statement for querying the target file in the bucket can be set to be invalid. For example, two files dir2/fileA and dir2/fileB originally exist in a bucket and a target file directory that are mapped to each other. If dir2/fileA is deleted through the object storage client, only dir2/fileB remains in the target file directory, and an object query can no longer be performed in the bucket through the query statement BucketA/dir2/fileA.
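The query statement of step S15 and the synchronized deletion described above can be sketched in Python as follows. This is only an in-memory model under stated assumptions: the class `MappedBucket` and the function `query_statement` are hypothetical names, and a real system would keep this state in the metadata services rather than in a Python set.

```python
import posixpath


def query_statement(bucket: str, target_dir: str, file_path: str) -> str:
    """Build "<bucket>/<name relative to the mapped directory>"."""
    relative_name = posixpath.relpath(file_path, target_dir)
    return f"{bucket}/{relative_name}"


class MappedBucket:
    """Toy model of a bucket and its mapped directory sharing one namespace."""

    def __init__(self, bucket: str, target_dir: str):
        self.bucket = bucket
        self.target_dir = target_dir
        self.names = set()  # names relative to the bucket / mapped directory

    def put(self, name: str) -> None:
        self.names.add(name)

    def delete(self, name: str) -> None:
        # Deleting through either client removes the entry for both views.
        self.names.discard(name)

    def lookup(self, statement: str) -> bool:
        """Return True if the query statement still resolves to an object."""
        bucket, _, name = statement.partition("/")
        return bucket == self.bucket and name in self.names
```

For the file /NAS_DIR/BucketADir/dir2/fileB, `query_statement` returns `BucketA/dir2/fileB`; after `delete("dir2/fileA")`, a lookup of `BucketA/dir2/fileA` no longer succeeds.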
Referring to fig. 1 and 3, data may be written to buckets and target file directories that are mapped to each other in a number of steps as follows.
S21: Receiving a write request, and writing target data corresponding to the write request into a back-end data storage cluster according to the target file directory through a data writing service.
In this embodiment, after receiving a write request for target data, an object storage client or a file storage client may first deliver the target data to a data writing service, which writes the target data into a back-end data storage cluster. After the target data has been persisted, the back-end data storage cluster can return an acknowledgment to the data writing service indicating that the data has been written.
Specifically, the target data may be a target object uploaded by the object storage client, or a target file uploaded by the file storage client, and the data writing service may write the target data in the back-end data storage cluster according to the target file directory in the manner of steps S13 and S15.
S23: The data writing service generates metadata of the target data and writes the metadata into a file metadata service and an object metadata service respectively.
In this embodiment, in order to keep the metadata in the buckets and the target file directories mapped to each other synchronized, the data writing service may notify the metadata writing service when writing the target data, and write the metadata of the target data into the file metadata service and the object metadata service, respectively.
Specifically, the metadata may be used to describe the target data, and may include a series of description parameters such as the file storage path and name of the target data, the bucket name of the bucket to which the target data is assigned, the data identifier of the target data, the size of the target data, and the modification time of the target data. In existing data storage systems, usually only one copy of the metadata is written when data is written. If, in the present application, only one copy of the metadata were written, to the file metadata service, then whenever an object needs to be read from a bucket and the size of the data in the bucket needs to be counted, the target file directory to which the bucket is mapped and each subdirectory under it would have to be traversed to recalculate the metadata of the bucket, which increases the data reading time. Therefore, in this embodiment, the metadata can be written into both the file metadata service and the object metadata service, so that each service can refer to its own metadata when data is read, thereby improving the efficiency of data reading.
S25: the file metadata service writes the metadata into the back-end data storage cluster, and the object metadata service writes an operation log corresponding to the metadata into a key-value pair database, so that the key-value pair database processes the operation log to obtain a statistical result of the storage bucket.
In this embodiment, after receiving the metadata, the file metadata service may write the metadata into the back-end data storage cluster. And the object metadata service can write an operation log (op log) corresponding to the metadata into a key-value pair database (kv database). Subsequently, each operation log can be taken out from the key value database by an asynchronous processing method to be processed, so that statistical results of information used for representing the number of objects stored in the storage bucket, the sum of data of each object in the storage bucket and the like are obtained.
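The asynchronous processing of operation logs can be illustrated with the following Python sketch. The record layout `OpLog` and the operation names "put" and "delete" are assumptions made for illustration; the application does not specify the format of an operation log or the interface of the key-value pair database.

```python
from dataclasses import dataclass


@dataclass
class OpLog:
    op: str      # hypothetical operation type: "put" or "delete"
    bucket: str  # bucket the operation applies to
    name: str    # object name
    size: int    # object size in bytes (ignored for "delete")


def bucket_statistics(logs, bucket: str) -> dict:
    """Replay operation logs in order and summarize one bucket."""
    sizes = {}
    for log in logs:
        if log.bucket != bucket:
            continue
        if log.op == "put":
            sizes[log.name] = log.size  # an overwrite replaces the old size
        elif log.op == "delete":
            sizes.pop(log.name, None)
    return {"object_count": len(sizes), "total_size": sum(sizes.values())}
```

Replaying the logs yields the number of objects in the bucket and the sum of their sizes without traversing the mapped directory.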
In one embodiment, after obtaining the statistics of the bucket, the object metadata service may write the statistics into the header information of the bucket. A storage record of the target data may include information such as the data name of the target data, the data size of the target data, and the modification time of the target data.
In one embodiment, the file metadata service may write storage information for the target data into the back-end data storage cluster. Specifically, the storage information may include a data identification of the target data, a data size of the target data, a modification time of the target data, and the like. Wherein, the data identification (inode) of the target data can be a unique number of the target data, and the data identification is not repeated in the file storage system. In addition, since the target data may be uploaded through the object storage client, the target data may include a plurality of fragmented data. For example, the target data is data with a data size of 30M, and 4 pieces of sliced data may be included in the target data, where the data size of the first three pieces of sliced data is 8M, and the data size of the last piece of sliced data may be 6M. In this case, when the file metadata service writes storage information into the backend data storage cluster, the fragmentation rule of the target data may also be written. The fragmentation rule at least can represent the starting number, the starting offset, the fragmentation size and the uploading identifier of each fragmentation data in the target data. For example, 30M target data is uploaded through the object storage client, and the fragment data in the target data is divided according to 8M data size. In this case, the target data corresponds to only one fragmentation rule, which may be as follows:
Starting number = 1, starting offset = 0, fragment size = 8M, upload identifier = 2-123457
According to this fragmentation rule, four pieces of fragment data with sizes of 8M, 8M, 8M and 6M are obtained, and their fragment numbers are 1, 2, 3 and 4 in sequence.
However, in some scenarios, there may be multiple different fragmentation rules for the same piece of target data. For example, a 50M target data may have two fragmentation rules as follows:
Rule 1: starting number = 1, starting offset = 0, fragment size = 6M, upload identifier = 2-123456
Rule 2: starting number = 4, starting offset = 18M, fragment size = 8M, upload identifier = 2-123456
It can be seen that the first three pieces of fragment data of the target data are each 6M in size, and that starting from the fourth piece, each piece of fragment data is 8M in size.
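The way a set of fragmentation rules determines the individual fragments can be checked with a short Python sketch. The names `Rule` and `expand_rules` are hypothetical, and the sketch assumes that each rule applies from its starting offset up to the starting offset of the next rule (or to the end of the data for the last rule), which is consistent with the 30M and 50M examples above.

```python
from dataclasses import dataclass

M = 1024 * 1024  # one mebibyte, standing in for "M" in the examples


@dataclass
class Rule:
    start_number: int   # fragment number of the first fragment under this rule
    start_offset: int   # byte offset at which this rule begins to apply
    fragment_size: int  # nominal size of each fragment under this rule


def expand_rules(rules, total_size: int):
    """Return a list of (fragment_number, offset, size) tuples."""
    ordered = sorted(rules, key=lambda r: r.start_offset)
    fragments = []
    for i, rule in enumerate(ordered):
        # A rule ends where the next rule starts, or at the end of the data.
        end = ordered[i + 1].start_offset if i + 1 < len(ordered) else total_size
        number, offset = rule.start_number, rule.start_offset
        while offset < end:
            size = min(rule.fragment_size, end - offset)  # last fragment may be short
            fragments.append((number, offset, size))
            number += 1
            offset += size
    return fragments
```

For the 30M example, the single rule expands to fragments 1 to 4 with sizes 8M, 8M, 8M and 6M. For the 50M example, the two rules expand to fragments 1 to 3 of 6M each followed by fragments 4 to 7 of 8M each.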
It should be noted that the fragmented data in the same target data may be uploaded by different clients, and different uploading identifiers may be allocated to different clients to distinguish the different clients. Therefore, even if the clients upload the fragment data in the same target data, the clients can be distinguished through different uploading identifications.
In one embodiment, in order to enable the fragmented data uploaded by the object storage client to be normally accessed by the file storage client, storage and access of the fragmented data may be implemented by creating a virtual full file and a customized storage directory under the target file directory. The customized storage directory may be a shadow directory and is set to be imperceptible to the visitor. Specifically, referring to fig. 4, the storage of the target object including a plurality of fragment data may be performed according to the following steps.
S31: Generating a suffix for each piece of fragment data according to the upload identifier of the target object and the fragment number of each piece of fragment data; wherein the object name of the target object and the suffix of the fragment data constitute the fragment name of the fragment data.
In this embodiment, the fragment data included in the target object may be numbered in sequence, and for the same target object, every piece of fragment data may correspond to the same upload identifier within one upload process. In order to distinguish different pieces of fragment data, a suffix may be generated for each piece according to the upload identifier and the fragment number. Specifically, the combination of the upload identifier and the fragment number may be used as the suffix of the fragment data. For example, if the upload identifier of the target object is 2-123456 and the target object includes 4 pieces of fragment data with fragment numbers 1, 2, 3 and 4, the suffixes of the 4 pieces of fragment data are 2-123456.1, 2-123456.2, 2-123456.3 and 2-123456.4, respectively. Of course, if all the fragment data are uploaded by the same client, the upload identifier may be omitted and the fragment number used directly as the suffix of the fragment data.
In this embodiment, after the suffix of the fragment data is determined, the combination of the object name of the target object and the suffix of the fragment data may be used as the fragment name of the fragment data. Here, the object name is usually the name without the prefix directory. For example, if the object name of the uploaded target object is fileA, the fragment name of the first piece of fragment data may be fileA.2-123456.1. If the object name of the uploaded target object is dir2/fileA, the prefix directory dir2 may be ignored, and the fragment name of the first piece of fragment data may still be fileA.2-123456.1.
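The naming scheme of step S31 can be sketched in Python as follows. The function names are hypothetical, and the "." separators follow the example fragment name fileA.2-123456.1 given above.

```python
import posixpath
from typing import Optional


def fragment_suffix(fragment_number: int, upload_id: Optional[str] = None) -> str:
    """Combine the upload identifier and the fragment number into a suffix.

    When all fragments come from the same client, the upload identifier
    may be omitted and the fragment number alone serves as the suffix.
    """
    if upload_id is None:
        return str(fragment_number)
    return f"{upload_id}.{fragment_number}"


def fragment_name(object_name: str, fragment_number: int,
                  upload_id: Optional[str] = None) -> str:
    """Build the fragment name from the object name without its prefix directory."""
    base_name = posixpath.basename(object_name)  # "dir2/fileA" -> "fileA"
    return f"{base_name}.{fragment_suffix(fragment_number, upload_id)}"
```

Both `fragment_name("fileA", 1, "2-123456")` and `fragment_name("dir2/fileA", 1, "2-123456")` give `fileA.2-123456.1`, because the prefix directory is ignored when the fragment name is formed.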
Whether the prefix directory is carried or not is determined by the object name of the uploaded target object. In some scenarios, to facilitate bulk management of data, a prefix directory may be added to uploaded target objects, such that these target objects may ultimately be stored under the prefix directory. In some scenarios, if the target object needs to be stored in the mapped target file directory, the prefix directory does not need to be carried.
S33: Constructing a file storage path of the target object according to the target file directory and the target object, creating a customized storage directory under the file storage path, and writing each piece of fragment data into the customized storage directory according to its fragment name.
In this embodiment, the fragmented data is generally invisible to the user for the file storage system, and therefore cannot be written directly under the file storage path. To address this problem, a custom storage directory may be created under the file storage path, which may not be visible to the user. Thus, the self-defined storage directory can be used as a subdirectory of the file storage path, and each fragment data can be written into the self-defined storage directory according to the respective name.
For example, if the object name of the currently uploaded target object is dir2/fileA, the file storage path constructed according to step S13 may be /NAS_DIR/BucketADir/dir2, and a customized storage directory may be created under the file storage path. After the fragment data with the fragment name fileA.2-123456.1 is written under this path, its specific location may be /NAS_DIR/BucketADir/dir2/.NAS.
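Step S33 can be illustrated with the following Python sketch, which writes fragments into a subdirectory of the file storage path. The directory name `.shadow_dir` and the convention of hiding it with a leading dot are assumptions made for illustration only; the application states only that the customized storage directory is imperceptible to the visitor, not how this is achieved.

```python
import os
import tempfile


def write_fragments(file_storage_path: str, custom_dir: str, fragments: dict) -> list:
    """Write each fragment (name -> bytes) into the customized storage directory."""
    shadow_path = os.path.join(file_storage_path, custom_dir)
    os.makedirs(shadow_path, exist_ok=True)
    written = []
    for name, data in fragments.items():
        path = os.path.join(shadow_path, name)
        with open(path, "wb") as handle:
            handle.write(data)
        written.append(path)
    return written


def visible_entries(directory: str) -> list:
    """List entries a visitor would see, assuming dot-prefixed names are hidden."""
    return sorted(e for e in os.listdir(directory) if not e.startswith("."))


def demo():
    """Write two fragments under a temporary stand-in for the file storage path."""
    with tempfile.TemporaryDirectory() as root:
        storage_path = os.path.join(root, "dir2")
        os.makedirs(storage_path)
        fragments = {"fileA.2-123456.1": b"first", "fileA.2-123456.2": b"second"}
        write_fragments(storage_path, ".shadow_dir", fragments)
        hidden = sorted(os.listdir(os.path.join(storage_path, ".shadow_dir")))
        return visible_entries(storage_path), hidden
```

In this sketch the fragments exist on disk under the customized storage directory, while a listing of the file storage path that skips hidden entries shows nothing until the virtual complete file is created in step S35.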
S35: after uploading of the fragment data is completed, a virtual complete file is created under the file storage path, and file attributes are configured for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragment rule of the target object and the self-defined storage directory.
In this embodiment, after each piece of fragment data of the target object is written into the customized storage directory in the above manner, a virtual complete file may be created under the file storage path so that the user can access the target object normally through the file storage client. For example, in the example of S33, a virtual complete file may be created under the file storage path /NAS_DIR/BucketADir/dir2. The virtual complete file is not a real file; it can be regarded as an entry point, presented in the form of a file, for accessing the fragment data under the customized storage directory. Specifically, after the virtual complete file is created, file attributes may be configured for it, and the file attributes may be used to represent information such as the actual data size of the target object, the fragmentation rule of the target object, and the customized storage directory. The file attributes of the virtual complete file may be written by the file metadata service into the back-end data storage cluster as part of the metadata.
For example, for an uploaded 50M target object, the file attributes of the corresponding virtual complete file may include the following:
Fragmentation rule set:
Rule 1: starting number = 1, starting offset = 0, fragment size = 6M
Rule 2: starting number = 4, starting offset = 18M, fragment size = 8M
Upload identifier: 2-123456
Bucket mapping directory: /NAS_DIR/BucketADir
Customized storage directory: shadow_dir
Actual data size: 50M
Object name: dir2/fileA
Of course, the file attribute may also contain more contents as needed, which is not illustrated here.
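The file attributes listed above can be modelled as a small data structure, sketched here in Python. The class and field names are hypothetical, and serializing to JSON is only one possible representation; the application does not specify how the attributes are encoded when the file metadata service writes them to the back-end data storage cluster.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

M = 1024 * 1024  # one mebibyte, standing in for "M" in the example


@dataclass
class FragmentRule:
    start_number: int
    start_offset: int
    fragment_size: int


@dataclass
class VirtualFileAttributes:
    rules: List[FragmentRule]
    upload_id: str
    bucket_mapping_dir: str
    custom_storage_dir: str
    actual_size: int
    object_name: str


def example_attributes() -> VirtualFileAttributes:
    """Attributes of the 50M example object from this embodiment."""
    return VirtualFileAttributes(
        rules=[FragmentRule(1, 0, 6 * M), FragmentRule(4, 18 * M, 8 * M)],
        upload_id="2-123456",
        bucket_mapping_dir="/NAS_DIR/BucketADir",
        custom_storage_dir="shadow_dir",
        actual_size=50 * M,
        object_name="dir2/fileA",
    )


def to_json(attrs: VirtualFileAttributes) -> str:
    """Serialize the attributes, including nested rules, to a JSON string."""
    return json.dumps(asdict(attrs), sort_keys=True)
```

Because the virtual complete file carries no data of its own, these attributes are what allows a reader to locate and order the fragments and to report the actual size of the object.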
In one embodiment, after the virtual complete file is created and its file attributes are configured, the fragment data in the customized storage directory can be accessed normally through the file storage client. Specifically, referring to Figs. 5 and 6, the file access process may include the following steps.
S41: The file storage client receives a file access request and queries whether the target file pointed to by the file access request exists.
S43: If the target file exists, the target file attributes of the target file are read, and a fragment data list of the target file is generated according to the fragmentation rule and the customized storage directory in the target file attributes, wherein the customized storage directory is imperceptible to a visitor.
In this embodiment, after receiving a file access request directed to a target file, a file storage client may first query whether the target file exists in a back-end data storage cluster. If not, an error prompt message can be fed back. If so, the target file may be read from the back-end data storage cluster and provided to the user.
Specifically, part of the data in the back-end data storage cluster may be files uploaded through the file storage client, and another part may be objects uploaded through the object storage client. From the perspective of the file storage client, both appear as files, except that some are real files and others are virtual complete files generated according to the above steps. For a real file, the file storage client can read it from the back-end data storage cluster in the existing manner and provide it to the user. For a virtual complete file, the file storage client may read the target file attributes of the target file, which may include the information listed above, and may generate the fragment data list of the target file according to the fragmentation rule and the customized storage directory therein.
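The distinction between a missing file, a real file, and a virtual complete file can be sketched as a small dispatch function. This is only an illustrative sketch: the dictionary `metadata` is a hypothetical in-memory stand-in for the metadata kept in the back-end data storage cluster, and the attribute keys, paths, and return labels are assumptions made for the example.

```python
class TargetFileNotFound(Exception):
    """Raised when the requested target file does not exist."""


def resolve_access(metadata: dict, path: str) -> str:
    """Decide how a file access request for `path` should be served."""
    attrs = metadata.get(path)
    if attrs is None:
        # The target file does not exist, so an error is fed back.
        raise TargetFileNotFound(path)
    if "fragment_rules" in attrs and "custom_storage_dir" in attrs:
        # Virtual complete file: the content lives in the fragment data.
        return "assemble-from-fragments"
    # Real file: it can be read in the existing manner.
    return "read-directly"


# Hypothetical metadata for one real file and one virtual complete file.
metadata = {
    "/nas/bucket_a/report.txt": {"size": 1024},
    "/nas/bucket_a/dir2/fileA": {
        "size": 50 * 1024 * 1024,
        "fragment_rules": [(1, 0, 6 * 1024 * 1024)],
        "custom_storage_dir": ".fragments",
    },
}

print(resolve_access(metadata, "/nas/bucket_a/report.txt"))
print(resolve_access(metadata, "/nas/bucket_a/dir2/fileA"))
```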
In particular, a file directory corresponding to the target file, which may be the file directory to which the bucket is mapped, may be identified from the target file attributes. For example, the file directory may be /NAS_DIR/BucketADir in the example of step S35. Then, a storage path of the target file may be generated according to the identified file directory and the file name of the target file. The storage path may be the file storage path generated in step S13. Specifically, it may be determined whether the file name of the target file contains a prefix directory. If it does, the prefix directory is used as a subdirectory of the file directory, and the storage path containing the subdirectory is used as the storage path of the target file. If it does not, the file directory itself can be used as the storage path of the target file. Continuing with the example of step S35, the generated storage path of the target file may be /NAS_DIR/BucketADir/dir2. Then, the customized storage directory in the target file attributes can be used as a subdirectory of the storage path to generate the storage path of each piece of fragment data in the target file. The storage path of each piece of fragment data may be, for example, /NAS_DIR/BucketADir/dir2/.NAS. Each piece of fragment data in the target file is stored under the customized storage directory. Next, the suffix of each piece of fragment data may be determined according to the fragmentation rule in the target file attributes, and the file name of the target file together with the determined suffix may constitute the fragment name of the fragment data. For this part, reference may be made to the description of step S33, and details are not repeated here.
After the fragment name of each piece of fragment data is obtained, the storage path of the fragment data and the fragment name of the fragment data may be combined into the storage address of the fragment data; the storage address may be, for example, /NAS_DIR/BucketADir/dir2/.NAS. In this way, the storage addresses of all pieces of fragment data form the fragment data list of the target file.
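The path construction described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the directory `/nas/bucket_a`, the customized directory name `.fragments`, and the upload identifier `u1` are all hypothetical, and the suffix layout `<upload identifier>.<fragment number>` as well as the use of the base file name in the fragment name are assumptions, since the actual layout is defined in step S33.

```python
import posixpath
from typing import Iterable, List


def target_storage_path(file_dir: str, file_name: str) -> str:
    """Mapped file directory plus the prefix directory of the file name, if any."""
    prefix = posixpath.dirname(file_name)
    return posixpath.join(file_dir, prefix) if prefix else file_dir


def fragment_data_list(file_dir: str, file_name: str, custom_dir: str,
                       upload_id: str, fragment_numbers: Iterable[int]) -> List[str]:
    """Build the storage address of every piece of fragment data."""
    # The customized storage directory is a subdirectory of the storage path.
    fragment_dir = posixpath.join(target_storage_path(file_dir, file_name), custom_dir)
    base_name = posixpath.basename(file_name)
    addresses = []
    for number in fragment_numbers:
        # Assumed suffix layout: "<upload identifier>.<fragment number>".
        fragment_name = f"{base_name}.{upload_id}.{number}"
        addresses.append(posixpath.join(fragment_dir, fragment_name))
    return addresses


addresses = fragment_data_list("/nas/bucket_a", "dir2/fileA", ".fragments", "u1", range(1, 4))
for address in addresses:
    print(address)
```

A file name without a prefix directory, such as `fileB`, would simply resolve to the mapped file directory itself.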
S45: Each piece of fragment data contained in the fragment data list is read sequentially, and the read fragment data is integrated into a complete file, which serves as the response to the file access request.
In this embodiment, a fragment data list of the target file may be obtained according to the file attribute of the virtual complete file, and the fragment data list may point to each fragment data in the customized storage directory. Therefore, the file storage client can sequentially read each fragment data according to the storage address of each fragment data in the fragment data list, integrate the read fragment data into a complete file, and finally provide the complete file to a user as a response of a file access request.
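A minimal sketch of this sequential read-and-integrate step is shown below, assuming the fragment data list is an ordered list of local file paths. The function name `assemble` and the throwaway fragment files are hypothetical; a real client would read from the back-end data storage cluster rather than from a temporary directory.

```python
import os
import shutil
import tempfile
from typing import Iterable


def assemble(fragment_paths: Iterable[str], output_path: str) -> int:
    """Concatenate fragments in list order and return the number of bytes written."""
    total = 0
    with open(output_path, "wb") as out:
        for path in fragment_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # stream the fragment without loading it whole
            total += os.path.getsize(path)
    return total


# Usage with three small throwaway fragments.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for number, chunk in enumerate([b"abc", b"def", b"gh"], start=1):
        path = os.path.join(tmp, f"fileA.{number}")
        with open(path, "wb") as f:
            f.write(chunk)
        paths.append(path)
    complete_path = os.path.join(tmp, "fileA")
    written = assemble(paths, complete_path)
    with open(complete_path, "rb") as f:
        complete_data = f.read()

print(written, complete_data)
```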
As can be seen from the above, an object uploaded by an object storage client can be stored in file form by means of the customized storage directory, and can be accessed by the file storage client by means of the virtual complete file; likewise, a file uploaded by the file storage client can be processed into fragments and subsequently accessed normally by the object storage client. Data sharing between the object storage service and the file storage service is thereby achieved.
The advantage of this approach is as follows. In a conventional file storage system, a file with a large data size can only be uploaded as a whole by a single client. With the above processing, the file can be divided into a plurality of pieces of fragment data, which are then uploaded through a plurality of object storage clients. This increases the upload speed of large files, and because files uploaded in fragments can still be accessed normally by the file storage client, the efficiency of file storage and access is improved.
In one embodiment, since the data intercommunication between the object storage service and the file storage service is realized in the above manner, the file storage client can realize the fragment downloading of the file without downloading the complete file, thereby saving the bandwidth. Specifically, the received file download request may carry an interval parameter of the target file, where the interval parameter may be a start data volume and an end data volume, and the start data volume and the end data volume may define a file segment to be downloaded. In addition, the interval parameter may also be a starting fragment number and an ending fragment number, and one or more fragment data defined by the starting fragment number and the ending fragment number may serve as a file fragment to be downloaded. In this way, the file segment to be downloaded in the target file can be determined according to the interval parameter, so that the file segment can be provided to the initiator of the file download request.
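Both forms of the interval parameter can be sketched as follows, assuming a fragment is described by its number, its offset within the file, and its size. The function names are hypothetical, and treating the start and end data amounts as a half-open byte interval `[start, end)` is an assumption made for the example.

```python
from typing import List, Tuple

Fragment = Tuple[int, int, int]  # (fragment number, offset in file, size)


def slices_for_byte_range(fragments: List[Fragment], start: int,
                          end: int) -> List[Tuple[int, int, int]]:
    """Map the byte interval [start, end) to (number, offset inside fragment, length)."""
    result = []
    for number, offset, size in fragments:
        lo = max(start, offset)
        hi = min(end, offset + size)
        if lo < hi:  # the fragment overlaps the requested interval
            result.append((number, lo - offset, hi - lo))
    return result


def fragments_for_number_range(fragments: List[Fragment], first: int,
                               last: int) -> List[Fragment]:
    """Select whole fragments whose numbers lie between first and last, inclusive."""
    return [f for f in fragments if first <= f[0] <= last]


# Small illustrative layout (sizes in arbitrary units rather than megabytes).
layout = [(1, 0, 6), (2, 6, 6), (3, 12, 6), (4, 18, 8)]
byte_slices = slices_for_byte_range(layout, 4, 20)
whole_fragments = fragments_for_number_range(layout, 2, 3)
print(byte_slices)
print(whole_fragments)
```

Only the fragments that overlap the requested interval need to be read, which is where the bandwidth saving comes from.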
An embodiment of the present application further provides a data storage system, in which a target object to be uploaded is assigned to a bucket, and the bucket and a target file directory are mapped to each other, the system including:
a suffix generation unit, configured to generate a suffix for each piece of fragment data according to the upload identifier of the target object and the fragment number of each piece of fragment data included in the target object, wherein the object name of the target object and the suffix of the fragment data form the fragment name of the fragment data;
a data writing unit, configured to construct a file storage path of the target object according to the target file directory and the target object, create a customized storage directory under the file storage path, and write each piece of fragment data into the customized storage directory under its respective fragment name; and
a virtual complete file configuration unit, configured to create a virtual complete file under the file storage path after uploading of the fragment data is completed, and to configure file attributes for the virtual complete file, wherein the file attributes at least represent the actual data size of the target object, the fragmentation rule of the target object, and the customized storage directory.
An embodiment of the present application further provides a data storage device, which includes a processor and a memory, where the memory is used to store a computer program, and the computer program can implement the data storage method described above when executed by the processor.
In this embodiment, the memory may include a physical device for storing information; typically, the information is digitized and then stored in a medium using an electrical, magnetic, or optical method. The memory according to this embodiment may include: devices that store information using electrical energy, such as RAM or ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, or USB disks; and devices that store information optically, such as CDs or DVDs. Of course, there are also other types of memory, such as quantum memory or graphene memory.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth.
As can be seen from the above, according to the technical solutions provided in one or more embodiments of the present application, a bucket in an object storage system may be mapped to a target file directory in a file storage system, and the metadata of the mutually mapped bucket and target file directory may be kept synchronized. Specifically, a target object or a target file may be uploaded through an object storage client or a file storage client. When a target object is uploaded to the bucket through the object storage client, a file storage path of the target object can be constructed according to the target file directory and the target object, and the target object is written under the file storage path. When a target file is written into the target file directory through the file storage client, the target file can be assigned to the mapped bucket, and a query statement for querying the target file in the bucket is constructed according to the bucket and the target file. In this way, whether data is uploaded through the object storage system or the file storage system, it is written under the target file directory only once and does not need to be written repeatedly. Because the bucket and the target file directory are mapped to each other, uploaded data is both assigned to the bucket and written into the mapped target file directory, and can be queried through the query statement constructed for the bucket, so that data sharing between the file storage system and the object storage system is achieved efficiently.
In addition, according to the technical solutions provided in one or more embodiments of the present application, when a target object is uploaded, each piece of fragment data of the target object may be stored in a customized storage directory under its own fragment name, and the fragment data of the target object may be made accessible to a file storage client by creating a virtual complete file, so that data sharing between the object storage service and the file storage service can be achieved.
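The two-way mapping between a bucket and its target file directory can be illustrated with a short Python sketch. The mapping table, the bucket name `bucket-a`, and the directory `/nas/bucket_a` are hypothetical, and modelling the "query statement" as the pair of bucket name and object name is only one possible reading, adopted here as an assumption.

```python
import posixpath
from typing import Tuple

# Hypothetical mapping between buckets and target file directories.
BUCKET_TO_DIR = {"bucket-a": "/nas/bucket_a"}


def object_to_file_path(bucket: str, object_name: str) -> str:
    """File storage path for an object uploaded through the object storage client."""
    return posixpath.join(BUCKET_TO_DIR[bucket], object_name)


def file_path_to_object(path: str) -> Tuple[str, str]:
    """(bucket, object name) for a file written through the file storage client."""
    for bucket, directory in BUCKET_TO_DIR.items():
        if path.startswith(directory + "/"):
            return bucket, posixpath.relpath(path, directory)
    raise KeyError(f"no bucket is mapped to a directory containing {path}")


file_path = object_to_file_path("bucket-a", "dir2/fileA")
bucket_and_object = file_path_to_object("/nas/bucket_a/dir2/fileA")
print(file_path)
print(bucket_and_object)
```

Because both directions resolve to the same location under the target file directory, the data only has to be written once regardless of which client uploaded it.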
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the embodiments of the system and the device, reference may be made to the description of the method embodiments above.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an embodiment of the present application, and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (14)

1. A data storage method is characterized in that a target object to be uploaded is divided into buckets, and the buckets and a target file directory are mapped with each other, wherein the method comprises the following steps:
generating suffixes of the fragment data according to the uploading identification of the target object and the fragment numbers of the fragment data contained in the target object; wherein the object name of the target object and the suffix of the fragment data form the fragment name of the fragment data;
according to the target file directory and the target object, a file storage path of the target object is constructed, a self-defined storage directory is created under the file storage path, and all the fragment data are written into the self-defined storage directory according to the respective fragment names; wherein the custom storage directory is imperceptible to a visitor;
after uploading of the fragment data is completed, a virtual complete file is created under the file storage path, and file attributes are configured for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragment rule of the target object and the self-defined storage directory.
2. The method of claim 1, wherein after configuring the virtually complete file with file attributes, the method further comprises:
receiving a file access request, and inquiring whether a target file pointed by the file access request exists or not;
if the target file exists, reading the target file attribute of the target file, and generating a fragment data list of the target file according to a fragment rule in the target file attribute and a self-defined storage directory;
and sequentially reading each fragment data contained in the fragment data list, and integrating the read fragment data into a complete file so as to take the complete file as a response of the file access request.
3. The method of claim 2, wherein generating the sharded data list of the target file according to the sharding rules in the target file attributes and the customized storage directory comprises:
identifying a file directory corresponding to the target file from the target file attribute, and generating a storage path of the target file according to the identified file directory and the file name of the target file;
taking the self-defined storage directory in the target file attribute as a subdirectory of the storage path to generate a storage path of each fragment data in the target file;
determining suffixes of all fragment data according to the fragment rules in the target file attributes, wherein the file name of the target file and the determined suffixes form the fragment name of the fragment data;
and combining the storage path of the fragment data and the fragment name of the fragment data into the storage address of the fragment data, wherein the storage address of each fragment data forms a fragment data list of the target file.
4. The method of claim 3, wherein generating the storage path for the target file comprises:
judging whether the file name of the target file contains a prefix directory or not, if so, taking the prefix directory as a subdirectory of the file directory, and taking a storage path containing the subdirectory as a storage path of the target file;
and if the prefix directory is not included, taking the file directory as a storage path of the target file.
5. The method of claim 3, wherein determining suffixes for respective sliced data comprises:
and determining the fragment number of each fragment data according to the fragment rule of the target file attribute, and taking the fragment number as a suffix of the fragment data.
6. The method of claim 1, wherein constructing a file storage path for the target object based on the target file directory and the target object comprises:
and identifying whether the object name of the target object contains a prefix directory, if so, taking the prefix directory as a subdirectory under the target file directory, and taking a storage path containing the subdirectory as a file storage path of the target object.
7. The method of claim 1, further comprising:
when a target file is written into the target file directory through a file storage client, dividing the target file into the buckets, and constructing query statements for querying the target file in the buckets according to the buckets and the target file.
8. The method of claim 7, further comprising:
if the target object is deleted through the object storage client, deleting the target object from the file storage path;
and if the target file is deleted through the file storage client, removing the target file from the storage bucket, and setting the query statement for querying the target file in the storage bucket as invalid.
9. The method of claim 7, wherein when uploading a target object or writing a target file, the method further comprises:
generating metadata of the target object or the target file, and writing the metadata into a file metadata service and an object metadata service respectively; the file metadata service writes the metadata into a back-end data storage cluster for storing the target object and/or the target file, and the object metadata service writes an operation log corresponding to the metadata into a key-value pair database, so that the key-value pair database processes the operation log to obtain a statistical result of the storage bucket.
10. The method of claim 9, further comprising:
the object metadata service writes the statistics of the buckets into header information of the buckets and generates storage records for the target objects and/or the target files, and writes the storage records into an object collection set of the buckets.
11. The method of claim 9, further comprising:
the file metadata service writes storage information of the target object and/or the target file into the backend data storage cluster, wherein the storage information is used for representing at least one of data identification, fragment size and modification time of the target object and/or the target file;
if the target object comprises a plurality of fragment data, the file metadata service further writes a fragment rule of the target object into the back-end data storage cluster, wherein the fragment rule is at least used for representing a start number, a start offset, a fragment size and an uploading identifier of each fragment data in the target object.
12. The method of claim 1, further comprising:
receiving a file downloading request, wherein the file downloading request comprises interval parameters of a target file;
and determining a file segment to be downloaded in the target file according to the interval parameter, and providing the file segment to an initiator of the file downloading request.
13. A data storage system, wherein a target object to be uploaded is partitioned into buckets, the buckets mapped to target file directories, the system comprising:
a suffix generation unit, configured to generate a suffix for each piece of fragment data according to the upload identifier of the target object and the fragment number of each piece of fragment data included in the target object; wherein the object name of the target object and the suffix of the fragment data form the fragment name of the fragment data;
the data writing unit is used for constructing a file storage path of the target object according to the target file directory and the target object, creating a self-defined storage directory under the file storage path, and writing each piece of fragment data into the self-defined storage directory according to the respective fragment name; wherein the custom storage directory is imperceptible to a visitor;
and the virtual complete file configuration unit is used for creating a virtual complete file under the file storage path after the uploading of the fragment data is finished, and configuring file attributes for the virtual complete file, wherein the file attributes are at least used for representing the actual data size of the target object, the fragment rule of the target object and the self-defined storage directory.
14. A data storage device, characterized in that the device comprises a processor and a memory for storing a computer program which, when executed by the processor, carries out the method according to any one of claims 1 to 12.
CN201911038492.2A 2019-10-29 2019-10-29 Data storage method, system and equipment Active CN111078653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911038492.2A CN111078653B (en) 2019-10-29 2019-10-29 Data storage method, system and equipment

Publications (2)

Publication Number Publication Date
CN111078653A CN111078653A (en) 2020-04-28
CN111078653B true CN111078653B (en) 2023-03-24

Family

ID=70310578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911038492.2A Active CN111078653B (en) 2019-10-29 2019-10-29 Data storage method, system and equipment

Country Status (1)

Country Link
CN (1) CN111078653B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant