CN114327285A

CN114327285A - Data storage method, device, equipment and storage medium

Info

Publication number: CN114327285A
Application number: CN202111658879.5A
Authority: CN
Inventors: 郑传义; 苗功勋; 杨生飞; 王敏; 徐国龙
Original assignee: BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD; Nanjing Zhongfu Information Technology Co Ltd; Zhongfu Information Co Ltd; Zhongfu Safety Technology Co Ltd
Current assignee: BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD; Nanjing Zhongfu Information Technology Co Ltd; Zhongfu Information Co Ltd; Zhongfu Safety Technology Co Ltd
Priority date: 2021-12-30
Filing date: 2021-12-30
Publication date: 2022-04-12

Abstract

The application provides a data storage method, a data storage device, data storage equipment and a data storage medium, and relates to the technical field of data storage. The method comprises the following steps: splitting a file to be stored into a plurality of file blocks; generating metadata corresponding to the file to be stored according to the identification information of each file block; generating and storing a metadata file corresponding to the file to be stored according to the metadata and other file information; and transmitting each file block of the file to be stored to a public cloud storage for storage so that a user can download the file to be stored on the public cloud storage according to the metadata file. Compared with the prior art, the problem that data security cannot be guaranteed due to the fact that the data are stored on a public cloud is avoided.

Description

Data storage method, device, equipment and storage medium

Technical Field

The present application relates to the field of data storage technologies, and in particular, to a data storage method, apparatus, device, and storage medium.

Background

In the field of data storage, cloud computing is mainly divided into private clouds, public clouds, and multi-cloud deployments.

In the prior art, public clouds are widely used because of their advantages: low cost (no hardware or software needs to be purchased, and payment is made only for the services used), low maintenance (maintenance is provided by the service provider), nearly unlimited scalability (resources are provided on demand to meet business requirements), and high reliability (multiple servers protect against failures).

However, data on the public cloud may be leaked; that is, the security of data stored on the public cloud is low, and there is no guarantee that such data will not be leaked.

Disclosure of Invention

An object of the present application is to provide a data storage method, an apparatus, a device, and a storage medium, to solve the problem in the prior art that data stored in a public cloud may be leaked.

In order to achieve the above purpose, the technical solutions adopted in the embodiments of the present application are as follows:

in a first aspect, an embodiment of the present application provides a data storage method, which is applied to a storage gateway of a private cloud, where the method includes:

splitting a file to be stored into a plurality of file blocks;

generating metadata corresponding to the file to be stored according to the identification information of each file block;

generating and storing a metadata file corresponding to the file to be stored according to the metadata and other file information;

and transmitting each file block of the file to be stored to a public cloud storage for storage so that a user can download the file to be stored on the public cloud storage according to the metadata file.

Optionally, the splitting the file to be stored into a plurality of file blocks includes:

and splitting the file to be stored into a plurality of file blocks according to a preset file block size.

Optionally, before generating the metadata corresponding to the file to be stored according to the identification information of each file block, the method further includes:

and generating identification information corresponding to each file block according to a preset digest algorithm.

Optionally, after generating the metadata file corresponding to the file to be stored according to the metadata and the other file information, the method further includes:

and generating preset identification information corresponding to the metadata file according to a preset digest algorithm.

Optionally, before splitting the file to be stored into a plurality of file blocks, the method further includes:

constructing a preset logical file system;

and acquiring the file to be stored uploaded to the preset logical file system by a user.

Optionally, after generating the metadata corresponding to the file to be stored according to the identification information of each file block, the method further includes:

and storing the metadata to the directory location, in the preset logical file system, corresponding to the file to be stored uploaded by the user.

Optionally, the method further includes:

performing data encryption processing on each file block to obtain the encrypted file block, wherein the data encryption processing comprises data encryption or data obfuscation;

and determining a target container of each file block in a public cloud according to the identification information of each file block, and writing the encrypted file block into the corresponding target container.

In a second aspect, another embodiment of the present application provides a data storage apparatus, including a splitting module, a generating module, and a transmission module, wherein:

the splitting module is used for splitting the file to be stored into a plurality of file blocks;

the generating module is used for generating metadata corresponding to the file to be stored according to the identification information of each file block; generating and storing a metadata file corresponding to the file to be stored according to the metadata and other file information;

the transmission module is used for transmitting each file block of the file to be stored to a public cloud storage for storage, so that a user can download the file to be stored on the public cloud storage according to the metadata file.

Optionally, the splitting module is specifically configured to split the file to be stored into a plurality of file blocks according to a preset file block size.

Optionally, the generating module is specifically configured to generate identification information corresponding to each file block according to a preset digest algorithm.

Optionally, the generating module is specifically configured to generate preset identification information corresponding to the metadata file according to a preset digest algorithm.

Optionally, the apparatus further comprises a construction module and an acquisition module, wherein:

the construction module is used for constructing a preset logical file system;

the acquisition module is used for acquiring the file to be stored uploaded to the preset logical file system by a user.

Optionally, the storage module is specifically configured to store the metadata to the directory location, in the preset logical file system, corresponding to the file to be stored uploaded by the user.

Optionally, the apparatus further comprises a determining module, wherein:

the acquisition module is specifically configured to perform data encryption processing on each file block to obtain the encrypted file block, where the data encryption processing includes data encryption or data obfuscation;

the determining module is configured to determine a target container of each file block in a public cloud according to the identification information of each file block, and write the encrypted file block into the corresponding target container.

In a third aspect, another embodiment of the present application provides a data storage device, including: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the data storage device is operated, the processor executing the machine-readable instructions to perform the steps of the method according to any one of the first aspect.

In a fourth aspect, another embodiment of the present application provides a storage medium having a computer program stored thereon, where the computer program is executed by a processor to perform the steps of the method according to any one of the above first aspects.

The beneficial effects of this application are as follows. The data storage method divides the file to be stored into a metadata file and a file entity, where the metadata file describes the identification information of the plurality of file blocks of the file to be stored and the basic information of the file to be stored. The metadata file is stored in a private cloud, and the file blocks of the file to be stored are stored in a public cloud, so that the metadata and the file entity are stored separately and the security of the data is ensured. In addition, because the public cloud storage space is expandable, the expandability of data storage is ensured on the basis of ensuring data security, and the possibility of data loss is avoided.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.

Fig. 1 is a schematic structural diagram of a hybrid cloud system according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of a data storage method according to an embodiment of the present application;

Fig. 3 is a schematic diagram of file splitting according to an embodiment of the present application;

Fig. 4 is a schematic flowchart of a data storage method according to another embodiment of the present application;

Fig. 5 is a schematic structural diagram of a preset logical file system according to an embodiment of the present application;

Fig. 6 is a schematic flowchart of a data storage method according to another embodiment of the present application;

Fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a data storage device according to another embodiment of the present application;

Fig. 9 is a schematic structural diagram of a data storage device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.

The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

Additionally, the flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts may be performed out of order, and steps that have no logical dependency on one another may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowcharts.

The data storage method provided by the embodiment of the present application is explained below with reference to a plurality of specific application examples.

The method provided by the application is applied to a hybrid cloud scenario, where a hybrid cloud is a type of cloud computing that combines a local infrastructure (or private cloud) with a public cloud. In the embodiment provided by the application, the hybrid cloud scenario comprises a public cloud and a private cloud. Data stored in the private cloud generally has higher data security, and the public cloud has the advantages of low cost, no need for maintenance, nearly unlimited scalability, high reliability, and the like.

Fig. 1 is a schematic structural diagram of a hybrid cloud system according to an embodiment of the present application. As shown in Fig. 1, the private cloud is used to obtain user requests and includes a storage gateway, through which data transmission between the private cloud and the public cloud is implemented. The public cloud includes a plurality of containers for storing data uploaded by users; when the remaining storage capacity of the containers is insufficient, the storage capacity of the public cloud can be expanded by adding containers.

In the embodiment of the application, when a user sends a file access request through a terminal, the access request is sent to the private cloud. After the private cloud obtains the access request, the storage gateway (storage-gateway) of the private cloud, according to the request content, either stores the file entity in one or more containers in the public cloud or downloads the file entity from one or more containers in the public cloud.

In a possible implementation, an embodiment of the present application provides a data storage method applied to a storage gateway of a private cloud. Fig. 2 is a schematic flowchart of the data storage method provided in an embodiment of the present application. As shown in Fig. 2, the method includes:

S101: Splitting the file to be stored into a plurality of file blocks.

The file to be stored may be, for example, an audio file, a document file, a video file, a table file, or a file in any form, and the file form that the file to be stored may include may be flexibly adjusted according to the user requirement, which is not limited to the foregoing embodiment.

In an embodiment of the present application, the file to be stored may, for example, be split into a plurality of file blocks according to a preset file block size; that is, the file is split uniformly into blocks of the preset size, and the final block, which holds the remainder, is smaller than or equal to the preset file block size.

For example, Fig. 3 is a schematic diagram of file splitting provided in an embodiment of the present application. As shown in Fig. 3, if the file to be stored is movie.mp4, its size is 15 MB, and the preset file block size is 4 MB, the file may be split into four parts according to the preset file block size: file block 1 (4 MB), file block 2 (4 MB), file block 3 (4 MB), and file block 4 (3 MB). It should be understood that the foregoing embodiment is merely an example; the preset file block size may also be 2 MB, 5 MB, or any other size, and the specific preset file block size and the size of the file to be stored may be flexibly adjusted according to the user's needs and are not limited to the foregoing embodiment.
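The splitting step can be illustrated with a minimal Python sketch. The function name `split_into_blocks` and the use of an in-memory stream as a stand-in for movie.mp4 are assumptions made for illustration only; the application does not specify an implementation.

```python
import io

def split_into_blocks(stream, block_size):
    """Yield consecutive blocks of at most block_size bytes from a binary stream."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    while True:
        block = stream.read(block_size)
        if not block:  # end of stream reached
            break
        yield block

MB = 1024 * 1024
# A 15 MB in-memory stand-in for movie.mp4 (all zero bytes; content is irrelevant here).
blocks = list(split_into_blocks(io.BytesIO(bytes(15 * MB)), 4 * MB))
block_sizes_mb = [len(b) // MB for b in blocks]
print(block_sizes_mb)  # [4, 4, 4, 3]
```

Only the final block may be shorter than the preset size, which matches the 4 MB, 4 MB, 4 MB, 3 MB split described for Fig. 3.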

S102: Generating metadata corresponding to the file to be stored according to the identification information of each file block.

In the embodiment of the present application, the identification information of a file block may be generated, for example, as follows: identification information corresponding to each file block is generated according to a preset digest algorithm.

In some possible embodiments, the preset digest algorithm may be, for example, the SHA-256 digest algorithm or the MD5 digest algorithm. Executing the digest algorithm on each file block generates the unique identification information (BlockID) of that file block, and the generated identification information of each file block is then recorded in the metadata corresponding to the file to be stored. That is, the metadata does not record the content of any file block; it records only the identification information of each file block. It should be understood that the preset digest algorithm above is only an example, and the specific preset digest algorithm may be flexibly adjusted according to the user's needs and is not limited to the embodiments described above.
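As a hedged illustration of digest-based identification, the sketch below uses Python's standard `hashlib` module. The function name `block_id` and its default algorithm are assumptions for illustration, not part of the application.

```python
import hashlib

def block_id(block: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file block, used here as its BlockID."""
    return hashlib.new(algorithm, block).hexdigest()

# The well-known SHA-256 test vector for the three bytes "abc".
example_id = block_id(b"abc")
print(example_id)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Note that a digest is deterministic, so two blocks with identical content receive the same identifier, and that uniqueness rests on the collision resistance of the chosen algorithm (MD5 is known to be weak in this respect).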

In an embodiment of the present application, a file to be stored corresponds to an identification information list, in which the identification information of the file blocks of the file to be stored is recorded in order, and the identification information list is recorded in the metadata corresponding to the file to be stored. When the file blocks are subsequently spliced, they can therefore be spliced directly according to the position of each piece of identification information in the list, so that the content of the spliced file is consistent with the content of the file that was stored. This avoids the situation in which the contents of the file blocks are known but their order is not, which would make the spliced file inconsistent with the file to be stored, that is, the file to be stored could not be restored. It should be understood that the above embodiment is merely an example; the form in which the identification information is stored in the metadata can be flexibly adjusted according to the user's needs and is not limited to the embodiment described above.
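The role of the ordered identification list can be sketched as follows. The helper `reassemble`, the `fetch_block` callback, and the dictionary standing in for block storage are all hypothetical; the digest check is an added assumption that holds only when the identifier is computed over the bytes as stored.

```python
import hashlib

def reassemble(block_ids, fetch_block):
    """Splice blocks in the order given by the identification list."""
    parts = []
    for bid in block_ids:
        data = fetch_block(bid)
        # Assumed integrity check: the stored bytes must hash to their identifier.
        if hashlib.sha256(data).hexdigest() != bid:
            raise ValueError(f"block {bid} failed its digest check")
        parts.append(data)
    return b"".join(parts)

original = b"abcdefghij"
pieces = [original[i:i + 4] for i in range(0, len(original), 4)]
ids = [hashlib.sha256(p).hexdigest() for p in pieces]
store = dict(zip(ids, pieces))  # block storage keyed by identifier; carries no order

restored = reassemble(ids, store.__getitem__)
shuffled = reassemble(list(reversed(ids)), store.__getitem__)
```

Here `restored` equals the original bytes, while `shuffled` does not, which is why the order of the list in the metadata matters.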

S103: Generating and storing a metadata file corresponding to the file to be stored according to the metadata and other file information.

In some possible embodiments, the other file information may be, for example, the file name, file type (for example, audio file, video file, document file, or table file), file size, file creator, and last modification time of the file to be stored. It should be understood that these are merely examples, and the content included in the other file information may be flexibly adjusted according to the user's needs and is not limited to the content provided in the above embodiments.

In the embodiment of the application, the metadata and the other file information may be recorded, for example, in an Extensible Markup Language (XML) file or a JavaScript Object Notation (JSON) file, and the file in which they are recorded is taken as the metadata file corresponding to the file to be stored. From the metadata file, both the information on the file blocks of the file to be stored and the related file information of the file to be stored can be determined.

In an embodiment of the present application, to facilitate subsequent lookup of the metadata file, preset identification information corresponding to the generated metadata file also needs to be generated according to a preset digest algorithm. The preset digest algorithm may again be, for example, the SHA-256 digest algorithm; that is, the SHA-256 digest algorithm is executed on the generated metadata file to generate the unique preset identification information corresponding to the metadata file.

The preset identification information may be, for example, a preset identifier (ID). When a user subsequently downloads the file to be stored, the uniquely corresponding metadata file can be looked up directly in the database according to the ID of the metadata file (file-ID), the information related to the file to be stored is obtained from that metadata file, and the file to be stored is downloaded according to that information.
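The lookup by file-ID can be sketched as below. The in-memory dictionary `metadata_db` stands in for the database, and the function names and JSON keys are hypothetical; none of them come from the application.

```python
import hashlib
import json

# Stand-in for the private cloud's database: file-ID -> metadata file text.
metadata_db = {}

def metadata_file_id(metadata_text: str) -> str:
    """Derive the file-ID by running SHA-256 over the metadata file content."""
    return hashlib.sha256(metadata_text.encode("utf-8")).hexdigest()

def save_metadata(metadata_text: str) -> str:
    """Store the metadata file under its file-ID and return that ID."""
    fid = metadata_file_id(metadata_text)
    metadata_db[fid] = metadata_text
    return fid

def load_metadata(fid: str) -> dict:
    """Look up a metadata file by file-ID and parse it."""
    return json.loads(metadata_db[fid])

text = json.dumps({"name": "movie.mp4", "blocks": ["id-1", "id-2"]}, sort_keys=True)
fid = save_metadata(text)
found = load_metadata(fid)
```

Because the ID is derived from the content, saving the same metadata text twice yields the same file-ID.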

For example, in one possible embodiment of the present application, the data content included in one common metadata file, and the form of the data content may be as follows:

From this metadata file, it can be seen that the metadata file is generated by recording the metadata and other file information in a JSON file, that the file name of the corresponding file to be stored is movie.mp4, that its file type is mp4, that its file size is 15728640 bytes, and that it includes 4 file blocks, whose identification information, in order, is "bbb30a4afe8d4a5328552cb5fa5067751b2ac16f", "e51af207318c4809dcbd5c2e9b8e1f822d51cd5f", "4cd7ef94c9b994263c1b78479069ca5d687525f7", and "ee6ca4f5ff683c91c9681d23fd75fd873ca39c8d". It should be understood that the above metadata file is only an example, and the content of a specific metadata file and the form of that content can be adjusted according to the user's needs and the specific situation of the current file to be stored, and are not limited to the above embodiments.

In the above embodiment, the file size is expressed in bytes (B), but it should be understood that the unit may also be the kilobyte (KB), megabyte (MB), gigabyte (GB), and the like. The unit may be flexibly adjusted according to the user's needs, and the specific unit may be predefined or written explicitly in the metadata file; it is not limited to the unit used in the above embodiment.
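The metadata file listing itself does not appear in this text. The sketch below reconstructs one plausible JSON form from the values described above; the key names `name`, `type`, `size`, and `blocks` are assumptions, and only the values are taken from the description.

```python
import json

# Key names are hypothetical; the values are those given in the description.
metadata = {
    "name": "movie.mp4",
    "type": "mp4",
    "size": 15728640,  # bytes, i.e. 15 * 1024 * 1024
    "blocks": [        # order defines how the blocks are spliced
        "bbb30a4afe8d4a5328552cb5fa5067751b2ac16f",
        "e51af207318c4809dcbd5c2e9b8e1f822d51cd5f",
        "4cd7ef94c9b994263c1b78479069ca5d687525f7",
        "ee6ca4f5ff683c91c9681d23fd75fd873ca39c8d",
    ],
}
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
```

The sample identifiers are 40 hexadecimal characters long, which is the length of a SHA-1 digest; a SHA-256 digest would be 64 characters and an MD5 digest 32.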

S104: Transmitting each file block of the file to be stored to a public cloud storage for storage,

so that the user can download the file to be stored from the public cloud storage according to the metadata file.

In the embodiment of the application, the metadata file contains only a small amount of data and is therefore generally small, so the metadata file can be stored directly on the private cloud.

The file blocks corresponding to the file to be stored are generally large and place high demands on the storage performance and expandability of the cloud. In an embodiment of the present application, the file blocks are therefore not stored in the metadata, in the metadata file, or in the private cloud; they are all stored in the public cloud. The storage manner may be, for example, centralized storage or dispersed storage. In an embodiment of the present application, to ensure data security, the file blocks corresponding to the file to be stored are dispersed in the public cloud, and the file blocks belonging to the file are later identified among the file blocks stored in the public cloud according to the identification information of each file block in the metadata. It should be understood that the specific manner in which file blocks are stored on the public cloud may be flexibly adjusted according to the user's needs and is not limited to the embodiment described above.

Storing the identification information of the file blocks separately from the file blocks themselves ensures the data security of the file to be stored. In addition, because the public cloud storage space can be expanded almost without limit, capacity can be conveniently expanded when the number of stored file blocks becomes large, which improves data storage capacity.

With the data storage method provided by the application, the file to be stored is divided into a metadata file and a file entity, where the metadata file describes the identification information of the file blocks of the file to be stored and the basic information of the file to be stored. The metadata file is stored in the private cloud, and the file blocks of the file to be stored are stored in the public cloud, so that the metadata and the file entity are stored separately and the security of the data is ensured. Because the public cloud storage space is expandable, the expandability of data storage is ensured on the basis of ensuring data security, and the possibility of data loss is avoided.

Optionally, on the basis of the foregoing embodiments, the embodiments of the present application may further provide a data storage method, and an implementation process of the foregoing method is described as follows with reference to the accompanying drawings. Fig. 4 is a schematic flowchart of a data storage method according to another embodiment of the present application, and as shown in fig. 4, before S101, the method may further include:

S105: Constructing a preset logical file system.

Fig. 5 is a schematic structural diagram of a preset logical file system according to an embodiment of the present application, and as shown in fig. 5, the preset logical file system is composed of a user root directory, sub-directories, and files; the preset logic file system comprises root directories corresponding to a plurality of users, each authorized user corresponds to one root directory, and subdirectories can be included under the root directory of the user; a plurality of files can be included under each sub-directory, and each file is a file pre-uploaded by a user.

As shown in Fig. 5, the preset logical file system includes root directories corresponding to 4 authorized users: root directory 1 (/usr1) corresponding to user 1, root directory 2 (/usr2) corresponding to user 2, root directory 3 (/usr3) corresponding to user 3, and root directory 4 (/usr4) corresponding to user 4.

Root directory 1 of user 1 includes two subdirectories and one file, namely subdirectory 1 (dir1), subdirectory 2 (dir2), and file 1 (file1); that is, user 1 has uploaded file1 directly under root directory 1 in the preset logical file system and has created subdirectory 1 and subdirectory 2. Subdirectory 1 (dir1) contains no file, that is, user 1 has not uploaded any file to subdirectory 1 (dir1). Subdirectory 2 (dir2) contains two files, file 3 (file3) and file 4 (file4), that is, user 1 has uploaded file3 and file4 to subdirectory 2 (dir2) under root directory 1 in the preset logical file system.

Root directory 2 of user 2 includes three subdirectories, namely subdirectory 1 (dir1), subdirectory 2 (dir2), and subdirectory 3 (dir3); that is, user 2 has created subdirectory 1, subdirectory 2, and subdirectory 3 under root directory 2 in the preset logical file system. None of these subdirectories contains a file uploaded by user 2; that is, user 2 has not yet uploaded any file to the preset logical file system, including to any of the subdirectories corresponding to user 2.

Root directory 3 of user 3 includes no subdirectories but includes three files: file 1 (file1), file 2 (file2), and file 3 (file3). That is, user 3 has not created any subdirectory in the preset logical file system but has uploaded file1, file2, and file3 directly to root directory 3 corresponding to user 3.

The root directory 4 of the user 4 does not include sub-directories and also does not include files, that is, the user 4 does not currently create sub-directories under the root directory 4 in the preset logical file system, nor uploads files under the root directory 4 of the preset logical file system.
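The directory layout described for Fig. 5 can be modeled with a nested dictionary, as in the sketch below. The representation (dictionaries for directories, placeholder strings for the metadata files that stand in for uploaded files) and the helper `list_files` are assumptions for illustration.

```python
# Directories are dicts; files are placeholder strings standing in for metadata files.
logical_fs = {
    "/usr1": {
        "dir1": {},
        "dir2": {"file3": "<metadata>", "file4": "<metadata>"},
        "file1": "<metadata>",
    },
    "/usr2": {"dir1": {}, "dir2": {}, "dir3": {}},
    "/usr3": {"file1": "<metadata>", "file2": "<metadata>", "file3": "<metadata>"},
    "/usr4": {},
}

def list_files(tree, prefix=""):
    """Return the full paths of all files under a directory tree, in sorted order."""
    paths = []
    for name, node in sorted(tree.items()):
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(node, dict):   # a directory: recurse into it
            paths.extend(list_files(node, path))
        else:                        # a file entry
            paths.append(path)
    return paths

all_files = list_files(logical_fs)
print(all_files)
```

Users 2 and 4 contribute no paths, since their root directories contain no files.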

A subdirectory can be created by the user as needed, but only an authorized user can create subdirectories or upload files in the preset logical file system. The name of a subdirectory can be preset by the user, generated automatically by the system, or modified by the user on the basis of the original name. The logic for generating subdirectories may be, for example, to create subdirectories under the root directory for a plurality of file types according to the types of the stored files, or to create subdirectories under the root directory for a plurality of topics according to the topic of each file. It should be understood that the above embodiments are merely examples, and the specific manner and logic of creating subdirectories may be flexibly adjusted according to the user's needs and are not limited to the embodiments described above.

S106: Acquiring the file to be stored uploaded to the preset logical file system by a user.

When a user initially uploads a file to be stored, the user may send an upload instruction to the preset logical file system. The upload instruction may include the file to be stored and target directory information specified by the user. According to the target directory information in the upload instruction, the preset logical file system determines a target subdirectory from the subdirectories under the root directory of the current user and pre-stores the file to be stored under that target subdirectory. Then, after the corresponding metadata file is generated from the file to be stored, the metadata file is stored at the directory location corresponding to the file to be stored in the preset logical file system; that is, at the location in the target subdirectory where the file to be stored was originally placed, the file to be stored is replaced with its metadata file, and the file entity is stored in the public cloud storage through the storage gateway (storage-gateway) of the private cloud. In other words, after the user sends the upload instruction, the file to be stored carried in the instruction is stored in the public cloud, and the corresponding metadata file is stored in the target subdirectory under the user's root directory in the preset logical file system.

Optionally, on the basis of the foregoing embodiments, the embodiments of the present application may further provide a data storage method, and an implementation process of the foregoing method is described as follows with reference to the accompanying drawings. Fig. 6 is a schematic flowchart of a data storage method according to another embodiment of the present application, as shown in fig. 6, before S102, the method may further include:

S107: carrying out data encryption processing on each file block to obtain the encrypted file block.

In some possible embodiments, the data encryption processing may include, for example, data encryption or data obfuscation. That is, each file block is encrypted by a preset data encryption algorithm, or each file block is obfuscated by a preset data obfuscation algorithm, thereby improving the security of the file data. It should be understood that the above embodiment is only an exemplary illustration; the processing modes included in the data encryption processing may be flexibly adjusted according to user needs and are not limited to the foregoing embodiment.
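As a minimal sketch of the obfuscation branch, each block can be XORed with a keystream derived from a secret key and the block index. The application does not name a specific algorithm, so the construction below, the helper names `_keystream` and `obfuscate_block`, and the use of SHA-256 are all assumptions made for illustration. It is not a substitute for a vetted encryption algorithm.

```python
import hashlib


def _keystream(key: bytes, block_index: int, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and the block index."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        seed = key + block_index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(out[:length])


def obfuscate_block(block: bytes, key: bytes, block_index: int) -> bytes:
    """XOR the block with the keystream; applying it twice restores the block."""
    stream = _keystream(key, block_index, len(block))
    return bytes(b ^ s for b, s in zip(block, stream))
```

Because XOR is its own inverse, the same function with the same key and block index would recover the original block content on download.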

S108: determining a target container of each file block in the public cloud according to the identification information of each file block, and writing the encrypted file block into the corresponding target container.

In the embodiment of the application, the target container may be determined, for example, by performing a hash calculation on the identification information (block-id) of a file block: the identification information of the file block is converted into unique integer information according to a consistent hash algorithm, and the target container (bucket) corresponding to each file block, together with the identification information of the target container, is determined from the result of the hash calculation. The storage gateway then sends the encrypted file block content and the identification of the target container to the public cloud through a preset access interface in a unified manner, so that the public cloud writes the encrypted file block into the target container corresponding to that identification. Each file block in the target container is named with the identification information (block-id) of the file block. Once all of the file blocks of the file to be stored have been written into their target containers, the storage of the file blocks of the file to be stored is complete.
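The mapping from a block-id to a target container can be sketched with a small consistent-hash ring. This is an illustration under stated assumptions: the class name `BucketRing`, the bucket names, the number of virtual nodes, and the use of SHA-256 to turn a string into an integer are all hypothetical and are not taken from the application.

```python
import bisect
import hashlib


def _to_int(value: str) -> int:
    """Map a string to a 64-bit integer using SHA-256 (an assumed choice)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class BucketRing:
    """Consistent-hash ring over bucket identifiers with virtual nodes."""

    def __init__(self, bucket_ids, vnodes=64):
        if not bucket_ids:
            raise ValueError("at least one bucket is required")
        points = []
        for bucket_id in bucket_ids:
            for i in range(vnodes):
                points.append((_to_int(f"{bucket_id}#{i}"), bucket_id))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._buckets = [b for _, b in points]

    def bucket_for(self, block_id: str) -> str:
        """Return the first bucket clockwise from the block's hash position."""
        idx = bisect.bisect_right(self._hashes, _to_int(block_id))
        return self._buckets[idx % len(self._buckets)]


ring = BucketRing(["bucket-a", "bucket-b", "bucket-c"])
target = ring.bucket_for("example-block-id")  # one of the three bucket names
```

The storage gateway would then send the encrypted block content together with `target` to the public cloud, using the block-id as the object name inside that bucket.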

With this arrangement, when another user subsequently downloads the file, a download request can be sent to the private cloud. According to the identification information (file-id) of the metadata file in the download request, the private cloud uniquely determines, among the plurality of metadata files stored in the private cloud, the target metadata file corresponding to that identification information. The private cloud then searches the plurality of containers stored in the public cloud according to the identification information (block-id) of each file block recorded in the target metadata file, determines the containers whose names match the identification information (block-id) of each file block recorded in the target metadata file as the target containers, and reads the content of each target container. Finally, the sequential relation among the pieces of identification information is determined from the identification information list stored in the metadata file, the contents read from the target containers are spliced in that order, and the spliced file is determined to be the complete downloaded file content.
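The download path described above reduces to reading the blocks in the order recorded in the metadata file and concatenating them. The sketch below assumes a metadata structure with a `block_ids` list and a caller-supplied `fetch_block` function standing in for the public cloud read; both names are hypothetical, and any decryption of the block content is left out.

```python
from typing import Callable, Dict, List


def reassemble_file(metadata: Dict[str, List[str]],
                    fetch_block: Callable[[str], bytes]) -> bytes:
    """Fetch blocks in the order recorded in the metadata and concatenate them."""
    parts = []
    for block_id in metadata["block_ids"]:
        parts.append(fetch_block(block_id))
    return b"".join(parts)


# Usage with an in-memory dict standing in for the public cloud storage.
cloud = {"id-b": b"world", "id-a": b"hello "}
content = reassemble_file({"block_ids": ["id-a", "id-b"]}, cloud.__getitem__)
```

The order of the list, not the order in which the blocks happen to be stored, determines the reconstructed content.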

The hash algorithm selected in the present application is the consistent hash algorithm, so that changes in the number of buckets can also be supported. It should be understood that the above embodiment is only an exemplary description; the specific type of hash algorithm may be flexibly adjusted according to user needs and is not limited to the above embodiment.

The consistent hash algorithm is a special hash algorithm that, when a server is removed or added, changes the mapping between existing service requests and the servers processing them as little as possible. It offers better scalability and adapts better to rapid data growth. Because the consistent hash algorithm keeps the change in data placement to a minimum whether servers are added or removed, it can greatly reduce the cost of data movement compared with a traditional hash algorithm. In addition, when data is distributed by a consistent hash algorithm and keeps growing, some virtual nodes may come to hold a large amount of data, so that the data is unevenly distributed across the virtual nodes. In that case, the virtual nodes holding a large amount of data can be split; splitting only divides the original virtual node into two, without re-hashing and re-dividing all the data. After the virtual nodes are split, if the load of the physical servers is still unbalanced, only the storage distribution of some of the virtual nodes needs to be adjusted among the servers. In this way, the number of physical servers can be expanded dynamically as the data grows, at a cost much lower than that of a traditional hash algorithm, which redistributes all the data. Consistent hashing can therefore solve problems such as the dynamic scaling of a simple hash algorithm in a Distributed Hash Table (DHT).
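The difference in data movement can be illustrated by counting how many keys change owner when a fifth server is added, once with modulo placement and once with a consistent-hash ring. This is a sketch under assumptions: the server names, the key names, the number of virtual nodes, and the use of SHA-256 are all illustrative and not taken from the application.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Hash a string to a 64-bit integer (SHA-256 is an assumed choice)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


def modulo_owner(key: str, servers: list) -> str:
    """Traditional placement: hash modulo the number of servers."""
    return servers[h(key) % len(servers)]


def build_ring(servers: list, vnodes: int = 100):
    """Place `vnodes` virtual nodes per server on the ring, sorted by hash."""
    points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
    return [p[0] for p in points], [p[1] for p in points]


def ring_owner(ring, key: str) -> str:
    """Consistent placement: first virtual node clockwise from the key's hash."""
    hashes, owners = ring
    return owners[bisect.bisect_right(hashes, h(key)) % len(owners)]


def moved_fraction(keys, before, after) -> float:
    """Fraction of keys whose owner differs between two placement functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"block-{i}" for i in range(2000)]
old_servers = ["s1", "s2", "s3", "s4"]
new_servers = old_servers + ["s5"]
ring_old, ring_new = build_ring(old_servers), build_ring(new_servers)

modulo_moved = moved_fraction(
    keys, lambda k: modulo_owner(k, old_servers), lambda k: modulo_owner(k, new_servers))
ring_moved = moved_fraction(
    keys, lambda k: ring_owner(ring_old, k), lambda k: ring_owner(ring_new, k))
print(f"modulo: {modulo_moved:.2f} of keys moved, ring: {ring_moved:.2f} of keys moved")
```

With uniformly distributed hashes, modulo placement is expected to move about four fifths of the keys when going from four to five servers, whereas the ring is expected to move only about the share taken over by the new server, and every key that moves on the ring moves to the new server.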

By adopting the data storage method provided by the application, the metadata file and the document blocks of each file to be stored are stored separately on the private cloud and the public cloud, which improves data storage security; at the same time, the document blocks are stored on the public cloud in a scattered and encrypted manner, which further ensures data storage security. In addition, the method provided by the application supports unlimited capacity expansion by using public cloud storage space, that is, resources can be provided on demand to meet service requirements, so that data is stored more safely and reliably. Based on the data security of the private cloud and the low cost and high reliability of the public cloud, the application thus provides a way to store document content safely on the public cloud while accessing documents normally on the private cloud.

The following explains a data storage device provided in the present application with reference to the drawings, where the data storage device can execute any one of the data storage methods shown in fig. 1 to 6, and specific implementation and beneficial effects of the data storage device refer to the above descriptions, and are not described again below.

Fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present application, and as shown in fig. 7, the data storage device includes: a splitting module 201, a generating module 202 and a transmitting module 203, wherein:

a splitting module 201, configured to split a file to be stored into multiple file blocks;

a generating module 202, configured to generate metadata corresponding to a file to be stored according to the identification information of each file block; generating and storing a metadata file corresponding to a file to be stored according to the metadata and other file information;

the transmission module 203 is configured to transmit each file block of the file to be stored to the public cloud storage for storage, so that the user downloads the file to be stored on the public cloud storage according to the metadata file.

Optionally, the splitting module 201 is specifically configured to split the file to be stored into a plurality of file blocks according to a preset file block size.

Optionally, the generating module 202 is specifically configured to generate identification information corresponding to each file block according to a preset digest algorithm.

Optionally, the generating module 202 is specifically configured to generate preset identification information corresponding to the metadata file according to a preset digest algorithm.
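The work of the splitting module 201 and the generating module 202 can be sketched as follows. The sketch assumes SHA-256 as the preset digest algorithm, derives the file-id from the ordered list of block-ids, and uses the metadata field names `file_id`, `block_ids` and `size`; none of these choices is specified in the application, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split the file content into blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_id(block: bytes) -> str:
    """Identification information of a block: its digest (SHA-256 assumed)."""
    return hashlib.sha256(block).hexdigest()


def build_metadata(data: bytes, block_size: int) -> Tuple[Dict[str, object], Dict[str, bytes]]:
    """Return the metadata and a block-id -> block mapping for upload."""
    blocks = split_into_blocks(data, block_size)
    ids = [block_id(b) for b in blocks]
    metadata = {
        # Assumption: the file-id is a digest over the ordered block-ids.
        "file_id": hashlib.sha256("".join(ids).encode("ascii")).hexdigest(),
        "block_ids": ids,  # order defines how blocks are spliced on download
        "size": len(data),
    }
    return metadata, dict(zip(ids, blocks))
```

The ordered `block_ids` list is the part of the metadata that later allows the blocks to be located and spliced back together; the block contents themselves are what the transmission module 203 sends to the public cloud.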

Optionally, on the basis of the above embodiments, the embodiments of the present application may further provide a data storage device, and an implementation of the device is described as follows with reference to the accompanying drawings. Fig. 8 is a schematic structural diagram of a data storage device according to another embodiment of the present application. As shown in fig. 8, the data storage device further includes: a constructing module 204 and an obtaining module 205, wherein:

a constructing module 204, configured to construct a preset logic file system;

the obtaining module 205 is configured to obtain a file to be stored, which is uploaded to a preset logic file system by a user.

As shown in fig. 8, the apparatus further includes: a storage module 206, configured to store the metadata file in the directory location, in the preset logic file system, corresponding to the file to be stored uploaded by the user.

As shown in fig. 8, the apparatus further includes: a determination module 207, wherein:

the obtaining module 205 is specifically configured to perform data encryption processing on each file block, and obtain the file block after the data encryption processing, where the data encryption processing includes: data encryption or data obfuscation;

the determining module 207 is configured to determine a target container of each file block in the public cloud according to the identification information of each file block, and write the encrypted file block into the corresponding target container.

The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.

These modules may be one or more integrated circuits configured to implement the above methods, for example: one or more Application Specific Integrated Circuits (ASICs), one or more microprocessors, or one or more Field Programmable Gate Arrays (FPGAs). As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).

Fig. 9 is a schematic structural diagram of a data storage device according to an embodiment of the present application, where the data storage device may be integrated in a terminal device or a chip of the terminal device.

As shown in fig. 9, the data storage device includes: a processor 501, a storage medium 502, and a bus 503.

The storage medium 502 is used for storing a program, and the processor 501 calls the program stored in the storage medium 502 to execute the method embodiments corresponding to figs. 1-6. The specific implementation and technical effects are similar and are not described herein again.

Optionally, the present application also provides a program product, such as a storage medium, on which a computer program is stored; when executed by a processor, the program performs the embodiments corresponding to the above-described method.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to perform some steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

Claims

1. A data storage method, applied to a storage gateway of a private cloud, the method comprising:

splitting a file to be stored into a plurality of file blocks;

2. The method of claim 1, wherein the splitting the file to be stored into a plurality of file blocks comprises:

3. The method according to claim 1, wherein before generating the metadata corresponding to the file to be stored according to the identification information of each file block, the method further comprises:

4. The method according to claim 1, wherein after generating the metadata file corresponding to the file to be stored according to the metadata and other file information, the method further comprises:

5. The method of claim 1, wherein prior to splitting the file to be stored into the plurality of file blocks, the method further comprises:

constructing a preset logic file system;

6. The method according to claim 5, wherein after the generating the metadata corresponding to the file to be stored according to the identification information of each file block, the method further comprises:

7. The method according to claim 1, wherein before generating the metadata corresponding to the file to be stored according to the identification information of each file block, the method further comprises:

8. A data storage device, characterized in that the device comprises: a splitting module, a generating module and a transmission module, wherein:

9. A data storage device, the device comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the data storage device is operating, the processor executing the machine-readable instructions to perform the method of any of claims 1-7.

10. A storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, performs the method of any of the preceding claims 1-7.