CN110413588B - Distributed object storage method and device, computer equipment and storage medium - Google Patents

Distributed object storage method and device, computer equipment and storage medium

Info

Publication number
CN110413588B
Authority
CN
China
Prior art keywords
type
file
object storage
files
type file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910693969.4A
Other languages
Chinese (zh)
Other versions
CN110413588A (en
Inventor
张艺
张学舟
林丹
韩霜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN201910693969.4A
Publication of CN110413588A
Application granted
Publication of CN110413588B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G06F16/1794 Details of file format conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a distributed object storage method and device, computer equipment, and a storage medium. The method comprises the following steps: receiving a data upload request that carries a first type file; in response to the data upload request, converting the first type file into a second type file, wherein the difference between the number of bytes of the second type file and the number of bytes of the first type file is greater than a preset threshold; and uploading the second type file to a distributed object storage system for object storage. The method makes the distributed object storage system suitable for storing files of any size and effectively improves its flexibility and scalability.

Description

Distributed object storage method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a distributed object storage method and apparatus, a computer device, and a storage medium.
Background
Object storage is a technology widely used on the internet. Unlike file storage, it does not organize objects in a directory hierarchy. Each object sits at the same level of a flat space called a storage pool, and each element at each level has a unique identifier in the storage system, through which a user accesses a container or object. Object storage generally abandons nested folders in favor of a flat data organization, which avoids maintaining a large directory tree.
At present, object storage usually adopts a distributed storage mode. However, distributed object storage has drawbacks: it is not suitable for storing files of arbitrary size, and its flexibility and scalability are poor.
No effective solution to the above problems has been proposed so far.
Disclosure of Invention
The embodiment of the application provides a distributed object storage method and device, computer equipment and a storage medium, and aims to solve the problem that distributed object storage in the prior art is not suitable for storing files of any size.
The embodiment of the application provides a distributed object storage method, which comprises the following steps: receiving a data uploading request, wherein the data uploading request carries a first type file; responding to a data uploading request, converting the first type file into a second type file, wherein the difference value between the byte number of the second type file and the byte number of the first type file is larger than a preset threshold value; and uploading the second type file to a distributed object storage system for object storage.
In one embodiment, the data uploading request carries a plurality of first type files and metadata of each first type file in the plurality of first type files, and the number of bytes of each first type file is less than a first preset number of bytes; accordingly, converting the first type of file into a second type of file includes: acquiring metadata of each first type file in a plurality of first type files; merging the plurality of first type files into a second type file, and recording the positioning information of each first type file in the plurality of first type files in the second type file; and generating an index file according to the metadata and the positioning information of each first type file.
In one embodiment, uploading the second type of file to a distributed object storage system for object storage includes: uploading the second type file to a distributed file system of a distributed object storage system for object storage; and uploading the index file to a distributed database of a distributed object storage system for storage.
In one embodiment, the data upload request further carries a third preset byte number and metadata of the first type file, and the byte number of the first type file is greater than the second preset byte number; accordingly, converting the first type of file into a second type of file includes: dividing the first type file into a plurality of second type files according to a third preset byte number, and recording the offset of each second type file in the plurality of second type files, wherein the byte number of each second type file is the third preset byte number; and generating the index file of each second type file according to the offset of each second type file and the metadata of the first type file.
In one embodiment, uploading the second type of file to a distributed object storage system for object storage includes: uploading a plurality of second type files to a distributed file system of a distributed object storage system for object storage, and recording service metadata generated in the uploading process; uploading the index file of each second type file to a distributed database of a distributed object storage system for storage; generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file, wherein the attribute information of the second type files comprises the number of the second type files and a third preset byte number; and uploading the control object to a distributed file system of a distributed object storage system for object storage.
In one embodiment, uploading a plurality of files of the second type to a distributed file system of a distributed object storage system for object storage comprises: randomly generating a universal unique identification code; generating key values of the second type files according to the universal unique identification codes and the offset of the second type files; and uploading each second type file in the plurality of second type files, the key value of each second type file and the metadata of the first type file to a distributed object storage system for object storage.
In one embodiment, generating the control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file includes: generating target metadata according to the metadata of the first type file and the service metadata; and generating a control object according to the target metadata, the universal unique identification code and the attribute information of the second type file.
In one embodiment, uploading the plurality of second type files to a distributed file system of a distributed object storage system for object storage further comprises: generating error information when a transmission error occurs, wherein the error information carries the universal unique identification code; verifying the uploaded second type files according to the universal unique identification code to determine the second type files that failed to upload; and uploading the failed second type files to the distributed file system again for object storage.
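The verification and re-upload step can be sketched as follows. This is only an illustration under stated assumptions: the key format `<uuid>_<offset>`, the function names, and the use of a plain dictionary in place of the distributed file system are all hypothetical and are not specified by the application.

```python
from typing import Dict, List


def find_failed_keys(upload_id: str, offsets: List[int],
                     stored: Dict[str, bytes]) -> List[str]:
    """List the expected chunk keys that are missing from storage.

    The key format "<uuid>_<offset>" is an assumption for illustration.
    """
    expected = [f"{upload_id}_{offset}" for offset in offsets]
    return [key for key in expected if key not in stored]


def reupload_failed(upload_id: str, chunks: Dict[int, bytes],
                    stored: Dict[str, bytes]) -> List[str]:
    """Re-upload every chunk whose key is missing; return the retried keys.

    `chunks` maps offset -> chunk bytes; `stored` is an in-memory stand-in
    for the distributed file system (hypothetical).
    """
    failed = find_failed_keys(upload_id, sorted(chunks), stored)
    for key in failed:
        offset = int(key.rsplit("_", 1)[1])
        stored[key] = chunks[offset]
    return failed
```

Because the identifier carried in the error information selects exactly one upload, only the chunks of that upload are checked and retried.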
An embodiment of the present application further provides a distributed object storage apparatus, including: the receiving module is used for receiving a data uploading request, wherein the data uploading request carries a first type file; the conversion module is used for responding to a data uploading request and converting the first type file into a second type file, wherein the difference value between the byte number of the second type file and the byte number of the first type file is larger than a preset threshold value; and the uploading module is used for uploading the second type file to the distributed object storage system for object storage.
The embodiments of the present application further provide a computer device, which includes a processor and a memory for storing processor-executable instructions, where the processor executes the instructions to implement the steps of the distributed object storage method described in any of the above embodiments.
Embodiments of the present application further provide a computer-readable storage medium, on which computer instructions are stored, and when executed, the instructions implement the steps of the distributed object storage method described in any of the above embodiments.
In the embodiment of the application, a distributed object storage method is provided, and is used for receiving a data uploading request, converting a first type file in the data uploading request into a second type file, and uploading the generated second type file to a distributed object storage system for object storage. In the above manner, the first type file which is not suitable for being directly uploaded to the distributed object storage system for object storage is converted into the second type file which is suitable for being directly uploaded to the distributed object storage system, and then the second type file is uploaded to the distributed object storage system for object storage, so that the distributed object storage system can store files with any size, and the flexibility and the expandability of the distributed object storage system are effectively improved. By the scheme, the technical problem that the existing distributed object storage system is not suitable for storing files with any size is solved, the files with any size are stored, and the technical effects of effectively improving the flexibility and the expansibility of the system are achieved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application, are incorporated in and constitute a part of this application, and are not intended to limit the application. In the drawings:
FIG. 1 is a schematic diagram illustrating an application scenario of a distributed object storage method in an embodiment of the present application;
FIG. 2 is a flow diagram illustrating a distributed object storage method in one embodiment of the present application;
FIG. 3 is a schematic structural diagram illustrating a control object and a second type file in the distributed object storage method in an embodiment of the present application;
FIG. 4 shows a schematic diagram of a distributed object store in an embodiment of the present application;
FIG. 5 shows a schematic diagram of a computer device in an embodiment of the application.
Detailed Description
The principles and spirit of the present application will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present application, and are not intended to limit the scope of the present application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present application may be embodied as a system, apparatus, device, method or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
In view of the fact that the existing distributed object storage is not suitable for storing files with any size, the inventor finds through research that the distributed object storage system can store files with any size by converting a first type of file which is not suitable for being directly uploaded to the distributed object storage system for object storage into a second type of file which is suitable for being directly uploaded to the distributed object storage system for object storage, and then uploading the second type of file to the distributed object storage system for object storage.
Based on this, an embodiment of the present application provides a distributed object storage method. FIG. 1 is a schematic diagram of an application scenario of the distributed object storage method provided in an embodiment of the present application. As shown in FIG. 1, a client sends a data upload request to a server, where the data upload request carries a first type file to be uploaded. After receiving the data upload request, the server converts the first type file in the request into a second type file and uploads the generated second type file to a distributed object storage system for object storage.
The client may be a desktop computer, a notebook, a mobile phone terminal, a PDA, or the like, as long as it is a device capable of displaying content to a user or service personnel and sending a data upload request to the server; the form of the client is not limited in the present application. The server may be a single server, a server cluster, or a cloud server; its specific composition is likewise not limited in the present application. The distributed object storage system may be a Hadoop object storage system or another distributed object storage system.
The difference between the number of bytes of the first type file and the number of bytes of the second type file is greater than a preset threshold; that is, the two file sizes differ substantially. The second type file may be a file suitable for direct upload to the distributed object storage system. The first type files may be small files. Storing each small file separately in the distributed file system HDFS wastes a large amount of hardware resources, so a plurality of first type files may be merged into one second type file. The first type file may also be a very large file that the distributed object storage system is unable to store as is. In that case, the first type file can be converted into a plurality of second type files, which are then uploaded to the distributed object storage system for object storage.
FIG. 2 is a flowchart illustrating a distributed object storage method according to an embodiment of the present application. Although the present application provides method steps or apparatus structures as shown in the following embodiments or figures, the method or apparatus may include more or fewer steps or modules based on conventional or non-inventive effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to what is described in the embodiments or shown in the drawings. When applied in an actual device or end product, the method or module structure may be executed sequentially or in parallel according to the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even in a distributed processing environment).
Specifically, as shown in fig. 2, a distributed object storage method provided in an embodiment of the present application may include the following steps:
Step S201, a data upload request is received, where the data upload request carries a first type file.
The server can receive a data upload request sent by the client. The data upload request carries a first type file. The first type file may be a file that is not suitable for direct upload to the distributed object storage system.
Step S202, in response to the data upload request, the first type file is converted into a second type file, where the difference between the number of bytes of the second type file and the number of bytes of the first type file is greater than a preset threshold.
After receiving the data upload request, the server may, in response, convert the first type file into a second type file suitable for direct upload to the distributed object storage system. The difference between the number of bytes of the first type file and that of the second type file is greater than a preset threshold; that is, the two file sizes differ substantially.
Step S203, the second type file is uploaded to a distributed object storage system for object storage.
After the first type file is converted into the second type file, the second type file may be uploaded to the distributed object storage system for object storage.
In the above manner, the first type file which is not suitable for being directly uploaded to the distributed object storage system for object storage is converted into the second type file which is suitable for being directly uploaded to the distributed object storage system, and then the second type file is uploaded to the distributed object storage system for object storage, so that the distributed object storage system can store files with any size, and the flexibility and the expandability of the distributed object storage system are effectively improved. By the scheme, the technical problem that the existing distributed object storage system is not suitable for storing files with any size is solved, the files with any size are stored, and the technical effects of effectively improving the flexibility and the expansibility of the system are achieved.
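As a rough illustration of how a server might decide which conversion applies in step S202, the sketch below classifies an upload request by file size. The threshold values, the function name, and the "direct" fallback path are assumptions made for this example; the application leaves the preset byte numbers to system parameters and actual requirements.

```python
from typing import List

# Hypothetical thresholds; the application does not fix their values.
SMALL_FILE_LIMIT = 1 * 1024 * 1024       # "first preset byte number" (assumed)
LARGE_FILE_LIMIT = 512 * 1024 * 1024     # "second preset byte number" (assumed)


def choose_conversion(file_sizes: List[int]) -> str:
    """Return the conversion path for the files carried by one upload request."""
    if not file_sizes:
        raise ValueError("upload request carries no files")
    if len(file_sizes) > 1 and all(s < SMALL_FILE_LIMIT for s in file_sizes):
        return "merge"    # several small files -> one second type file
    if len(file_sizes) == 1 and file_sizes[0] > LARGE_FILE_LIMIT:
        return "split"    # one very large file -> several second type files
    return "direct"       # assumed fallback: upload without conversion
```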
In some embodiments of the present application, the data upload request may carry a plurality of first type files and metadata of each of the plurality of first type files, where a byte count of each of the first type files is smaller than a first preset byte count. Accordingly, converting the first type of file to the second type of file may include: acquiring metadata of each first type file in a plurality of first type files; merging the plurality of first type files into a second type file, and recording the positioning information of each first type file in the plurality of first type files in the second type file; and generating an index file according to the metadata and the positioning information of each first type file.
The first preset byte number can be determined according to system parameters and actual requirements. When the number of bytes of a first type file is smaller than the first preset byte number, the first type file is determined to be a small file. When the first type files are small files, a plurality of first type files may be merged into one second type file, which is then uploaded to the distributed object storage system for object storage. The data upload request carries the plurality of first type files and the metadata of each of them. Metadata is parameter information about a file; for example, it may include the file's creation time, file size, and the like. While the first type files are merged into the second type file, the positioning information of each first type file within the second type file is recorded. The positioning information may be, for example, the offset of the first type file in the second type file. An index file is then generated according to the metadata and the positioning information of each first type file. In this way, small files are merged into a second type file suitable for direct upload to the distributed object storage system before being stored, which saves hardware resources and improves resource utilization.
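The merging step can be sketched as below: small files are concatenated into one blob while an index records where each one lands. The `IndexEntry` structure and the choice to record a length alongside the offset are assumptions for illustration; the application only states that positioning information such as the offset is recorded.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IndexEntry:
    offset: int      # where the small file starts inside the merged file
    length: int      # number of bytes it occupies (assumed extra field)
    metadata: dict   # metadata carried in the upload request


def merge_small_files(
    files: List[Tuple[str, bytes, dict]],
) -> Tuple[bytes, Dict[str, IndexEntry]]:
    """Merge (name, content, metadata) triples into one blob plus an index."""
    merged = bytearray()
    index: Dict[str, IndexEntry] = {}
    for name, content, metadata in files:
        # The current length of the blob is the offset of the next file.
        index[name] = IndexEntry(len(merged), len(content), metadata)
        merged.extend(content)
    return bytes(merged), index
```

Recording the length as well as the offset lets a reader slice out one small file without consulting its neighbors in the index.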
Further, in some embodiments of the present application, uploading the second type file to the distributed object storage system for object storage may include: uploading the second type file to a distributed file system of a distributed object storage system for object storage; and uploading the index file to a distributed database of a distributed object storage system for storage.
Specifically, the distributed object storage system may include a distributed file system and a distributed database. The distributed file system may include HDFS (Hadoop Distributed File System). HDFS is a highly fault-tolerant distributed file system that is suitable for deployment on inexpensive machines, provides high-throughput data access, and is well suited to storing files in large data sets. The distributed database may include HBase (Hadoop Database). HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system that supports real-time random access at very large scale. The plurality of small first type files are merged into a second type file, which is uploaded to the distributed file system for object storage. The index file formed from the metadata and positioning information of the first type files is stored in the distributed database. To read a first type file, the information in the index file is first read from HBase, the first type file is then read from the corresponding position in HDFS according to that information, and the file is returned to the user. In this way, storage space is saved and read time is reduced.
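The two-step read path (index lookup, then positioned read) might look like the sketch below. The dictionaries are in-memory stand-ins, and the paths, file names, and tuple layout are hypothetical; a real deployment would query HBase for the index row and open the merged file on HDFS.

```python
from typing import Dict, Tuple

# In-memory stand-ins (assumptions for illustration only).
index_table: Dict[str, Tuple[str, int, int]] = {
    # small file name -> (merged file path, offset, length)
    "a.txt": ("/merged/0001", 0, 5),
    "b.txt": ("/merged/0001", 5, 3),
}
file_system: Dict[str, bytes] = {"/merged/0001": b"helloxyz"}


def read_small_file(name: str) -> bytes:
    """Look up the index first, then slice the merged file at that position."""
    merged_path, offset, length = index_table[name]
    merged = file_system[merged_path]
    return merged[offset:offset + length]
```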
In some embodiments of the present application, the data upload request may further carry a third preset number of bytes and metadata of the first type file, where the number of bytes of the first type file is greater than the second preset number of bytes. Accordingly, converting the first type of file to the second type of file may include: dividing the first type file into a plurality of second type files according to a third preset byte number, and recording the offset of each second type file in the plurality of second type files, wherein the byte number of each second type file is the third preset byte number; and generating the index file of each second type file according to the offset of each second type file and the metadata of the first type file.
The second preset byte number can be determined according to system parameters and actual requirements. When the number of bytes of the first type file is greater than the second preset byte number, the first type file is determined to be a very large file. In that case, the data upload request includes the first type file, the metadata of the first type file, and the third preset byte number. Metadata is parameter information about a file; for example, it may include the file's creation time, file size, and the like. The third preset byte number is smaller than the second preset byte number and is a file size suitable for direct upload to the distributed object storage system for object storage. The first type file can be divided into a plurality of second type files according to the third preset byte number, and the offset of each second type file is recorded. The number of bytes of each second type file is the third preset byte number. An index file for each second type file may be generated according to the offset of that second type file and the metadata of the first type file. In this way, a very large file is cut into a plurality of second type files suitable for direct upload to the distributed object storage system, and index files are generated for them, so that the distributed object storage system can store very large files.
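A minimal sketch of the splitting step follows. The function names and the index record layout are hypothetical. Note one assumption that goes beyond the text: when the file size is not a multiple of the chunk size, the last chunk here is simply shorter than the third preset byte number.

```python
from typing import Iterator, List, Tuple


def split_file(data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs of at most chunk_size bytes each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]


def build_chunk_index(data: bytes, chunk_size: int,
                      file_metadata: dict) -> List[dict]:
    """Build one index record per chunk from its offset and the file metadata."""
    return [
        {"offset": offset, "length": len(chunk), "file_metadata": file_metadata}
        for offset, chunk in split_file(data, chunk_size)
    ]
```

In practice the input would be read as a stream rather than held in memory; bytes are used here only to keep the example short.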
Further, in some embodiments of the present application, uploading the second type file to the distributed object storage system for object storage may include: uploading a plurality of second type files to a distributed file system of a distributed object storage system for object storage, and recording service metadata generated in the uploading process; uploading the index file of each second type file to a distributed database of a distributed object storage system for storage; generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file, wherein the attribute information of the second type files comprises the number of the second type files and a third preset byte number; and uploading the control object to a distributed file system of a distributed object storage system for object storage.
Specifically, the distributed object storage system may include a distributed file system and a distributed database. The distributed file system may include HDFS. HDFS is a highly fault-tolerant distributed file system that is suitable for deployment on inexpensive machines, provides high-throughput data access, and is well suited to storing files in large data sets. The distributed database may include HBase. HBase is a highly reliable, high-performance, column-oriented, scalable distributed database that supports real-time random access at very large scale. The very large first type file is divided into a plurality of second type files, which are uploaded to the distributed file system for object storage. The index files formed from the metadata and the offsets of the second type files are stored in the distributed database. The service metadata generated while the second type files are uploaded to the distributed object storage system is recorded. The service metadata may include information accompanying the upload, for example, the uploading author and the uploading organization. The control object may be generated according to the metadata of the first type file, the service metadata, and the attribute information of the second type files. The attribute information of the second type files includes the total number of second type files and the third preset byte number (that is, the size of a second type file). The control object is uploaded to the distributed file system of the distributed object storage system for object storage. To read the first type file, the control object is first read from the distributed file system, and the information of all the second type files is listed according to the control object. The second type files that contain the content to be downloaded are then determined according to the start and end positions in the download request, those second type files are obtained from the distributed file system, and they are assembled into an input stream. The input stream is returned to the client, which can read the data from it and save the data locally. In this way, very large files can be uploaded and downloaded, the flexibility and scalability of the distributed storage system are effectively improved, and read time is reduced.
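Mapping a requested byte range onto the stored chunks can be sketched as follows, using the chunk count and chunk size that the control object records. The function names are hypothetical, and treating the end position as exclusive is an assumption; the application does not say whether it is inclusive.

```python
from typing import List, Tuple


def chunks_for_range(start: int, end: int, chunk_size: int,
                     chunk_count: int) -> List[Tuple[int, int, int]]:
    """Return (chunk_number, lo, hi) slices covering bytes [start, end)."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    first = start // chunk_size
    last = min((end - 1) // chunk_size, chunk_count - 1)
    plan = []
    for i in range(first, last + 1):
        chunk_start = i * chunk_size
        lo = max(start, chunk_start) - chunk_start
        hi = min(end, chunk_start + chunk_size) - chunk_start
        plan.append((i, lo, hi))
    return plan


def read_range(chunks: List[bytes], start: int, end: int,
               chunk_size: int) -> bytes:
    """Assemble the requested bytes from the chunks named in the plan."""
    plan = chunks_for_range(start, end, chunk_size, len(chunks))
    return b"".join(chunks[i][lo:hi] for i, lo, hi in plan)
```

Only the chunks that overlap the requested range are fetched, which is what allows a partial download without reading the whole file.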
Further, in some embodiments of the present application, uploading a plurality of files of the second type to a distributed file system of a distributed object storage system for object storage may include: randomly generating a universal unique identification code; generating key values of the second type files according to the universal unique identification codes and the offset of the second type files; and uploading each second type file in the plurality of second type files, the key value of each second type file and the metadata of the first type file to a distributed object storage system for object storage.
When a plurality of second type files are uploaded to the distributed object storage system for object storage, each second type file is stored together with its key value and metadata. The metadata of a second type file may be the metadata of the first type file. The key value of each second type file can be determined according to its offset. Illustratively, a universally unique identifier (UUID) is randomly generated, and the key value of each second type file is then generated according to the UUID and the offset of that second type file. In this way, the key values of the second type files can be generated conveniently and quickly, and the second type files can be stored as objects.
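Illustratively, key generation from a UUID and the offsets can be sketched as follows. The text only states that the key value is generated according to the UUID and the offset; the concatenation format `<uuid>_<offset>` and the function name `make_chunk_keys` are assumptions made for illustration.

```python
import uuid
from typing import List, Optional, Tuple


def make_chunk_keys(offsets: List[int], file_uuid: Optional[str] = None) -> Tuple[str, List[str]]:
    """Generate one key per chunk from a shared UUID and each chunk's offset."""
    # A random (version 4) UUID is generated once per first type file.
    file_uuid = file_uuid or uuid.uuid4().hex
    # Assumed key format: UUID and offset joined by an underscore.
    keys = [f"{file_uuid}_{offset}" for offset in offsets]
    return file_uuid, keys
```

Because all chunks of one first type file share the same UUID, the keys of every chunk can be recomputed later from the UUID, the chunk size, and the chunk count alone.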
Further, in some embodiments of the present application, generating a control object according to the metadata of the first type file, the service metadata, and the attribute information of each second type file may include: generating target metadata according to the metadata of the first type file and the service metadata; and generating a control object according to the target metadata, the universal unique identification code and the attribute information of the second type file.
Specifically, the target metadata may be generated according to the metadata of the first type file and the service metadata generated in the uploading process, and the control object may then be generated according to the target metadata, the universal unique identification code, and the attribute information of the second type files. After the control object is generated, it is uploaded to the distributed file system for object storage. In this way, a control object that stores the related information of each second type file can be generated, which facilitates subsequent reading.
As shown in fig. 3, the key of the control object is equal to the key of the first type file. The control object stores the metadata, the UUID, and the attribute information of each second type file. The key of each second type file in the plurality of second type files is formed from the UUID and the offset scope_i of that second type file.
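Illustratively, a control object of this kind can be sketched as a small record. This is a sketch under stated assumptions: the field names, the merging of the two metadata dictionaries into the target metadata, and the key format `<uuid>_<offset>` are hypothetical and are not taken from fig. 3.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ControlObject:
    key: str                    # equal to the key of the first type file
    target_metadata: Dict       # file metadata combined with service metadata
    file_uuid: str              # UUID shared by all chunks of the file
    chunk_count: int            # total number of second type files
    chunk_size: int             # the third preset number of bytes

    def chunk_keys(self) -> List[str]:
        """List the keys of all chunks (assumed format: <uuid>_<offset>)."""
        return [f"{self.file_uuid}_{i * self.chunk_size}" for i in range(self.chunk_count)]


def build_control_object(file_key: str, file_metadata: Dict, service_metadata: Dict,
                         file_uuid: str, chunk_count: int, chunk_size: int) -> ControlObject:
    # Assumption: the target metadata is a plain merge of the two dictionaries.
    target = {**file_metadata, **service_metadata}
    return ControlObject(file_key, target, file_uuid, chunk_count, chunk_size)
```

A reader that holds only the key of the first type file can fetch the control object under that key and then enumerate every chunk key from it.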
In the case that the first type file is an oversized file, the first type file is divided into a plurality of second type files, and errors may occur while the plurality of second type files are being uploaded. When an upload error occurs, the server can generate error information that carries the UUID; the uploaded second type files are verified according to the UUID so that the second type files that failed to upload can be found and uploaded again. Therefore, in some embodiments of the present application, uploading a plurality of second type files to a distributed file system of a distributed object storage system for object storage may further include: generating error information when an upload error occurs, wherein the error information carries the universal unique identification code; verifying the uploaded second type files according to the universal unique identification code to determine the second type files that failed to upload; and uploading the second type files that failed to upload to the distributed file system again for object storage.
Error information is generated when an error occurs while the plurality of second type files are being uploaded. The error information carries the UUID. The uploaded second type files are checked according to the UUID to find the second type files that failed to upload. The second type files that failed to upload are uploaded to the distributed file system again for object storage; after this is finished, all the second type files are checked, and once they are confirmed to be correct, a completion command is sent, the metadata is updated, and the control object is updated. For example, the server may disallow using the breakpoint resume interface in place of the ordinary upload interface, that is, calling the breakpoint resume interface directly when no upload exception has occurred. However, when the breakpoint resume interface scans and finds that all the second type files have in fact been uploaded successfully, this is not treated as an error; the interface merely skips the second type file upload flow and reflects this in its Boolean return value. In this way, breakpoint resume can be supported, and the accuracy and efficiency of large file uploading are improved.
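Illustratively, the breakpoint resume flow can be sketched as follows. This is a minimal sketch, assuming that the in-memory dictionary `store` stands in for the distributed file system, that keys have the hypothetical format `<uuid>_<offset>`, and that verification only checks whether a key is present; a real check would likely also compare sizes or checksums. The meaning given to the Boolean return value (True if anything was re-uploaded) is likewise an assumption.

```python
from typing import Dict, List


def find_missing_offsets(store: Dict[str, bytes], file_uuid: str,
                         chunk_size: int, total_chunks: int) -> List[int]:
    """Return the offsets of the chunks that are not yet present in the store."""
    return [i * chunk_size for i in range(total_chunks)
            if f"{file_uuid}_{i * chunk_size}" not in store]


def resume_upload(store: Dict[str, bytes], file_uuid: str,
                  data: bytes, chunk_size: int) -> bool:
    """Re-upload only the missing chunks; return False if nothing had to be uploaded."""
    total_chunks = (len(data) + chunk_size - 1) // chunk_size  # ceiling division
    missing = find_missing_offsets(store, file_uuid, chunk_size, total_chunks)
    for offset in missing:
        store[f"{file_uuid}_{offset}"] = data[offset:offset + chunk_size]
    # All chunks already present is not an error; it is only reported via the result.
    return bool(missing)
```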
The above method is described below with reference to two specific examples, however, it should be noted that the specific examples are only for better illustrating the present application and should not be construed as limiting the present application.
In one embodiment, the distributed object storage method comprises the following steps:
step 1, a client sends a data uploading request to a server, wherein the data uploading request carries a plurality of first type files and metadata of each first type file, and each first type file is a small file, i.e., its number of bytes is smaller than a first preset number of bytes;
step 2, the server responds to the data uploading request, obtains metadata of each first type file, merges the first type files into a second type file, records positioning information of each first type file in the first type files in the second type file, and generates an index file according to the metadata and the positioning information of each first type file;
step 3, the server uploads the second type file to a distributed file system of the distributed object storage system for object storage, and uploads the index file to a distributed database of the distributed object storage system for storage.
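Illustratively, steps 2 and 3 of the small-file embodiment above can be sketched as follows. This is a sketch under stated assumptions: the positioning information is represented as an offset and a length within the merged file, and the function names and index layout are hypothetical.

```python
from typing import Dict, Tuple


def merge_small_files(files: Dict[str, bytes],
                      metadata: Dict[str, dict]) -> Tuple[bytes, Dict[str, dict]]:
    """Concatenate small files and record where each one sits in the merged file."""
    merged = bytearray()
    index: Dict[str, dict] = {}
    for name, content in files.items():
        index[name] = {
            "offset": len(merged),            # start position inside the merged file
            "length": len(content),           # number of bytes belonging to this file
            "metadata": metadata.get(name, {}),
        }
        merged.extend(content)
    return bytes(merged), index


def read_small_file(merged: bytes, index: Dict[str, dict], name: str) -> bytes:
    """Recover one small file from the merged file using the index entry."""
    entry = index[name]
    return merged[entry["offset"]:entry["offset"] + entry["length"]]
```

In this sketch the merged bytes would go to the distributed file system and the index to the distributed database, so that a single small file can be located without scanning the merged file.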
In another embodiment, a distributed object storage method comprises the steps of:
step 1, a client sends a data uploading request to a server, wherein the data uploading request carries a first type file, metadata of the first type file and a third preset byte number, and the first type file is an oversized file, i.e., its byte number is larger than a second preset byte number;
step 2, the server responds to the received request, divides the first type file into a plurality of second type files according to a third preset byte number, records the offset of each second type file in the plurality of second type files, and generates an index file of each second type file according to the offset of each second type file and the metadata of the first type file;
step 3, the server uploads the plurality of second type files to a distributed file system of the distributed object storage system for object storage and records service metadata generated in the uploading process; specifically, the server randomly generates a universal unique identification code, generates a key value of each second type file according to the universal unique identification code and the offset of that second type file, and uploads each second type file in the plurality of second type files, the key value of each second type file and the metadata of the first type file to the distributed object storage system for object storage;
step 4, the server uploads the index file of each second type file to a distributed database of the distributed object storage system for storage;
step 5, the server generates target metadata according to the metadata of the first type file and the service metadata, and generates a control object according to the target metadata, the universal unique identification code and the attribute information of the second type files, wherein the attribute information of the second type files comprises the size of each second type file and the total number of the second type files;
step 6, the server uploads the control object to a distributed file system of the distributed object storage system for object storage.
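Illustratively, step 2 of the oversized-file embodiment above (dividing the file and recording offsets) can be sketched as follows. This is a minimal sketch: the function names and the index layout are hypothetical, and the final chunk is assumed to be allowed to be shorter than the third preset byte number when the file size is not an exact multiple of it.

```python
from typing import Dict, List, Tuple


def split_file(data: bytes, chunk_size: int) -> List[Tuple[int, bytes]]:
    """Divide the file into fixed-size chunks, returning (offset, chunk) pairs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(offset, data[offset:offset + chunk_size])
            for offset in range(0, len(data), chunk_size)]


def build_index(chunks: List[Tuple[int, bytes]], file_metadata: Dict) -> List[Dict]:
    """Build one index entry per chunk from its offset and the file's metadata."""
    return [{"offset": offset, "length": len(chunk), "metadata": file_metadata}
            for offset, chunk in chunks]
```

Concatenating the chunks in offset order reproduces the original first type file, which is what allows a later download to reassemble it.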
The distributed object storage methods in the two embodiments respectively convert the oversized file and the small file which are not suitable for being directly uploaded to the distributed object storage system for object storage into the second type file which is suitable for being directly uploaded to the distributed object storage system for object storage, and then upload and store the second type file, so that the distributed object storage system can store files with any size, and the flexibility and the expandability of the distributed object storage system are effectively improved. By the scheme, the technical problem that the existing distributed object storage system is not suitable for storing files with any size is solved, the files with any size are stored, and the technical effects of effectively improving the flexibility and the expansibility of the system are achieved.
Based on the same inventive concept, embodiments of the present application further provide a distributed object storage apparatus, as described in the following embodiments. Because the principle of the distributed object storage device for solving the problem is similar to that of the distributed object storage method, the implementation of the distributed object storage device can refer to the implementation of the distributed object storage method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated. Fig. 4 is a block diagram of a distributed object storage apparatus according to an embodiment of the present application, and as shown in fig. 4, the distributed object storage apparatus includes: a receiving module 401, a converting module 402 and an uploading module 403, the structure of which will be described below.
The receiving module 401 is configured to receive a data upload request, where the data upload request carries a first type file.
The conversion module 402 is configured to convert the first type file into a second type file in response to the data upload request, where a difference between the number of bytes of the second type file and the number of bytes of the first type file is greater than a preset threshold.
The uploading module 403 is configured to upload the second type file to the distributed object storage system for object storage.
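Illustratively, the decision that the conversion module makes between merging and dividing can be sketched as follows. This is a sketch under stated assumptions: the function name and the returned labels are hypothetical, and the "direct" branch for files whose size lies between the two preset byte numbers is an assumption that the text does not state explicitly.

```python
def choose_conversion(size: int, first_preset: int, second_preset: int) -> str:
    """Pick a conversion strategy from the file size and the two preset byte numbers."""
    if first_preset > second_preset:
        raise ValueError("first_preset must not exceed second_preset")
    if size < first_preset:
        return "merge"    # small file: merge with other small files
    if size > second_preset:
        return "split"    # oversized file: divide into fixed-size chunks
    return "direct"       # assumed: sizes in between are uploaded as they are
```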
In some embodiments of the present application, the data upload request may carry a plurality of first type files and metadata of each of the plurality of first type files, where the number of bytes of the first type file is less than a first preset number of bytes; accordingly, the conversion module may be specifically configured to: acquiring metadata of each first type file in a plurality of first type files; merging the plurality of first type files into a second type file, and recording the positioning information of each first type file in the plurality of first type files in the second type file; and generating an index file according to the metadata and the positioning information of each first type file.
In some embodiments of the present application, the upload module may be specifically configured to: uploading the second type file to a distributed file system of a distributed object storage system for object storage; and uploading the index file to a distributed database of a distributed object storage system for storage.
In some embodiments of the present application, the data upload request may further carry a third preset number of bytes and metadata of the first type file, where the number of bytes of the first type file is greater than the second preset number of bytes; accordingly, the conversion module may be specifically configured to: dividing the first type file into a plurality of second type files according to a third preset byte number, and recording the offset of each second type file in the plurality of second type files, wherein the byte number of each second type file is the third preset byte number; and generating the index file of each second type file according to the offset of each second type file and the metadata of the first type file.
In some embodiments of the present application, the upload module may be specifically configured to: uploading a plurality of second type files to a distributed file system of a distributed object storage system for object storage, and recording service metadata generated in the uploading process; uploading the index file of each second type file to a distributed database of a distributed object storage system for storage; generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file, wherein the attribute information of the second type files comprises the number of the second type files and a third preset byte number; and uploading the control object to a distributed file system of a distributed object storage system for object storage.
In some embodiments of the present application, uploading a plurality of files of the second type to a distributed file system of a distributed object storage system for object storage may include: randomly generating a universal unique identification code; generating key values of the second type files according to the universal unique identification codes and the offset of the second type files; and uploading each second type file in the plurality of second type files, the key value of each second type file and the metadata of the first type file to a distributed object storage system for object storage.
In some embodiments of the present application, generating a control object according to the metadata of the first type file, the service metadata, and the attribute information of each second type file may include: generating target metadata according to the metadata of the first type file and the service metadata; and generating a control object according to the target metadata, the universal unique identification code and the attribute information of the second type file.
In some embodiments of the present application, uploading a plurality of second type files to a distributed file system of a distributed object storage system for object storage may further include: generating error information when an upload error occurs, wherein the error information carries the universal unique identification code; verifying the uploaded second type files according to the universal unique identification code to determine the second type files that failed to upload; and uploading the second type files that failed to upload to the distributed file system again for object storage.
From the above description, it can be seen that the embodiments of the present application achieve the following technical effects: the first type file which is not suitable for being directly uploaded to the distributed object storage system for object storage is converted into the second type file which is suitable for being directly uploaded to the distributed object storage system for object storage, and then the second type file is uploaded to the distributed object storage system for object storage, so that the distributed object storage system can store files with any size, and the flexibility and the expandability of the distributed object storage system are effectively improved. By the scheme, the technical problem that the existing distributed object storage system is not suitable for storing files with any size is solved, and the technical effects of storing the files with any size and effectively improving the flexibility and the expansibility of the system are achieved.
The embodiment of the present application further provides a computer device. Fig. 5 is a schematic diagram of the composition of a computer device based on the distributed object storage method provided in the embodiment of the present application; as shown in fig. 5, the computer device may specifically include an input device 51, a processor 52, and a memory 53. The memory 53 is configured to store processor-executable instructions. The processor 52, when executing the instructions, performs the steps of the distributed object storage method described in any of the embodiments above.
In this embodiment, the input device may be one of the main apparatuses for information exchange between a user and a computer system. The input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting input board, a voice input device, etc.; the input device is used to input raw data and a program for processing the data into the computer. The input device can also acquire and receive data transmitted by other modules, units and devices. The processor may be implemented in any suitable way. For example, the processor may take the form of a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The memory may in particular be a memory device used in modern information technology for storing information. The memory may include multiple levels: in a digital system, anything that can store binary data may be a memory; in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.
In this embodiment, the functions and effects specifically realized by the computer device can be understood with reference to the other embodiments, and are not described herein again.
The present application further provides a computer storage medium based on a distributed object storage method, where the computer storage medium stores computer program instructions, and the computer program instructions, when executed, implement the steps of the distributed object storage method in any of the above embodiments.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be understood with reference to the other embodiments, and are not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the present application described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the application should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the pending claims along with the full scope of equivalents to which such claims are entitled.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and it will be apparent to those skilled in the art that various modifications and variations can be made in the embodiment of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (8)

1. A distributed object storage method, comprising:
receiving a data uploading request, wherein the data uploading request carries a first type file;
responding to the data uploading request, and converting the first type file into a second type file, wherein the difference value between the byte number of the second type file and the byte number of the first type file is larger than a preset threshold value;
uploading the second type file to a distributed object storage system for object storage;
under the condition that the byte number of the first type file is larger than a second preset byte number, the data uploading request also carries a third preset byte number and metadata of the first type file;
accordingly, converting the first type of file into a second type of file comprises:
dividing the first type file into a plurality of second type files according to the third preset byte number, and recording the offset of each second type file in the plurality of second type files, wherein the byte number of the second type files is the third preset byte number;
generating index files of the second type files according to the offset of the second type files and the metadata of the first type files;
wherein uploading the second type file to a distributed object storage system for object storage comprises:
uploading the plurality of second type files to a distributed file system of the distributed object storage system for object storage, and recording service metadata generated in the uploading process;
uploading the index file of each second type file to a distributed database of the distributed object storage system for storage;
generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file, wherein the attribute information of the second type files comprises the number of the second type files and the third preset byte number;
uploading the control object to a distributed file system of the distributed object storage system for object storage;
wherein generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file comprises:
determining the key value of the first type file as the key value of the control object;
generating target metadata according to the metadata of the first type file and the service metadata;
storing the target metadata, the universal unique identification code and the attribute information of the second type file into the control object; the universally unique identification code is randomly generated.
2. The method according to claim 1, wherein the data upload request carries a plurality of first type files and metadata of each of the plurality of first type files, when the number of bytes of the first type file is smaller than a first preset number of bytes;
accordingly, converting the first type of file into a second type of file comprises:
acquiring metadata of each first type file in the plurality of first type files;
merging the first type files into a second type file, and recording the positioning information of each first type file in the first type files in the second type file;
and generating an index file according to the metadata and the positioning information of each first type file.
3. The method of claim 2, wherein uploading the second type of file to a distributed object storage system for object storage comprises:
uploading the second type file to a distributed file system of the distributed object storage system for object storage;
and uploading the index file to a distributed database of the distributed object storage system for storage.
4. The method of claim 1, wherein uploading the plurality of files of the second type to a distributed file system of the distributed object storage system for object storage comprises:
generating key values of the second type files according to the universal unique identification codes and the offset of the second type files;
and uploading each second type file in the plurality of second type files, the key value of each second type file and the metadata of the first type file to the distributed object storage system for object storage.
5. The method of claim 4, wherein uploading the plurality of files of the second type to a distributed file system of the distributed object storage system for object storage further comprises:
generating error information when an upload error occurs, wherein the error information carries the universal unique identification code;
verifying the uploaded second type file according to the universal unique identification code to determine the second type file which fails to be uploaded;
and uploading the second type file which fails to be uploaded to the distributed file system again for object storage.
6. A distributed object storage apparatus, comprising:
the receiving module is used for receiving a data uploading request, wherein the data uploading request carries a first type file;
the conversion module is used for responding to the data uploading request and converting the first type file into a second type file, wherein the difference value between the byte number of the second type file and the byte number of the first type file is larger than a preset threshold value;
the uploading module is used for uploading the second type file to a distributed object storage system for object storage;
under the condition that the byte number of the first type file is larger than a second preset byte number, the data uploading request also carries a third preset byte number and metadata of the first type file;
correspondingly, the conversion module is specifically configured to: dividing the first type file into a plurality of second type files according to the third preset byte number, and recording the offset of each second type file in the plurality of second type files, wherein the byte number of the second type files is the third preset byte number; generating index files of the second type files according to the offset of the second type files and the metadata of the first type files;
the uploading module is specifically configured to: uploading the plurality of second type files to a distributed file system of the distributed object storage system for object storage, and recording service metadata generated in the uploading process; uploading the index file of each second type file to a distributed database of the distributed object storage system for storage; generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file, wherein the attribute information of the second type files comprises the number of the second type files and the third preset byte number; uploading the control object to a distributed file system of the distributed object storage system for object storage; wherein generating a control object according to the metadata of the first type file, the service metadata and the attribute information of each second type file comprises: determining the key value of the first type file as the key value of the control object; generating target metadata according to the metadata of the first type file and the service metadata; storing the target metadata, the universal unique identification code and the attribute information of the second type file into the control object; the universally unique identification code is randomly generated.
7. A computer device comprising a processor and a memory for storing processor-executable instructions that, when executed by the processor, implement the steps of the method of any one of claims 1 to 5.
8. A computer-readable storage medium having computer instructions stored thereon which, when executed, implement the steps of the method of any one of claims 1 to 5.
CN201910693969.4A 2019-07-30 2019-07-30 Distributed object storage method and device, computer equipment and storage medium Active CN110413588B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910693969.4A CN110413588B (en) 2019-07-30 2019-07-30 Distributed object storage method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910693969.4A CN110413588B (en) 2019-07-30 2019-07-30 Distributed object storage method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110413588A CN110413588A (en) 2019-11-05
CN110413588B true CN110413588B (en) 2022-05-17

Family

ID=68364091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910693969.4A Active CN110413588B (en) 2019-07-30 2019-07-30 Distributed object storage method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110413588B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110888837B (en) * 2019-11-15 2021-01-22 星辰天合(北京)数据科技有限公司 Object storage small file merging method and device
CN111143366B (en) * 2019-12-27 2020-12-01 焦点科技股份有限公司 High-efficiency storage method for massive large object data
CN113285816B (en) * 2020-02-19 2022-10-28 华为技术有限公司 Control request sending method, device and system based on key value configuration

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5822452B2 (en) * 2010-10-22 2015-11-24 株式会社インテック Storage service providing apparatus, system, service providing method, and service providing program
CN106909651A (en) * 2017-02-23 2017-06-30 郑州云海信息技术有限公司 A kind of method for being write based on HDFS small documents and being read
CN109634916A (en) * 2018-12-10 2019-04-16 平安科技(深圳)有限公司 File storage and method for down loading, device and storage medium

Also Published As

Publication number Publication date
CN110413588A (en) 2019-11-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant