WO2021237467A1 - File upload method, file download method and file management device - Google Patents

File upload method, file download method and file management device

Info

Publication number
WO2021237467A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
block
storage
information
meta
Prior art date
Application number
PCT/CN2020/092383
Other languages
English (en)
Chinese (zh)
Inventor
许若阳
Original Assignee
深圳元戎启行科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳元戎启行科技有限公司
Priority to CN202080007587.2A (CN113273163A)
Priority to PCT/CN2020/092383 (WO2021237467A1)
Publication of WO2021237467A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/104: Peer-to-peer [P2P] networks
    • H04L67/1074: Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H04L67/1078: Resource delivery mechanisms
    • H04L67/108: Resource delivery mechanisms characterised by resources being split in blocks or fragments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • This application relates to the field of data storage technology, and in particular to a file upload method, file download method and file management device.
  • the storage servers of these storage service providers can provide access to object storage services, such as Object-Based Storage Systems, which enterprise users can use to store data for mobile applications, large-scale websites, picture sharing or popular audio and video, or for infrequent-access storage and archive storage, and which individual users can use to store files.
  • This type of service can provide flat file storage and Content Delivery Network (CDN) resources to speed up the loading of static resources for users.
  • the user uploads the file to the storage pool of the storage server through the back-end program of the user terminal and obtains the Uniform Resource Locator (URL) returned by the storage server. The address can be embedded in a web page or in the data returned by an Application Programming Interface (API), and the user can later download the previously uploaded file by means of the URL.
  • a method for uploading files includes: obtaining a file to be uploaded; dividing the file to be uploaded into at least one block file; uploading the at least one block file to a storage server; receiving at least one storage address, returned by the storage server, corresponding to the at least one block file; and storing meta-information of the file, where the meta-information includes, stored in association, the first file identifier of the file, the at least one storage address, and information on the arrangement order, before the file was split, of the at least one block file corresponding to the at least one storage address.
  • a file download method includes: obtaining a first file identifier of a file to be downloaded; obtaining meta-information of the file based on the first file identifier, where the meta-information includes the first file identifier of the file, at least one storage address, and information on the arrangement order, before the file was split, of the at least one block file corresponding to the at least one storage address; downloading the at least one block file, using the at least one storage address, from the storage server corresponding to the at least one storage address; and restoring the at least one block file to a complete file based on the arrangement order information and returning the complete file.
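  • A minimal sketch of the restore step of the download method, assuming the meta-information is a dictionary whose `blocks` entries carry a hypothetical `order` field and a `url`, and that `fetch` is any callable returning the bytes stored at an address (here a plain dictionary lookup stands in for the storage server):

```python
def restore_file(meta, fetch):
    """meta['blocks'] is a list of {'order': int, 'url': str}; fetch(url) returns block bytes."""
    # Sort by the recorded arrangement order, then concatenate the downloaded blocks.
    ordered = sorted(meta["blocks"], key=lambda block: block["order"])
    return b"".join(fetch(block["url"]) for block in ordered)


# Stand-in for the storage server: address -> stored block (illustrative values only).
store = {"u/2": b"ij", "u/0": b"abcd", "u/1": b"efgh"}
meta = {
    "file_id": "f00d",
    "blocks": [
        {"order": 2, "url": "u/2"},
        {"order": 0, "url": "u/0"},
        {"order": 1, "url": "u/1"},
    ],
}
restored = restore_file(meta, store.__getitem__)
```

  • Without the `order` values held in the meta-information, the three stored blocks alone do not reveal how to reassemble the file, which is the property the method relies on.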
  • a file management device is in communication connection with a storage server; the file management device includes a processor and an information storage.
  • the processor is used to execute the above-mentioned file upload method and file download method.
  • the file to be uploaded is divided into at least one block file, and the at least one block file is stored in a storage server; when the file is downloaded, the at least one block file is downloaded from the storage server and merged according to the locally recorded division order of the block files to obtain the restored file.
  • the files stored on the storage server are incomplete block files, and it is difficult to obtain information such as the arrangement order of these block files from the storage server, so it is difficult to restore the complete stored file from the storage server alone, which effectively improves the security of files stored on the storage server.
  • FIG. 1 is an application environment diagram of a file upload method and a file download method in an embodiment
  • FIG. 2 is an application environment diagram of the file upload method and the file download method in another embodiment
  • FIG. 3 is a schematic flowchart of a file upload method in an embodiment
  • FIG. 4 is a schematic flowchart of a file upload method in an embodiment
  • FIG. 5 is a schematic flowchart of a file upload method in an embodiment
  • FIG. 6 is a schematic flowchart of a file upload method in an embodiment
  • FIG. 7 is a schematic diagram of the structure of an information storage in an embodiment
  • FIG. 8 is a schematic flowchart of a file download method in an embodiment
  • FIG. 9 is a schematic flowchart of a file download method in an embodiment
  • FIG. 10 is a schematic diagram of a file upload method and a file download method in an embodiment
  • FIG. 11 is a structural block diagram of a file uploading device in an embodiment
  • FIG. 12 is a structural block diagram of a file downloading device in an embodiment
  • FIG. 13 is a structural block diagram of a file management device in an embodiment.
  • the file upload method and file download method provided in this application can be applied to the application environment as shown in FIG. 1.
  • the user equipment 102 communicates with the intermediate server 104 through the network, and the intermediate server 104 communicates with the storage server 106 through the network.
  • the storage server 106 is usually a third-party server.
  • the user equipment 102 may be an enterprise user equipment or a personal user terminal, and the enterprise user equipment may be an enterprise server and/or an enterprise user terminal.
  • Personal user terminals and enterprise user terminals can be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, etc.
  • the storage service provider provides storage services through the storage server 106, and the storage server 106 is provided with a storage pool for storing files.
  • the file upload method and file download method provided in this application can be executed by a file management device.
  • the file management device can store a back-end program.
  • the back-end program can be deployed on the intermediate server 104 to execute the file upload method and file download method of this application.
  • In other embodiments, part of the back-end program may be deployed on the intermediate server 104 and the other part deployed by the intermediate server 104 on the user equipment 102, or the intermediate server 104 may deploy the entire back-end program on the user equipment 102.
  • correspondingly, some steps of the file upload method and file download method of this application may be executed on the intermediate server 104 and the rest in the back-end program on the user equipment 102, or all steps may be executed in the back-end program on the user equipment 102.
  • the file management apparatus may include an information storage 110, and the information storage 110 may include an internal memory and a non-volatile storage medium.
  • the back-end program may be stored on the non-volatile storage medium.
  • a database may also be stored on the non-volatile storage medium, and the database may be used to store the index information of the uploaded file.
  • the index information may include the meta-information of the file, or the meta-information of the file may be obtained through the index information.
  • the index information may also include the storage address of the associated meta-information and the first file identifier.
  • the information storage 110 may be located in the user equipment 102 or in the intermediate server 104.
  • the storage services provided by multiple different storage service providers can be comprehensively utilized.
  • the intermediate server 104 can communicate over the network with multiple storage servers 106 corresponding to multiple storage service providers.
  • the file management apparatus may choose to store the divided at least one block file in a plurality of different storage servers 106 respectively.
  • Both the intermediate server 104 and the storage server 106 can be implemented by independent servers or a server cluster composed of multiple servers.
  • the index information of the file can be stored in a Redis database, and the file system directory structure can be implemented in the database using the Hash and Set data structures. Depending on the actual situation, one can choose either to cache the file meta-information itself in the Redis database or to record only the URL at which the meta-information is located. In addition, the file name, actual file length, file creation and modification times and other information are stored in the Redis database to speed up queries for such information.
  • the Redis database communicates using the TCP protocol, so that multiple framework instances can be configured to connect to the same Redis database instance, thereby easily achieving file index synchronization.
  • the Redis database supports atomic operations, which can effectively avoid various race conditions.
  • the Redis database supports a master-slave architecture, which is convenient for expansion and can provide high availability guarantee.
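  • A minimal sketch of how a directory structure could be laid out with hashes and sets. The key naming scheme (`dir:` and `file:` prefixes), the attribute names and the example URLs are assumptions, and `InMemoryIndex` is a hypothetical in-memory stand-in whose method names merely mirror the Redis commands HSET, HGETALL, SADD and SMEMBERS so the example runs without a server:

```python
from collections import defaultdict


class InMemoryIndex:
    """Stand-in offering only the hash and set calls used below (hypothetical helper)."""

    def __init__(self):
        self._hashes = defaultdict(dict)
        self._sets = defaultdict(set)

    def hset(self, name, mapping):
        self._hashes[name].update(mapping)

    def hgetall(self, name):
        return dict(self._hashes[name])

    def sadd(self, name, *values):
        self._sets[name].update(values)

    def smembers(self, name):
        return set(self._sets[name])


def add_file(index, directory, file_name, attrs):
    """One set per directory lists its children; one hash per file holds its attributes."""
    index.sadd(f"dir:{directory}", file_name)
    index.hset(f"file:{directory}/{file_name}", mapping=attrs)


def list_dir(index, directory):
    return sorted(index.smembers(f"dir:{directory}"))


def stat_file(index, directory, file_name):
    return index.hgetall(f"file:{directory}/{file_name}")


index = InMemoryIndex()
add_file(index, "/videos", "a.mp4",
         {"length": "1048576", "mtime": "1590480000", "meta_url": "https://storage.example/meta/1"})
add_file(index, "/videos", "b.mp4",
         {"length": "2048", "mtime": "1590480100", "meta_url": "https://storage.example/meta/2"})
```

  • Here `list_dir(index, "/videos")` lists the directory from the set, and `stat_file` answers length and time queries from the hash without touching the storage server.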
  • a directory is created in another file system as the root directory of this file system, and all operations coarser than reading and writing a file's own data can be completed by calling the POSIX interface of the host system, that is, a transparent proxy for the directory structure. The meta-information of each file of this framework is then stored in a file in the corresponding directory structure of the other file system. When file data operations are performed, the request is forwarded to the framework's underlying abstract file for processing.
  • the file system transparent proxy method uses an existing mature file system to host the file index of the framework, which has high stability and security.
  • part of the data in the meta-information can be directly consistent with the ATTR attributes supported in the existing file system, such as the creation time ctime and the modification time mtime.
  • For attributes that cannot be kept consistent, such as the real size of a file, the value can be obtained by reading the meta-information at the transparent proxy layer and simulating the output.
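  • A minimal sketch of the attribute handling at the transparent proxy layer, assuming the meta-information of each framework file is kept as a small JSON file in the host directory and carries a hypothetical `length` field. The function names are illustrative only:

```python
import json
import os
import tempfile


def write_meta(host_root, rel_path, meta):
    """Store the meta-information of a framework file as a JSON file in the host file system."""
    host_path = os.path.join(host_root, rel_path)
    os.makedirs(os.path.dirname(host_path), exist_ok=True)
    with open(host_path, "w", encoding="utf-8") as fh:
        json.dump(meta, fh)
    return host_path


def proxy_getattr(host_root, rel_path):
    """Return attributes as the proxy layer would report them."""
    host_path = os.path.join(host_root, rel_path)
    st = os.stat(host_path)  # attributes that can be taken directly from the host file
    with open(host_path, encoding="utf-8") as fh:
        meta = json.load(fh)
    return {
        "mtime": st.st_mtime,    # kept consistent with the host attribute
        "size": meta["length"],  # simulated: the real file size, not the JSON file's size
    }


with tempfile.TemporaryDirectory() as root:
    write_meta(root, "docs/report.bin", {"length": 5_000_000, "blocks": []})
    attrs = proxy_getattr(root, "docs/report.bin")
    host_size = os.path.getsize(os.path.join(root, "docs/report.bin"))
```

  • The host file is only a few dozen bytes long, while the reported `size` is the real length recorded in the meta-information.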
  • a file upload method is provided.
  • the file upload method is described taking its application to the intermediate server 104, with the back-end program deployed on the intermediate server 104, as an example, and includes the following steps S302-S310.
  • Step S302: Obtain the file to be uploaded.
  • the intermediate server 104 receives a file upload request sent by the user equipment 102, and the file upload request carries a file to be uploaded.
  • the intermediate server 104 parses the file upload request to obtain the file to be uploaded.
  • the file management device can provide both an HTTP API interface and a POSIX interface. Take the case where the user equipment 102 uses the HTTP API interface to upload a file as an example. As shown in FIG. 10, the user uses the web front end in a browser, a dedicated client or other means to send the intermediate server 104 a file upload request of the POST/PUT method type (POST/PUT request for short), whose request body carries the file to be uploaded. Therefore, in this step, the intermediate server 104 can receive from the user equipment 102 the POST/PUT request that carries the file to be uploaded.
  • Step S304: Divide the file to be uploaded into at least one block file.
  • the intermediate server 104 may divide the file to be uploaded into one or more block files according to a predetermined file division rule.
  • the intermediate server 104 instantiates and obtains an abstract file object.
  • the abstract file object can provide a variety of methods for different developers to call, for example, the flush, truncate, and write methods available in write mode, and the locate, read, seek, and tell methods available in read mode.
  • the write method internally cuts the file data stream according to the pre-configured block size to obtain multiple segments of block file data stream.
  • the multiple segments of block file data stream represent the corresponding at least one block file.
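  • A minimal sketch of a write method that cuts an incoming data stream into blocks of a pre-configured size. The class name `BlockWriter` and its attributes are hypothetical; only the `write` and `flush` method names come from the description:

```python
class BlockWriter:
    """Hypothetical abstract file object in write mode: cuts written bytes into fixed-size blocks."""

    def __init__(self, block_size):
        if block_size <= 0:
            raise ValueError("block_size must be positive")
        self.block_size = block_size
        self._buffer = bytearray()
        self.blocks = []  # completed block data, in original order

    def write(self, data):
        self._buffer.extend(data)
        # Emit every full block currently available in the buffer.
        while len(self._buffer) >= self.block_size:
            self.blocks.append(bytes(self._buffer[:self.block_size]))
            del self._buffer[:self.block_size]
        return len(data)

    def flush(self):
        # Emit the remaining tail, if any, as a final shorter block.
        if self._buffer:
            self.blocks.append(bytes(self._buffer))
            self._buffer.clear()


writer = BlockWriter(block_size=4)
writer.write(b"abcdef")
writer.write(b"ghij")
writer.flush()
```

  • Because the buffer carries over between calls, the block boundaries depend only on the configured block size, not on how the caller happens to chunk its writes.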
  • Step S306: Upload the at least one block file to the storage server.
  • the storage server 106 provides an upload interface, and the intermediate server 104 can send multiple file upload requests to the storage server 106 through this upload interface, each file upload request carrying one segment of the multiple data streams, so that the multiple data streams are uploaded to the designated storage server 106.
  • the at least one block file can be uploaded concurrently at the macro level, so as to make full use of bandwidth resources and speed up transmission.
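  • A minimal sketch of uploading blocks concurrently, assuming a thread pool is an acceptable way to overlap the requests. `fake_upload` is a hypothetical stand-in for one upload request and the returned addresses are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor


def fake_upload(block):
    """Stand-in for one upload request; a real call would send the block to the storage server."""
    index, data = block
    return f"https://storage.example/blocks/{index}-{len(data)}"


def upload_all(blocks, upload=fake_upload, max_workers=4):
    """Upload blocks concurrently; results come back in the original block order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order even if uploads finish out of order.
        return list(pool.map(upload, enumerate(blocks)))


addresses = upload_all([b"abcd", b"efgh", b"ij"])
```

  • Keeping the results in input order means the returned storage addresses line up with the block order that is later recorded in the meta-information.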
  • Step S308: Receive at least one storage address corresponding to the at least one block file returned by the storage server.
  • the storage server 106 When the storage server 106 receives any block file uploaded by the intermediate server 104, it stores the block file in the storage pool of the storage server 106 and sends the corresponding storage address to the intermediate server 104.
  • the storage address is used to indicate the storage location of the block file in the storage server 106.
  • the storage address may be a Uniform Resource Locator (URL).
  • Step S310: Store meta-information of the file, where the meta-information of the file includes, stored in association, the first file identifier of the file, the at least one storage address, and the arrangement order, before file splitting, of the at least one block file corresponding to the at least one storage address.
  • the meta information of the file may record information related to various processing performed on the file during the file upload process, such as disguising, dividing, uploading, and so on.
  • the meta-information may also include other file-related information, such as the file name carried in the file upload request, the MIME type of the file, the actual file length, the modification time of the file, etc.
  • the file data stream can be extracted from the request body of the file upload request, and the file data stream can be divided to obtain multiple data streams, and the multiple data streams can be uploaded.
  • the other file-related information extracted in the request is stored in the meta-information.
  • the first file identifier is information that uniquely identifies the file. When two files have the same first file identifier, it can be considered that the two files are the same file.
  • the first file identifier may be a check value of the file, such as a file fingerprint. By calculating the file fingerprint, the content of the file can be compared with higher accuracy.
  • the first file identifier may also be other information related to the file according to the requirements for the accuracy of file recognition, for example, it may be the storage path of the file in the information storage and so on.
  • the file to be uploaded is divided into a plurality of block files, and the plurality of block files are respectively stored in a storage server.
  • the files stored in the storage server are incomplete block files, and it is difficult to obtain information such as the arrangement order of these block files from the storage server, so it is difficult to restore the complete stored file from the storage server, which effectively improves the security of the files stored on the storage server.
  • storing the meta information of the file in step S310 includes: storing the meta information of the file in an information storage.
  • the meta-information of the file is stored in the information storage.
  • the meta-information of the file cannot be obtained from the storage server, making it difficult to restore the uploaded original file, which further improves the security of the file stored in this application.
  • the file meta-information of this application can also be uploaded to the storage server.
  • storing the meta-information of the file in the above step S310 includes: uploading the meta-information of the file to the storage server; receiving the storage address of the meta-information returned by the storage server; and storing the storage address of the meta-information in the information storage in association with the first file identifier of the file.
  • the storage address of the meta-information can also be a URL.
  • in this way, the information storage only needs to store the storage address of the meta-information of the file and the first file identifier, which reduces the data storage burden on the information storage and saves data storage capacity of the system.
  • the dividing the file to be uploaded into multiple block files in step S304 includes: dividing the file to be uploaded based on a predetermined size to obtain a first number of block files each having the predetermined size, where the predetermined size is less than or equal to the upper limit of the file size that the storage server allows to be stored, and the first number is the quotient of the size of the file to be stored divided by the predetermined size, rounded down; and, once the first number of block files of the predetermined size have been obtained, if part of the file remains, taking the remaining part as one further block file.
  • the block sizes of the at least one block file may be the same or different from each other, as long as the size of each block file is less than or equal to the upper limit of the file size allowed by the storage server in which that block file is to be stored.
  • Some storage service providers have restrictions on the file size allowed to be uploaded to their storage server.
  • the solution of the above embodiment of this application divides the file to be stored into block files whose size is less than or equal to the upper limit of the file size allowed by the storage server, so as to meet the storage service provider's limit on the size of stored files and allow files of any size to be uploaded.
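  • A minimal sketch of the block arithmetic described above: the number of full blocks is the quotient of the file size and the predetermined size rounded down, and any remainder becomes one more block. The function and parameter names are hypothetical:

```python
def plan_blocks(file_size, block_size, server_limit):
    """Return (offset, length) pairs for the blocks of a file of file_size bytes."""
    if not 0 < block_size <= server_limit:
        raise ValueError("block_size must be positive and within the server limit")
    # Quotient rounded down gives the number of full blocks; the rest is the remainder.
    full_blocks, remainder = divmod(file_size, block_size)
    plan = [(i * block_size, block_size) for i in range(full_blocks)]
    if remainder:
        # The remaining part of the file becomes one more, shorter block.
        plan.append((full_blocks * block_size, remainder))
    return plan


plan = plan_blocks(file_size=10, block_size=4, server_limit=5)
```

  • For a 10-byte file and a 4-byte block size this yields two full blocks and one 2-byte block, each within the assumed 5-byte server limit.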
  • in some cases, the meta-information of the file to be stored is small and can meet the file size limit of the storage server.
  • in other cases, the meta-information is large and may exceed the upper limit of the file size allowed by the storage server.
  • in that case, the meta-information can also be divided so that the size of each piece of block meta-information obtained by the division is smaller than the upper limit, and the multiple pieces of block meta-information are then uploaded to the storage server separately.
  • the storage addresses of the block meta-information can be used in the same way to obtain the meta-information of the file.
  • the file upload method further includes: S404, selecting an encoder having a target file format, and using the encoder to disguise the at least one block file as a block file in the target file format, where the target file format includes a file format allowed by the storage server; correspondingly, the meta-information stored in step S310 also includes information on the at least one encoder used for disguising the at least one block file.
  • the encoder is a module used to disguise the input file of any file format into a file of the specified target file format and then output it.
  • the target file format of the encoder is the target file format that the file is disguised as.
  • the encoder may perform processing of adding a file header and a file tail or some other change processing to each received data stream, so as to disguise the block file represented by the data stream as a block file in the target file format.
  • At least one block file can be disguised as a plurality of block files of the same file format.
  • an encoder can be selected from multiple encoders with the same target file format to disguise the at least one block file; providing more than one encoder avoids the situation where a single encoder fails and the disguise processing cannot be performed.
  • the at least one block file can also be disguised as a plurality of block files with different file formats. In this case, at least one encoder with different target file formats can be selected to disguise the at least one block file respectively.
  • each segment of block file data stream obtained by cutting can be directed to the corresponding encoder with the target file format.
  • the encoder can add a file header and a file tail, or apply some other change processing, to each received block file data stream, so as to disguise the block file represented by the block file data stream as a block file in the target file format.
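  • A minimal sketch of an encoder that adds a file header and a file tail and can strip them again. The class `HeaderTrailerEncoder` and the placeholder byte strings are hypothetical; a real encoder would have to emit whatever the target file format actually requires:

```python
class HeaderTrailerEncoder:
    """Hypothetical encoder: wraps block data in a fixed header and trailer."""

    def __init__(self, name, header, trailer):
        self.name = name
        self.header = header
        self.trailer = trailer

    def encode(self, data):
        return self.header + data + self.trailer

    def decode(self, disguised):
        # Check the wrapper before stripping it, so malformed blocks are noticed.
        wrapper_len = len(self.header) + len(self.trailer)
        if (len(disguised) < wrapper_len
                or not disguised.startswith(self.header)
                or not disguised.endswith(self.trailer)):
            raise ValueError("block is not in the expected disguised form")
        return disguised[len(self.header):len(disguised) - len(self.trailer)]


# Placeholder byte strings, not a real file format.
encoder = HeaderTrailerEncoder("demo-format", header=b"HDR!", trailer=b"!END")
disguised = encoder.encode(b"block-data")
restored = encoder.decode(disguised)
offset_in_disguised = len(encoder.header)  # where the original data starts inside the disguised block
```

  • Recording the encoder name and the offset of the original data inside the disguised block is what later allows the download side to undo the disguise.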
  • the file upload method further includes: step S502, obtaining the first file identifier of the file to be uploaded; step S504, searching for the first file identifier in the information storage and the storage server; step S506, determining whether the first file identifier is stored in the information storage or the storage server; when the first file identifier is stored in neither the information storage nor the storage server, continuing to execute step S406 of dividing the file to be uploaded into at least one block file; when the first file identifier has been stored in the information storage or the storage server, executing step S508 to terminate the upload operation of the file.
  • the first file identifier of the file to be uploaded is searched for in the information storage and the storage server. If the first file identifier is found in either of them, the same file has been uploaded before; the upload operation of the file is terminated, and only the meta-information of the file is updated and stored. This avoids repeated uploads occupying unnecessary storage space and system processing resources.
  • when the first file identifier is a file fingerprint, the file fingerprint of the file to be uploaded can be calculated based on data of a fixed byte length at the head of the file data stream of the file and the actual file length.
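  • A minimal sketch of a fingerprint computed from a fixed-length head of the data plus the actual file length. The use of SHA-1 from Python's `hashlib`, the 1024-byte head length and the way the length is mixed in are all assumptions made for illustration:

```python
import hashlib


def file_fingerprint(data, head_bytes=1024):
    """Fingerprint from the first head_bytes bytes of the data plus the actual file length."""
    digest = hashlib.sha1()
    digest.update(data[:head_bytes])
    # Mix in the length so files sharing a head but differing in size get different fingerprints.
    digest.update(len(data).to_bytes(8, "big"))
    return digest.hexdigest()


a = file_fingerprint(b"x" * 2000)
b = file_fingerprint(b"x" * 3000)
c = file_fingerprint(b"x" * 2000)
```

  • Such a fingerprint is cheap because only the head is hashed, but two files with the same head and the same length would collide, so it trades accuracy for speed compared with hashing the whole content.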
  • when the intermediate server 104 receives the POST/PUT request, it first parses the request header to obtain the original data length, that is, the value of Content-Length in the request header. It then obtains the data format of the request body, which is specified by the value of Content-Type in the request header.
  • the intermediate server 104 receives, through the Filesystem in Userspace (FUSE), a user-initiated data write request of the write (WRITE) method type (WRITE request for short), and then extracts the data content of the file to be uploaded from the WRITE request and calculates the first file identifier.
  • meta-information of the file may also be stored.
  • the information related to the block files can be obtained from the meta-information corresponding to the previously stored first file identifier, and other file-related information, such as the upload time, can be obtained from the file upload request of the file.
  • the file upload method further includes: step S402, calculating at least one second file identifier corresponding to the at least one block file; where the meta-information stored in step S310 further includes, stored in association, the at least one second file identifier corresponding to the at least one block file.
  • the second file identification is information used to uniquely identify the file. By calculating and storing the second file identifier of the block file before performing the file disguise, the characteristic information of the original block file before the disguise can be recorded.
  • through the abstract file object, the intermediate server 104 can collect, for each block file, a group of information such as the block URL, the fingerprint of the block file, the start offset of the block file data stream before disguise within the block file data stream after disguise, and the data length of the block file data stream before disguise, obtaining N groups of information corresponding to the N block files.
  • the N groups of information are arranged into a block information sequence according to the order of the block files before the file was split. Specifically, the starting offset of the block file can be used as the key, and the rest of the information, such as the block URL, the fingerprint of the block file, and the data length, can be used as the value, to form a key-value dictionary (dict) of the block files.
  • the sorted dictionary is the block information sequence. The block information sequence is then serialized, together with the file name, file modification time, MIME type, file fingerprint, upload time, encoder used and other information, using a format such as JSON, to obtain the meta-information of the file.
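  • A minimal sketch of building and serializing the meta-information with JSON. The field names, the example URLs and the interpretation of the key as the block's start offset in the original file are assumptions made for illustration:

```python
import json


def build_meta(file_name, fingerprint, encoder_name, blocks):
    """blocks: list of dicts with 'offset', 'url', 'fingerprint' and 'length' keys."""
    # Key each block by its start offset and sort, so the order before splitting is recoverable.
    sequence = {
        str(b["offset"]): {"url": b["url"], "fingerprint": b["fingerprint"], "length": b["length"]}
        for b in sorted(blocks, key=lambda b: b["offset"])
    }
    return json.dumps({
        "file_name": file_name,
        "fingerprint": fingerprint,
        "encoder": encoder_name,
        "blocks": sequence,
    })


def block_urls_in_order(meta_json):
    """Recover the block URLs in their original order from serialized meta-information."""
    blocks = json.loads(meta_json)["blocks"]
    # JSON object keys are strings, so compare them as integers when ordering.
    return [blocks[key]["url"] for key in sorted(blocks, key=int)]


meta_json = build_meta(
    "report.bin", "f00d", "demo-format",
    [
        {"offset": 8, "url": "https://storage.example/b/2", "fingerprint": "cc", "length": 2},
        {"offset": 0, "url": "https://storage.example/b/0", "fingerprint": "aa", "length": 4},
        {"offset": 4, "url": "https://storage.example/b/1", "fingerprint": "bb", "length": 4},
    ],
)
```

  • Sorting by the integer value of the key on the reading side avoids depending on the key order of the serialized object.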
  • the file name extension has a one-to-one correspondence with the MIME type. Therefore, the MIME type may not be included in the meta-information of the stored file.
  • this application can also search for block files before uploading the file.
  • the file upload method further includes: step S602, searching the information storage and the storage server for each second file identifier of the at least one second file identifier.
  • step S306 and step S308 include: step S604, taking each second file identifier of the at least one second file identifier that is not stored in the information storage or the storage server as a target second file identifier, and uploading the block file corresponding to the target second file identifier to the storage server; and step S606, receiving the storage address, sent by the storage server, of the block file corresponding to the target second file identifier.
  • the second file identifier may be the check value of the file, for example, the file fingerprint of the file. By calculating the file fingerprint, the content of files can be compared with higher accuracy.
  • a hash algorithm such as SHA-1 can be used to iteratively calculate the check value of the file.
  • the file to be uploaded is first divided into block files, and then, block by block, the information storage of the intermediate server and the storage server are searched to check whether an identical block file has been uploaded before.
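  • A minimal sketch of block-level deduplication, assuming the second file identifier is a SHA-1 digest of the block content and that a dictionary from fingerprint to storage address stands in for the lookup in the information storage. The `upload` callable and the returned addresses are hypothetical:

```python
import hashlib


def upload_with_dedup(blocks, known_addresses, upload):
    """known_addresses maps block fingerprint -> storage address of an already uploaded block."""
    addresses = []
    uploaded = 0
    for data in blocks:
        fingerprint = hashlib.sha1(data).hexdigest()  # second file identifier of the block
        if fingerprint not in known_addresses:
            # Only blocks whose identifier is not yet stored are sent to the storage server.
            known_addresses[fingerprint] = upload(data)
            uploaded += 1
        # Previously stored blocks reuse the address recorded earlier.
        addresses.append(known_addresses[fingerprint])
    return addresses, uploaded


counter = iter(range(1000))
known = {}
addresses, uploaded = upload_with_dedup(
    [b"aaaa", b"bbbb", b"aaaa"],
    known,
    upload=lambda data: f"https://storage.example/b/{next(counter)}",
)
```

  • The third block has the same content as the first, so it is not uploaded again and simply reuses the first block's storage address in the resulting address list.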
  • the file upload method further includes: step S608, taking each second file identifier, among the at least one second file identifier, that has already been stored in the information storage or the storage server as a stored second file identifier, and obtaining the storage address of the block file corresponding to the stored second file identifier.
  • the meta-information stored in step S310 includes, stored in association, the first file identifier of the file, the target second file identifier, the stored second file identifier, the storage address of the block file corresponding to the target second file identifier, and the storage address of the block file corresponding to the stored second file identifier.
  • the meta-information of block files that were uploaded previously can be combined with the meta-information of the currently uploaded block files to obtain the complete meta-information of the file to be uploaded, so that the corresponding original file can be downloaded based on the meta-information of the file when downloading.
  • the information storage 110 of the present application may store a database, and the database includes at least one public database 111 and multiple local databases 112.
  • each user equipment 102 corresponds to a local database 112, and the local database 112 is dedicated to storing the index information of the corresponding user equipment 102.
  • the local database 112 can be configured on the personal user terminal.
  • the local database can be configured on the enterprise server or on the enterprise user terminal of the enterprise.
  • the public database 111 may be used to store index information of sharable files uploaded by multiple user equipment 102. In this way, the index information of the file can be shared across instances, so that when new index information is added to any instance using this framework, the other instances can query the index information, thereby sharing the file to all instances.
  • there are multiple storage servers and multiple block files; uploading the at least one block file to the storage server includes: uploading the multiple block files to the multiple storage servers respectively, so that each of the multiple storage servers stores only a part of the multiple block files. In this way, each storage server saves only some of the block files of the file rather than all of them, making it difficult to restore the complete original file from a single storage server, which improves the security of file storage.
  • there are multiple storage servers; uploading the at least one block file to the storage server includes: uploading the at least one block file to the multiple storage servers respectively, so that the at least one block file is repeatedly stored in at least two of the multiple storage servers. In this way, each block file is backed up in at least two storage servers.
  • when downloading block files, if the correct block file cannot be downloaded from one storage server, it can be downloaded from a backup storage server, thereby ensuring the reliability of file downloads.
  • storing at least one block file in the storage server includes: storing the multiple block files in multiple storage servers, so that each of the multiple storage servers stores only a part of the multiple block files, and each of the multiple block files is stored in at least two of the multiple storage servers. In this way, the reliability of file download can be improved while ensuring the security of file storage.
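The combined placement rule above (each server holds only part of the blocks, each block is held by at least two servers) can be illustrated with a minimal sketch. The round-robin strategy, the function name `assign_blocks`, and the server names are assumptions made for illustration; the application does not prescribe a particular placement algorithm.

```python
from typing import Dict, List


def assign_blocks(num_blocks: int, servers: List[str], replicas: int = 2) -> Dict[int, List[str]]:
    """Map each block index to `replicas` distinct servers in round-robin order.

    Hypothetical helper: one possible placement, not the method of the application.
    """
    if replicas < 2:
        raise ValueError("each block should be stored on at least two servers")
    if len(servers) <= replicas:
        raise ValueError("need more servers than replicas")
    placement = {}
    for i in range(num_blocks):
        # Consecutive servers, wrapping around, so the replicas are distinct.
        placement[i] = [servers[(i + k) % len(servers)] for k in range(replicas)]
    return placement


placement = assign_blocks(num_blocks=6, servers=["s0", "s1", "s2"], replicas=2)
for block_index, holders in placement.items():
    print(block_index, holders)
```

With at least as many blocks as servers, this layout leaves every server without at least one block, while every block still has a backup copy.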
  • the above-mentioned file uploading method may further include the step of updating data in any specified byte range of an uploaded file. Specifically, when the intermediate server 104 receives a file update request that specifies a byte range of the file to update, it first finds the meta-information of the previously uploaded old file based on the first file identifier. Then, based on the file update range specified in the file update request, it determines the one or more uploaded block files partially or fully covered by the file update range. Based on the start and end positions of each pre-disguise block file corresponding to these uploaded block files, it splits and disguises the new file data carried in the file update request, replaces the old block files stored in the storage server with the new block files after splitting and disguising, and updates the meta-information of the replaced block files. In this way, the data of any byte range of the uploaded file can be updated.
  • if the start position and end position of the update range of the file requested to be updated are not completely aligned with the start position and end position of the old block files uploaded previously, that is, there is an offset between them, the corresponding start block file and/or end block file of the downloaded block files can be split, the part outside the specified byte range removed, and the one or more block files within the specified byte range retained and restored, so as to accurately obtain the data that meets the specified byte range.
  • the start position and the end position of each block file can be calculated according to the data length of the block file recorded in the meta information of each block file.
  • HTTP Range can also be used to specify the file update range, and the Web server can parse the POST/PUT request initiated by the client to obtain the Range data in the request header.
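As a rough illustration of how the affected block files might be located, the sketch below derives each block's start and end position from the recorded data lengths and then selects the blocks that overlap an inclusive byte range. The function names and the list-of-lengths representation of the meta-information are assumptions, not part of the application.

```python
from typing import List, Tuple


def block_intervals(lengths: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) byte positions, end exclusive, for each block in file order."""
    intervals, pos = [], 0
    for n in lengths:
        intervals.append((pos, pos + n))
        pos += n
    return intervals


def blocks_covered(lengths: List[int], first: int, last: int) -> List[int]:
    """Indices of blocks that overlap the inclusive byte range [first, last]."""
    return [
        i
        for i, (start, end) in enumerate(block_intervals(lengths))
        if start <= last and first < end
    ]


# Three blocks of 100, 100 and 50 bytes; an update of bytes 150..220
# touches the second and third blocks only.
print(blocks_covered([100, 100, 50], first=150, last=220))
```

Only the blocks returned here would need to be re-split, re-disguised and replaced; the remaining blocks and their meta-information stay untouched.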
  • the present application provides a file download method. Taking the file download method applied to the intermediate server 104 in FIG. 1 and the back-end program of this application deployed on the intermediate server 104 as an example for description, the method may include the following steps S802-S810.
  • Step S802 Obtain the first file identifier of the file to be downloaded.
  • the intermediate server 104 may receive a file download request sent by the user equipment 102, where the file download request carries the first file identifier of the file to be downloaded.
  • the intermediate server 104 parses the file download request to obtain the first file identifier of the file to be downloaded.
  • Step S804 Obtain the meta information of the file based on the first file identifier; where the meta information of the file includes the first file identifier of the file, at least one storage address, and information about the arrangement order, in the file before splitting, of the at least one block file corresponding to the at least one storage address.
  • the intermediate server 104 finds the meta information of the file based on the first file identifier of the file.
  • Step S806 Use at least one storage address to download at least one block file from a storage server corresponding to the at least one storage address.
  • the intermediate server 104 uses the at least one storage address to generate corresponding file download requests, sends the file download requests to the corresponding one or more storage servers 106, and downloads the corresponding block data from the storage server corresponding to each storage address, so as to obtain at least one block file in one-to-one correspondence with the at least one storage address.
  • the intermediate server 104 needs to know which storage server the storage address corresponds to before sending the file download request corresponding to each storage address.
  • the intermediate server 104 can read the information of the corresponding storage server from the storage address.
  • Step S808 based on the information of the arrangement sequence of the at least one block file, restore the at least one block file to a complete file and return the complete file.
  • the intermediate server 104 may restore the at least one block file to a complete file based on the information, stored in the meta-information, about the arrangement order in the file before splitting of the at least one block file corresponding to the at least one storage address, and return the complete file to the user equipment 102.
  • the above-mentioned file download method of the present application can restore the downloaded at least one block file to a complete file based on the stored information of the arrangement sequence of the at least one block file, and return the complete file to the user device 102.
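Steps S802 to S808 can be sketched end to end as follows. This is a minimal in-memory illustration: the dictionaries standing in for the information storage and the storage servers, the `server/key` address layout, and the function names are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical stand-in for the information storage: first file identifier -> meta-information.
INFO_STORAGE: Dict[str, dict] = {
    "file-1": {
        "first_file_id": "file-1",
        # (position in the original file, storage address); deliberately out of order.
        "blocks": [(1, "srvB/blk-b"), (0, "srvA/blk-a"), (2, "srvA/blk-c")],
    }
}

# Hypothetical stand-in for the storage servers: server name -> stored block files.
STORAGE_SERVERS: Dict[str, Dict[str, bytes]] = {
    "srvA": {"blk-a": b"hello ", "blk-c": b"!"},
    "srvB": {"blk-b": b"world"},
}


def fetch_block(address: str) -> bytes:
    """Read the server name from the storage address, then fetch the block from it."""
    server, key = address.split("/", 1)
    return STORAGE_SERVERS[server][key]


def download_file(first_file_id: str) -> bytes:
    meta = INFO_STORAGE[first_file_id]                                # S804: look up meta-information
    fetched = [(order, fetch_block(addr)) for order, addr in meta["blocks"]]  # S806: download blocks
    fetched.sort(key=lambda item: item[0])                            # S808: restore original order
    return b"".join(data for _, data in fetched)


print(download_file("file-1"))
```

Because the ordering lives only in the meta-information, the storage servers never need to know how their blocks fit together.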
  • in step S804 of the above-mentioned file download method, obtaining the meta-information of the file based on the first file identifier includes: searching for and obtaining the meta-information of the file in the information storage based on the first file identifier.
  • the meta-information of the file is stored in the information storage.
  • the meta-information of the file cannot be obtained from the storage server, making it difficult to restore the uploaded original file, which further improves the security of the file stored in this application.
  • in step S804 of the above-mentioned file downloading method, obtaining the meta-information of the file based on the first file identifier includes: searching for and obtaining, in the information storage, the storage address of the meta-information of the file based on the first file identifier; and using the storage address of the meta-information to download the meta-information of the file from the storage server corresponding to the storage address of the meta-information.
  • the information storage may store only the storage address of the meta-information of the file and the first file identifier, thereby reducing the data storage burden of the information storage and saving the data storage capacity of the system.
  • the meta information obtained in step S804 further includes at least one second file identifier corresponding to at least one block file stored in association; after step S806, and before step S808, the file download method further includes: step S904, calculating at least one third file identifier corresponding to the downloaded at least one block file.
  • Step 906 When the third file identifier of a block file matches the second file identifier of that block file, it is determined that the block file passes the verification.
  • Step 908 When the third file identifier of a block file does not match the second file identifier of that block file, it is determined that the block file has not passed the verification, and the block file is re-downloaded so that the re-downloaded block file replaces the block file that failed the verification.
  • the third file identifier and the second file identifier are both information used to uniquely identify the block file.
  • the second file identifier and the third file identifier may be check values of the block file, for example the fingerprint of the block file; by calculating block file fingerprints, the contents of block files can be compared with high accuracy.
  • a hash algorithm such as SHA-1 may be used to iteratively calculate the check value of the block file.
  • the characteristic information of the original block file before the disguise can be recorded.
  • the third file ID of the block file before disguise is calculated and compared with the second file ID of the block file uploaded before disguise.
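The verification in steps S904 to 908 can be sketched with Python's standard `hashlib` module. SHA-1 is used because the text names it as an example; the function names, the chunk size, and the retry limit are assumptions introduced for illustration.

```python
import hashlib
from typing import Callable


def block_fingerprint(data: bytes, chunk_size: int = 4096) -> str:
    """Compute the SHA-1 check value of a block iteratively, one chunk at a time."""
    digest = hashlib.sha1()
    for i in range(0, len(data), chunk_size):
        digest.update(data[i:i + chunk_size])
    return digest.hexdigest()


def download_verified(fetch: Callable[[], bytes], second_file_id: str, max_attempts: int = 3) -> bytes:
    """Download a block, recompute its identifier, and re-download on mismatch.

    `fetch` is a hypothetical callable that performs one download attempt.
    """
    for _ in range(max_attempts):
        data = fetch()
        third_file_id = block_fingerprint(data)   # S904: identifier of the downloaded block
        if third_file_id == second_file_id:       # step 906: verification passed
            return data
        # step 908: verification failed, loop to download the block again
    raise IOError("block failed verification after retries")
```

A caller would pass the second file identifier read from the meta-information, so a corrupted or tampered block is never handed to the restore step.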
  • when uploading files, before the block files are disguised in step S404, a redundant part may be added to each block file, the redundant part including an erasure code, for example Reed-Solomon encoding (RS encoding for short). When downloading files and verifying each downloaded block file, as an alternative to step 908 above, if the current block file fails the verification, the erasure code of the block file can be used to restore the block file, which improves the stability of file download. Correspondingly, before the complete file is restored in step S808, the redundant part in each block file needs to be deleted, so that the restored complete file is consistent with the original file.
  • the meta-information obtained in step S804 further includes information of at least one encoder corresponding to at least one block file; after step S806 and before step S904, the file download method further includes: S902, obtaining at least one decoder corresponding to the at least one encoder based on the information of the at least one encoder; and using the at least one decoder to restore the corresponding at least one block file to at least one block file before the disguise.
  • the decoder is a module used to restore the input file that has been disguised as the specified target file format to the file before disguise.
  • the target file format possessed by the decoder means that the decoder will restore the file disguised as the target file format to its original format.
  • the decoder can extract the block file part before disguise from each received data stream, and ignore or delete the file header and file tail part added by disguise to obtain the block file before disguise.
  • the encoder and decoder have a one-to-one correspondence. In this embodiment, after the information of the encoder is obtained, the information of the corresponding decoder can be obtained, and the information of the corresponding decoder is used to decode the block file after disguise into the block file before disguise.
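A minimal sketch of a paired encoder and decoder is given below, assuming the disguise consists only of a fixed file header and file tail wrapped around the block. The class name `Disguise`, the codec registry, and the header and trailer bytes are invented for illustration and do not correspond to any real file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disguise:
    """One encoder/decoder pair for a (hypothetical) target file format."""

    header: bytes
    trailer: bytes

    def encode(self, block: bytes) -> bytes:
        # Disguise: wrap the block with the header and tail of the target format.
        return self.header + block + self.trailer

    def decode(self, disguised: bytes) -> bytes:
        # Restore: drop the added header and tail, keep the original block.
        end = len(disguised) - len(self.trailer)
        if (
            end < len(self.header)
            or not disguised.startswith(self.header)
            or disguised[end:] != self.trailer
        ):
            raise ValueError("data is not in the expected disguised format")
        return disguised[len(self.header):end]


# Hypothetical registry keyed by the encoder information recorded in the meta-information.
CODECS = {"fake-image": Disguise(header=b"FAKEHDR\x00", trailer=b"\x00FAKEEND")}

codec = CODECS["fake-image"]
disguised = codec.encode(b"original block")
print(codec.decode(disguised))
```

Keeping both directions in one object makes the one-to-one correspondence explicit: looking up the encoder information immediately yields the matching decoder.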
  • step S806 specifically includes: using the position of each block file before disguise in the corresponding block file after disguise, downloading the block file before disguise from the block file after disguise stored in the storage server corresponding to the storage address.
  • the position of the block file before the disguise in the block file after the disguise includes the position interval composed of the start position and the end position of the block file before the disguise in the block file after the disguise.
  • the position interval may be determined based on the data length of the block file before masquerading and the starting offset of the block file before masquerading in the block file after masquerading. This information can be recorded in the meta-information of the file.
  • the HTTP Range request header can be used to specify to download data in a specified byte range of a certain block file when downloading a block file. In this way, the block file before the disguise can be downloaded directly, which further saves data transmission traffic and also saves the performance cost of the system.
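The position interval described above maps directly onto an HTTP `Range` header, whose byte positions are zero-based and inclusive at both ends. The helper below is a small sketch; the function name and the idea of reading `start_offset` and `data_length` from the meta-information are assumptions for illustration.

```python
def range_header(start_offset: int, data_length: int) -> dict:
    """Build a Range header selecting the pre-disguise block inside the disguised file.

    start_offset: offset of the pre-disguise block within the disguised block file.
    data_length:  length in bytes of the pre-disguise block.
    """
    if start_offset < 0 or data_length <= 0:
        raise ValueError("offset must be non-negative and length positive")
    last = start_offset + data_length - 1  # Range end positions are inclusive
    return {"Range": f"bytes={start_offset}-{last}"}


# A 100-byte block that starts after an 8-byte disguise header.
print(range_header(start_offset=8, data_length=100))
```

A server that honors the header replies with status 206 (Partial Content) and only the requested bytes, so the disguise header and tail are never transferred.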
  • the Web server can obtain, from the URL path or the form part of the GET/POST request initiated by the client, a value from which the meta-information of the file can be retrieved, such as the first file identifier, and then parse the Range data in the request header. For the file download range specified in the Range data, it determines and downloads the one or more block files corresponding to the file download range, and returns the specified download data.
  • if the start position and end position of the requested specified byte range are not completely aligned with the start and end positions of the stored block files, that is, there is an offset between them, then based on the start position and end position of the specified byte range, the corresponding start block file and/or end block file of the downloaded block files can be split, the part outside the specified byte range removed, and the one or more block files within the specified byte range retained and restored, so as to accurately download the data that meets the specified byte range.
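The trimming of the start and end block files can be sketched as follows, assuming the blocks are already available in file order as byte strings. The function name `extract_range` is hypothetical.

```python
from typing import List


def extract_range(blocks: List[bytes], first: int, last: int) -> bytes:
    """Return bytes first..last (inclusive) of the file formed by `blocks` in order.

    Blocks entirely outside the range are skipped; the start and end blocks
    are split so that only the part inside the range is kept.
    """
    pieces, pos = [], 0
    for block in blocks:
        start, end = pos, pos + len(block)  # end is exclusive
        pos = end
        if end <= first or start > last:
            continue  # block lies completely outside the requested range
        lo = max(first, start) - start      # trim the front of the start block
        hi = min(last + 1, end) - start     # trim the back of the end block
        pieces.append(block[lo:hi])
    return b"".join(pieces)


# Bytes 2..9 of "abcdefghijkl" stored as three 4-byte blocks.
print(extract_range([b"abcd", b"efgh", b"ijkl"], first=2, last=9))
```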
  • there are multiple storage servers and multiple block files, and each block file of the multiple block files is repeatedly stored in the multiple storage servers, so that each block file corresponds to multiple storage addresses. Downloading the at least one block file from the storage server corresponding to the at least one storage address by using the at least one storage address includes: according to a first storage address of the multiple storage addresses corresponding to each block file, downloading the corresponding block file from the storage server corresponding to the first storage address; and when the download using the first storage address fails, using a second storage address of the multiple storage addresses corresponding to the block file to download the corresponding block file from the storage server corresponding to the second storage address.
  • if the download using the second storage address also fails and there are other storage addresses, the download can continue to switch to the other storage addresses until the required block file is downloaded. In this way, the reliability of file download can be effectively improved.
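The address switching described above amounts to trying each replica in turn. The sketch below assumes a caller-supplied `fetch` callable that raises `OSError` or `KeyError` when a download fails; both the function name and that error convention are assumptions for illustration.

```python
from typing import Callable, Iterable, Optional


def download_with_failover(addresses: Iterable[str], fetch: Callable[[str], bytes]) -> bytes:
    """Try each storage address of a block in order until one download succeeds."""
    last_error: Optional[Exception] = None
    for address in addresses:
        try:
            return fetch(address)
        except (OSError, KeyError) as exc:
            last_error = exc  # remember the failure and switch to the next address
    raise IOError("no storage address yielded the block") from last_error


# Hypothetical replica store in which the first address is unavailable.
store = {"srvB/blk-1": b"data"}
print(download_with_failover(["srvA/blk-1", "srvB/blk-1"], lambda addr: store[addr]))
```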
  • a file uploading device 1100 including: an upload file acquisition module 1101, a file segmentation module 1102, a file upload module 1103, a storage address receiving module 1104, and a meta-information storage module 1105, wherein:
  • the upload file obtaining module 1101 is used to obtain files to be uploaded.
  • the file segmentation module 1102 is used to segment the file to be uploaded into at least one segmented file.
  • the file upload module 1103 is used to upload at least one block file to the storage server.
  • the storage address receiving module 1104 is configured to receive at least one storage address corresponding to at least one block file returned by the storage server.
  • the meta-information storage module 1105 is used to store the meta-information of the file.
  • a file downloading device 1200 including: a file identification obtaining module 1201, a meta information obtaining module 1202, a file downloading module 1203, and a file restoring module 1204, wherein:
  • the file identifier obtaining module 1201 is used to obtain the first file identifier of the file to be downloaded.
  • the meta-information acquiring module 1202 is configured to acquire meta-information of the file based on the first file identifier.
  • the meta-information of the file includes the first file identifier of the file, at least one storage address, and information about the arrangement sequence of the at least one block file corresponding to the at least one storage address in the file before division.
  • the file download module 1203 is configured to use at least one storage address to download at least one block file from a storage server corresponding to the at least one storage address.
  • the file restoration module 1204 is configured to restore at least one block file to a complete file and return the complete file based on the information of the arrangement sequence of the at least one block file.
  • the present application provides a file management device 1300, the file management device 1300 is in communication connection with the storage server 106; the file management device 1300 includes a processor 1301 and an information storage 110; the processor 1301 is configured to:
  • the information storage 110 is configured to store meta-information of the file or a storage address of the meta-information, where the meta-information of the file includes the first file identifier of the file, the at least one storage address, and the arrangement order, in the file before splitting, of the at least one block file corresponding to the at least one storage address; the storage server is used to store the uploaded file.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to a file upload method, a file download method and a file management apparatus. The file upload method comprises the steps of: acquiring a file to be uploaded; segmenting the file to be uploaded into at least one block file; uploading the at least one block file to a storage server; receiving at least one storage address returned by the storage server and corresponding to the at least one block file; and storing meta-information of the file, the meta-information comprising a first file identifier of the file and the at least one storage address, which are stored in association, and an arrangement order, in the file before segmentation, of the at least one block file corresponding to the at least one storage address.
PCT/CN2020/092383 2020-05-26 2020-05-26 File upload method, file download method and file management apparatus WO2021237467A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080007587.2A CN113273163A (zh) 2020-05-26 2020-05-26 File upload method, file download method and file management apparatus
PCT/CN2020/092383 WO2021237467A1 (fr) 2020-05-26 2020-05-26 File upload method, file download method and file management apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/092383 WO2021237467A1 (fr) 2020-05-26 2020-05-26 File upload method, file download method and file management apparatus

Publications (1)

Publication Number Publication Date
WO2021237467A1 true WO2021237467A1 (fr) 2021-12-02

Family

ID=77227980

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092383 WO2021237467A1 (fr) 2020-05-26 2020-05-26 File upload method, file download method and file management apparatus

Country Status (2)

Country Link
CN (1) CN113273163A (fr)
WO (1) WO2021237467A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
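CN114338653A (zh) * 2021-12-29 2022-04-12 中国电信股份有限公司 Method and apparatus for resumable file transfer
CN115481158A (zh) * 2022-09-22 2022-12-16 北京泰策科技有限公司 Method for automatic loading and conversion of a distributed data cache
CN116527539A (zh) * 2023-05-15 2023-08-01 合芯科技(苏州)有限公司 Data consistency verification method, apparatus and computer device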

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168537A (zh) * 2021-11-27 2022-03-11 深圳市连用科技有限公司 Method for uploading files and terminal device
CN114978555B (zh) * 2022-08-01 2022-10-21 北京惠朗时代科技有限公司 Remote online electronic signature system based on web script data stream operations

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
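CN101227460A (zh) * 2007-01-19 2008-07-23 秦晨 Distributed file upload and download method, and apparatus and system thereof
CN103442090A (zh) * 2013-09-16 2013-12-11 苏州市职业大学 Cloud computing system with distributed data storage
CN103685162A (zh) * 2012-09-05 2014-03-26 中国移动通信集团公司 File storage and sharing method
CN103729470A (zh) * 2014-01-20 2014-04-16 刘强 Secure storage method based on different cloud storage terminals
CN103873504A (zh) * 2012-12-12 2014-06-18 鸿富锦精密工业(深圳)有限公司 System and method for storing data in blocks on distributed servers
CN105718808A (zh) * 2016-01-18 2016-06-29 天津科技大学 File encryption storage system and method based on multiple network disks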

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
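CN103731451B (zh) * 2012-10-12 2018-10-19 腾讯科技(深圳)有限公司 File upload method and system
CN103324552B (zh) * 2013-06-06 2016-01-13 西安交通大学 Two-stage single-instance deduplication data backup method
CN111049884A (zh) * 2019-11-18 2020-04-21 武汉方始科技有限公司 Distributed large-file storage system and file upload and download methods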

Also Published As

Publication number Publication date
CN113273163A (zh) 2021-08-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937952

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.04.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20937952

Country of ref document: EP

Kind code of ref document: A1