WO2020093501A1 - File storage method, deletion method, server and storage medium - Google Patents

File storage method, deletion method, server and storage medium

Info

Publication number
WO2020093501A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
stored
storage
same
deleted
Prior art date
Application number
PCT/CN2018/119594
Other languages
English (en)
French (fr)
Inventor
赖志阳
Original Assignee
网宿科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网宿科技股份有限公司
Priority to US16/958,670 (US20200349113A1)
Priority to EP18939218.6A (EP3876106A4)
Publication of WO2020093501A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments

Definitions

  • The invention relates to the field of storage technology, and in particular to file storage technology.
  • The purpose of the embodiments of the present invention is to provide a file storage method, a deletion method, a server, and a storage medium that solve the problem of storage space being wasted when the same file is stored repeatedly, so that the same file is stored only once during file storage and storage space is optimized.
  • Embodiments of the present invention provide a file storage method, including the following steps: receiving a file to be stored; detecting whether a stored file identical to the file to be stored exists among the stored files; and, when such an identical stored file exists, generating a path pointing to the storage address of the identical stored file and saving the generated path as the file to be stored.
  • Embodiments of the present invention also provide a file deletion method, including the following steps: receiving a file deletion instruction; and, if the file to be deleted is a file stored in a path manner, deleting the stored path.
  • An embodiment of the present invention further provides a server, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above file storage method or the above file deletion method.
  • Embodiments of the present invention also provide a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the above file storage method or the above file deletion method is executed.
  • After receiving the file to be stored, the embodiment of the present invention first detects whether a stored file identical to the file to be stored exists among the stored files.
  • When an identical stored file exists, a path pointing to the storage address of the identical stored file is generated and saved as the file to be stored; that is, for a file that has already been stored, only the path pointing to the storage address of the identical stored file is saved.
  • The user can still access the file to be stored through the saved path, which greatly reduces the occupation of storage space and improves its utilization. The change from storing a file to storing a path requires no additional operation by the user; it is simple to implement, highly practical, and does not incur excessive cost.
  • Detecting whether a stored file identical to the file to be stored exists among the stored files specifically includes: calculating the message digest of the file to be stored; and detecting whether a stored file with the same message digest exists among the stored files.
  • If no stored file has the same message digest, it is determined that no identical stored file exists. If a stored file has the same message digest, the content of the file to be stored is compared with the content of that stored file.
  • If the contents are the same, it is determined that an identical stored file exists; if they differ, it is determined that no identical stored file exists.
  • The message digest is used to detect whether an identical stored file may exist. Because a fixed message digest can be calculated for every file, this check finds most identical stored files. In addition, to handle message digest collisions, that is, cases where multiple different files have the same message digest, the file contents are then compared to determine whether an identical stored file really exists, which effectively improves the accuracy of the detection.
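The two-stage check described above (digest lookup first, content comparison second) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `md5_of`, `find_identical`, and the in-memory `digest_index` dictionary are hypothetical, and MD5 is used only because the description later names it as an example.

```python
import filecmp
import hashlib
from pathlib import Path
from typing import Optional


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of the whole file."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def find_identical(candidate: Path, digest_index: dict) -> Optional[Path]:
    """Return a stored file with the same bytes as `candidate`, or None.

    `digest_index` is a hypothetical mapping: digest -> list of stored file paths.
    """
    for stored in digest_index.get(md5_of(candidate), []):
        # A matching digest only means "possibly identical";
        # compare the actual bytes to rule out a digest collision.
        if filecmp.cmp(candidate, stored, shallow=False):
            return stored
    return None
```

A `None` result corresponds to the "store normally" branch; any other result is the stored file a path should point to.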
  • Calculating the message digest of the file to be stored specifically includes: when the size of the file to be stored is less than a preset threshold, directly calculating the message digest of the file; when the size is greater than or equal to the preset threshold, dividing the file according to a preset size and calculating the message digest from the divided data.
  • This provides a method for calculating the message digest of the file to be stored. Calculating a message digest means calculating a characteristic string that represents the file itself. When the size of the file is greater than or equal to the preset threshold, calculating the message digest from the divided data both ensures the accuracy of the digest calculation and reduces the computational load on the server.
  • Comparing the content of the file to be stored with the content of the stored file with the same message digest includes: checking whether the file to be stored and the stored file with the same message digest have the same length; if the lengths differ, determining that their contents differ; if the lengths are the same, dividing both files using a binary search method and comparing the content of each divided part in turn, until a comparison shows a difference or all content has been compared.
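The comparison above can be sketched as a length check followed by a block-by-block comparison that stops at the first difference. Note the simplification: this sketch reads blocks sequentially rather than in the divided order the text describes, which changes only how early a difference is found, not the result. The function name and the 256 KiB default block size are assumptions for illustration.

```python
import os

BLOCK_SIZE = 256 * 1024  # illustrative block size


def same_content(path_a: str, path_b: str, block_size: int = BLOCK_SIZE) -> bool:
    """Return True only if both files have the same length and the same bytes."""
    # Different lengths mean different content; no bytes need to be read.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if block_a != block_b:
                return False  # stop at the first differing block
            if not block_a:
                return True   # both files exhausted with no difference
```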
  • Generating a path pointing to the storage address of the identical stored file includes: generating a soft link or shortcut for the file to be stored, and linking the generated soft link or shortcut to the storage address of the identical stored file. Saving the generated path as the file to be stored specifically means saving the soft link or shortcut of the file to be stored.
  • The soft link or shortcut is an ordinary file that stores the path it points to; its size is much smaller than that of the file to be stored, and it does not affect the content or attributes of the identical stored file it points to. When the user accesses the path, the user is taken to the stored file that is identical to the file to be stored. The soft link or shortcut therefore does not affect the user's access to the stored file, while effectively reducing the occupation of storage space and improving its utilization. Creating a soft link or shortcut is also simple and does not incur excessive cost.
  • When no stored file is identical to the file to be stored, the method further includes: storing the file to be stored and generating a location file for it, where the location file includes the message digest of the file to be stored and a path pointing to its storage address.
  • After the generated path is saved as the file to be stored, the method further includes: generating a location file for the file to be stored, where the location file includes the message digest of the file to be stored and the file name of the identical stored file.
  • In this way, the file to be stored can also be saved normally. Once the file has been saved, the generated location file, which contains the message digest together with either the path pointing to the storage address or the file name of the identical stored file, makes it easy to look up information about the file quickly and helps with later operations such as deleting the stored file.
  • The file deletion method is applied to a server that stores a location file for each stored file; the location file stores the message digest of the stored file and either the path pointing to its storage address or the file name of the identical stored file. The file deletion method further includes: after receiving the file deletion instruction, reading the location file of the file to be deleted; determining whether the location file stores the file name of an identical stored file; if it does, determining that the file to be deleted is a file stored in a path manner; and deleting the location file of the file to be deleted. This makes it convenient to delete the message digest and link of the file to be deleted, and reduces unnecessary occupation of server storage space.
  • The server also stores a message digest list, which stores the file names of the stored files corresponding to each message digest. The server also stores a link list in one-to-one correspondence with each message digest in the message digest list. A link list stores the links of at least one stored file; each link includes the source file to which the stored file is linked, and a path pointing to the storage address of the source file or the storage address itself.
  • The file deletion method further includes: after deleting the location file of the file to be deleted, deleting from the message digest list, according to the message digest of the file to be deleted, the file name of the file to be deleted that corresponds to that message digest; and obtaining the link list corresponding to the message digest of the file to be deleted and deleting the link of the file to be deleted from that link list.
  • In this way, both a way of storing the message digests and links of stored files and a way of deleting them are provided.
  • After deleting the link of the file to be deleted, the method further includes: determining whether the link list corresponding to the message digest still stores a link to the same source file as the file to be deleted; if no such link remains, releasing the storage space occupied by the source file to which the file to be deleted is linked. In this way, unnecessary occupation of server storage space is reduced.
  • After releasing the storage space occupied by the source file to which the file to be deleted is linked, the method further includes: determining whether the message digest list still stores the file name of any file with the same message digest as the file to be deleted; if not, deleting the message digest from the message digest list. In this way, unnecessary occupation of server storage space is reduced.
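The deletion bookkeeping described above behaves like reference counting on the source file. The sketch below models it in memory under stated assumptions: the class `DedupIndex`, its field names, and the `released` list (which stands in for actually freeing storage) are all hypothetical, and removing the emptied link list together with the digest is an added assumption rather than something the text states.

```python
from dataclasses import dataclass, field


@dataclass
class DedupIndex:
    # Message digest list: digest -> set of file names with that digest.
    names_by_digest: dict = field(default_factory=dict)
    # Link lists: digest -> list of (source_file, file_name) links.
    links_by_digest: dict = field(default_factory=dict)
    # Location files: file_name -> (digest, source_file).
    location: dict = field(default_factory=dict)
    # Source files whose storage was released (stand-in for real deletion).
    released: list = field(default_factory=list)

    def add(self, name: str, digest: str, source: str) -> None:
        self.names_by_digest.setdefault(digest, set()).add(name)
        self.links_by_digest.setdefault(digest, []).append((source, name))
        self.location[name] = (digest, source)

    def delete(self, name: str) -> None:
        digest, source = self.location.pop(name)     # read, then delete, the location file
        self.names_by_digest[digest].discard(name)   # drop the name from the digest list
        links = self.links_by_digest[digest]
        links.remove((source, name))                 # drop this file's link
        if any(src == source for src, _ in links):
            return                                   # another link still uses the source
        self.released.append(source)                 # no link left: release the source file
        if not self.names_by_digest[digest]:
            del self.names_by_digest[digest]         # no file has this digest any more
            del self.links_by_digest[digest]         # assumption: drop the empty link list too
```

With a source file `B` and a path-stored file `xA` linked to it, deleting `B` first leaves the data in place because `xA` still links to it; the storage is released only when `xA` is deleted as well.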
  • FIG. 1 is a flowchart of a file storage method according to a first embodiment of the present invention.
  • FIG. 2 is a flowchart of a file storage method according to a second embodiment of the present invention.
  • FIG. 3 is a flowchart of a file deletion method according to a third embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a server according to a fourth embodiment of the present invention.
  • The first embodiment of the present invention relates to a file storage method; the specific process is shown in FIG. 1.
  • In this embodiment, the same file is stored only once during file storage, to optimize storage space.
  • The process of FIG. 1 is described in detail below.
  • Step 101 Receive a file to be stored.
  • Specifically, the user uploads the file to be stored; the server receives it and temporarily stores it in a space in the server dedicated to temporary files. The temporarily stored file can be accessed normally by the user. In this way, the user's normal access to the file is not affected, and the file is not saved directly to the storage space, which reduces the occupation of storage space.
  • Step 102 Detect whether a stored file identical to the file to be stored exists among the stored files; if yes, execute step 103; if not, execute step 105.
  • Specifically, if an identical stored file exists, the file to be stored is not stored again, so as to save storage space; if no identical stored file exists, the file to be stored is saved normally.
  • Step 103 Generate a path pointing to the storage address of the identical stored file.
  • Specifically, a path pointing to the storage address of the identical stored file is generated, which gives the user a way to link to the identical stored file and does not affect the user's later access to the file to be stored. Generating the path requires no additional operation by the user; it is simple to implement, practical, and does not incur excessive cost.
  • In one example, generating a path pointing to the storage address of the identical stored file specifically includes generating a soft link or shortcut for the file to be stored: for example, a shortcut under Windows, or a soft link under Linux.
  • The soft link or shortcut is an ordinary file that stores the path it points to. Its size is generally 6 B, which is much smaller than the size of the file to be stored, and it does not affect the content or attributes of the identical stored file it points to.
  • Step 104 Save the generated path as the file to be stored.
  • Specifically, saving the generated path as the file to be stored means saving the soft link or shortcut of the file to be stored.
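Steps 103 and 104 can be illustrated on a Linux-style file system with a soft link. This is a sketch under assumptions: the function name and its three path parameters are hypothetical, and on Windows `os.symlink` may require extra privileges, so a shortcut-based variant would look different.

```python
import os


def save_as_link(temp_path: str, stored_path: str, final_path: str) -> None:
    """Replace an uploaded duplicate with a soft link to the stored copy."""
    # The link stores only the path of the identical stored file.
    os.symlink(os.path.abspath(stored_path), final_path)
    # The temporary upload is no longer needed once the link exists.
    os.remove(temp_path)
```

Reading `final_path` afterwards returns the content of the stored file, while the link itself occupies far less space than the duplicate would have.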
  • Step 105 Store the file to be stored.
  • Specifically, when step 102 determines that no stored file is identical to the file to be stored, the file to be stored is saved normally to the storage space.
  • Step 106 Generate a location file for the file to be stored.
  • Specifically, when the file to be stored is a file stored in a path manner, the generated location file includes the message digest of the file to be stored and the generated path; when it is not stored in a path manner, the generated location file includes the message digest of the file to be stored and its storage address. In this way, the user can quickly look up information about the file by opening the location file, which helps with later operations such as deleting the stored file.
  • This embodiment does not save files that have already been stored again; it saves only the path pointing to the storage address of the identical stored file, which effectively optimizes storage space.
  • In practice, because user behavior cannot be controlled, the same file is easily stored repeatedly.
  • For example, user A uploads a material 1 with a size of 5M, and user B, user C, and others also upload the same material 1, so that N users have uploaded material 1 and N copies of material 1, each 5M in size, are stored in the storage space. Of these, N-1 copies are redundant, which wastes (N-1) * 5M of storage space.
  • With this embodiment, when users such as user B and user C upload material 1, only the path pointing to the storage address of the already stored material 1 is saved, which greatly reduces the occupation of storage space, improves its utilization, and optimizes storage space. The change from storing a file to storing a path requires no additional operation by the user; it is simple to implement, highly practical, and does not incur excessive cost.
  • In addition, a location file for the file to be stored is generated, which makes it easy to look up information about the file quickly and helps with later operations such as deleting the file.
  • The second embodiment of the present invention relates to a file storage method; the specific process is shown in FIG. 2.
  • The second embodiment is almost the same as the first; the main difference is that the second embodiment further refines how to detect whether a stored file identical to the file to be stored exists.
  • Step 201 Receive a file to be stored. This step is the same as step 101 and will not be repeated here.
  • Step 202 Calculate the message digest of the file to be stored.
  • Each file has a fixed message digest. In essence, a message digest is a characteristic string of a few bytes, which can be calculated from a file of many bytes by a certain computation.
  • This embodiment provides a method for calculating the message digest of the file to be stored: when the size of the file to be stored is less than the preset threshold, it means that the file to be stored has fewer bytes, and the message digest of the file to be stored can be directly calculated; When the size of the file to be stored is greater than or equal to the preset threshold, the file to be stored is divided into several groups of data, and the message digests of the divided groups of data are separately calculated. In this way, it not only ensures the accuracy of the message digest calculation, but also reduces the calculation pressure when the server performs the above operations.
  • Step 203 Detect whether a stored file with the same message digest as the file to be stored exists among the stored files; if so, execute step 204; if not, determine that no stored file is identical to the file to be stored and execute step 207.
  • This step provides a specific way to detect an identical stored file. The message digest is checked to determine whether an identical stored file may exist: because a fixed message digest can be calculated for every file, detecting whether a stored file has the same message digest as the file to be stored finds most identical stored files.
  • Step 204 Compare whether the content of the file to be stored is the same as the content of the stored file with the same message digest; if so, determine that an identical stored file exists and execute step 205; if not, determine that the content of the file to be stored differs from that of the stored file with the same message digest, and execute step 207.
  • Message digest collisions must also be considered, that is, cases where multiple files have the same message digest. A message digest algorithm calculates a characteristic string of a few bytes from a file of many bytes. For files longer than a certain number of bytes, there are fewer possible characteristic strings than possible files, so there must be two or more different files with the same characteristic string. Comparing the content of the file to be stored with that of the file with the same message digest therefore effectively improves the accuracy of detecting whether an identical stored file exists.
  • The above comparison is executed in the background and does not block the main thread, so it does not affect normal use of the server and makes effective use of server resources.
  • Step 205 Generate a path pointing to the storage address of the identical stored file. This step is the same as step 103 and will not be repeated here.
  • Step 206 Save the generated path as the file to be stored. This step is the same as step 104 and will not be repeated here.
  • Step 207 Store the file to be stored. This step is the same as step 105 and will not be repeated here.
  • Step 208 Generate a location file for the file to be stored. This step is the same as step 106 and will not be repeated here.
  • If a stored file with the same content as the file to be stored exists, an identical stored file exists among the stored files and the file to be stored does not need to be stored again; if no stored file has the same content, no identical stored file exists and the file to be stored is saved normally. In this way, whether an identical stored file exists is determined accurately, so that the storage method for the file to be stored can be chosen.
  • A message digest list is stored in the server; it stores the message digest and the file name of each stored file, with each message digest corresponding to the file names of the stored files that have it.
  • Each message digest has a link list in one-to-one correspondence with it.
  • The link list stores the links of at least one stored file.
  • Each link includes the source file to which the stored file is linked, and a path pointing to the storage address of the source file or the storage address itself.
  • The server receives the file A to be stored and renames it xA to ensure that the file name is unique; a combination of a timestamp and a random value may also be used as the file name. The file xA to be stored is temporarily stored in a space in the server dedicated to temporary files.
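The timestamp-plus-random-value naming mentioned above could look like the following. The function name and the exact format (nanosecond timestamp, underscore, 16 hexadecimal characters) are illustrative assumptions; the text only says that the two values are combined.

```python
import secrets
import time


def unique_name() -> str:
    """Build a file name from the current timestamp and a random value."""
    return f"{time.time_ns()}_{secrets.token_hex(8)}"
```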
  • Calculate the message digest of the file xA to be stored, for example using the MD5 (Message Digest Algorithm 5, version 5 of the message digest algorithm) algorithm to calculate the MD5 value of xA (the message digest of the file xA to be stored is referred to below as AMD5). If the size of the file xA is less than the preset threshold of 5M, AMD5 is calculated directly.
  • If the size of the file xA is greater than or equal to 5M, the file xA is divided into n equal parts of data, each 256K in size. Take three parts: the 1st, the (n/2)th, and the nth. Then, using the 2nd part and the ((n/2)-1)th part as the start and end, take three more parts; then, using the ((n/2)+1)th part and the (n-1)th part as the start and end, take three more parts; and so on, until 20 parts have been taken. Merge the 20 parts in order and calculate the message digest of the merged data as AMD5.
  • For example, the file xA to be stored is divided into 1000 equal parts of data, each 256K in size. Take the 1st, 500th, and 1000th parts, three in total; then take the 2nd, 250th, and 499th parts; then the 501st, 750th, and 999th parts; and so on, until 20 parts have been taken. The 20 parts are merged in order, and a message digest is calculated over the merged bytes as the MD5 value of the file xA to be stored.
  • Sampling parts spread across the file in this way gives better results than sampling consecutive parts.
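The size-dependent digest calculation can be sketched as below. This is one reading of the description, not a definitive implementation: `sample_parts` and `file_digest` are hypothetical names, the breadth-first "start, middle, end" ordering is inferred from the 1000-part example, and "merged in order" is taken to mean ascending part number. The last part may be shorter than the others when the file size is not a multiple of the part size.

```python
import hashlib
import os
from collections import deque

CHUNK_SIZE = 256 * 1024        # size of each part (256K in the example)
THRESHOLD = 5 * 1024 * 1024    # smaller files are hashed whole (5M in the example)
MAX_PARTS = 20                 # number of sampled parts


def sample_parts(n: int, limit: int = MAX_PARTS) -> list:
    """Return up to `limit` 1-based part numbers: start, middle, end of each range."""
    picked, seen = [], set()
    ranges = deque([(1, n)])
    while ranges and len(picked) < limit:
        lo, hi = ranges.popleft()
        if lo > hi:
            continue
        mid = (lo + hi) // 2
        for part in (lo, mid, hi):
            if part not in seen and len(picked) < limit:
                seen.add(part)
                picked.append(part)
        ranges.append((lo + 1, mid - 1))   # remainder left of the middle
        ranges.append((mid + 1, hi - 1))   # remainder right of the middle
    return picked


def file_digest(path: str, chunk_size: int = CHUNK_SIZE, threshold: int = THRESHOLD) -> str:
    size = os.path.getsize(path)
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        if size < threshold:
            md5.update(f.read())                    # small file: digest of the whole content
        else:
            n = -(-size // chunk_size)              # number of parts, rounded up
            for part in sorted(sample_parts(n)):    # merge sampled parts in file order
                f.seek((part - 1) * chunk_size)
                md5.update(f.read(chunk_size))
    return md5.hexdigest()
```

For `n = 1000`, the first nine sampled parts are 1, 500, 1000, 2, 250, 499, 501, 750, 999, matching the example in the text. Because only sampled parts are hashed, two large files that differ in an unsampled part get the same digest, which is why the later content comparison is still required.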
  • If AMD5 does not exist in the message digest list, no stored file identical to the file xA to be stored exists among the stored files.
  • If AMD5 exists in the message digest list, the file name xA is added to the file names corresponding to AMD5 (see Table 3 and Table 5). From the file names corresponding to AMD5, it is known that file B has the same message digest AMD5 (see Table 3). The stored file B is obtained from the server, and the contents of the file xA to be stored and file B are compared. First, check whether the file xA to be stored and file B have the same length.
  • If the lengths differ, it is determined that the contents of the file xA to be stored and file B are inconsistent. If the lengths are the same, the file xA to be stored and file B are each divided into n equal parts of data, each 256K in size. The 1st, (n/2)th, and nth parts of the file xA and file B are compared in turn; then, using the 2nd and ((n/2)-1)th parts as the start and end, three more parts are taken and compared in turn; then, using the ((n/2)+1)th and (n-1)th parts as the start and end, three more parts are taken and compared in turn; and so on. If all content is consistent, it is determined that the contents of the file xA to be stored and file B are consistent. If at any point during the comparison the compared content is inconsistent, it is determined that the contents of the file xA to be stored and file B are inconsistent.
  • If the contents of the file xA to be stored and file B are consistent, the file xA temporarily stored on the server is deleted, a path pointing to the storage address of file B is generated and named xA, and the generated path xA is saved to the server as the file xA to be stored. The link of the file xA to be stored is added to the link list corresponding to AMD5 in the form "B_xA" (as shown in Table 4), where "B_xA" indicates that the source file to which the file xA is linked is file B and the path pointing to the storage address of file B is xA. At the same time, the location file xA.links of the file xA to be stored is generated in the storage directory of the file xA, and AMD5 and the file name of the stored file identical to xA are written into xA.links in the form "AMD5_B".
  • If no stored file is identical to the file xA to be stored, the file xA is stored in the server. In the link list corresponding to AMD5, the storage address of the file xA is added in the form "xA_xA" (see Table 6), where "xA_xA" indicates that the source file to which the file xA is linked is the file xA itself and the storage address of the file xA is xA. At the same time, the location file xA.links of the file xA to be stored is generated in the storage directory of the file xA, and AMD5 and the path pointing to the storage address of the file xA are written into xA.links in the form "AMD5_xA".
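The `.links` location file from the example can be sketched as a one-line text file in the form `<digest>_<target>`. The helper names are hypothetical, and so is the rule used to tell the two cases apart: following the "AMD5_B" versus "AMD5_xA" examples, a target that differs from the file's own name is treated as the name of an identical stored file.

```python
from pathlib import Path


def write_location_file(directory: Path, name: str, digest: str, target: str) -> Path:
    """Write `<name>.links` containing "<digest>_<target>"."""
    location = directory / f"{name}.links"
    location.write_text(f"{digest}_{target}")
    return location


def read_location_file(directory: Path, name: str) -> tuple:
    """Return (digest, target, stored_as_path) for the file `name`."""
    # A hex digest contains no underscore, so the first "_" separates the two fields.
    digest, _, target = (directory / f"{name}.links").read_text().partition("_")
    # Assumption: a target other than the file's own name means the file is stored as a path.
    return digest, target, target != name
```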
  • This embodiment calculates the message digest of the file to be stored in different ways depending on its size, which ensures the accuracy of the digest calculation and reduces the computational load on the server. It also compares the length and content of the file to be stored with those of the stored file with the same message digest to determine whether an identical stored file exists, which improves the accuracy of the detection without affecting normal use of the server.
  • The third embodiment of the present invention relates to a file deletion method; the specific process is shown in FIG. 3.
  • The server stores a location file for each stored file; the location file stores the message digest of the stored file and either the path pointing to its storage address or the file name of the identical stored file.
  • The server also stores a message digest list, which stores the message digests and the file names of the stored files corresponding to each message digest.
  • The server also stores a link list in one-to-one correspondence with each message digest in the message digest list.
  • The link list stores the links of at least one stored file.
  • Each link includes the source file to which the stored file is linked, and a path pointing to the storage address of the source file or the storage address itself.
  • This embodiment provides a method for deleting files stored in the form of a path. It also provides a way of storing the message digests and links of stored files, which makes it easy to find and delete the links of stored files quickly. During deletion, it is also determined whether the message digest should be deleted, which effectively reduces the server space occupied by useless data.
  • Step 301 Receive a file deletion instruction.
  • Specifically, an instruction issued by a user to delete a file is received, where the files the user can delete are files uploaded and stored by that user.
  • Step 302 Read the location file of the file to be deleted.
  • Specifically, the .links location file of the file to be deleted is read from the storage directory of the file to be deleted. The message digest of the file to be deleted, and the path pointing to the storage address or the file name of the identical stored file, are obtained from the content of the location file, which makes it convenient to delete the message digest and link of the file to be deleted. If the location file stores the file name of an identical stored file, the file to be deleted is determined to be a file stored in a path manner; if it stores the path pointing to the storage address, the file to be deleted is determined not to be a file stored in a path manner. The location file of the file to be deleted is then deleted to reduce unnecessary occupation of server storage space.
  • Step 303 Delete the file name of the file to be deleted in the message summary list.
  • Specifically, because each message digest in the message digest list corresponds to file names, the file name of the file to be deleted is removed from the message digest list by looking up the message digest obtained from the locator file.
  • Step 304 Delete the link of the file to be deleted in the link list.
  • Specifically, the link list corresponding to the message digest obtained from the locator file is retrieved, and the link of the file to be deleted is removed from that list.
  • Step 305 Determine whether the file to be deleted is a file stored as a path. If so, go to step 306; if not, go to step 307.
  • Specifically, the locator file read in step 302 holds either the path pointing to the storage address or the file name of the identical stored file. If it holds the file name of the identical stored file, the file to be deleted is a file stored as a path, and step 306 is performed; if it holds the path pointing to the storage address, the file to be deleted is not a file stored as a path, and step 307 is performed.
  • Step 306 delete the stored path.
  • Specifically, if the file to be deleted is a file stored as a path, the stored path is deleted, which deletes the file to be deleted.
  • Step 307 Determine whether the link list still holds a link to the same source file as the file to be deleted. If so, end; if not, go to step 308.
  • Specifically, the link list corresponding to the message digest of the file to be deleted is checked for another link to the same source file. If one exists, several files from different sources are linked to that source file, so the source file is useful data and must be kept. If none exists, the link that was written to the link list when the source file itself was stored has also been deleted, so the source file is useless data and must be deleted.
  • Step 308 Release the storage space occupied by the source file that the file to be deleted is linked to.
  • Step 309 Determine whether the message digest list still holds a file name with the same message digest as the file to be deleted. If so, end; if not, go to step 310.
  • Specifically, the message digest list holds the message digest and the file name of each stored file, and each message digest corresponds to file names. If the list still holds the file name of another file with the same message digest as the file to be deleted, several different files share this message digest, so the digest is useful data and must be kept. If no other file name with the same message digest remains, the digest is useless data and must be deleted.
  • Step 310 Delete the message digest in the message digest list.
  • A specific example follows. After a deletion instruction for file xA is received, the locator file xA.links is read from the storage directory of xA. Its content is "AMD5_B", which indicates that the message digest of xA is AMD5 and that the identical stored file of xA is file B; that is, xA is a file stored as a path.
  • According to AMD5, the file name xA is deleted from the entries for AMD5 in the message digest list (as shown in Table 7).
  • According to AMD5, the link list corresponding to AMD5 is obtained, and the link "B_xA" is deleted from it (as shown in Table 8).
  • The stored path xA is then deleted from the server, which deletes the file xA.
  • At this point the link list corresponding to AMD5 still holds the items B_B and B_xB, so the identical stored file B is kept.
  • Another example follows. After a deletion instruction for file xC is received, the locator file xC.links is read from the storage directory of xC. Its content is "AMD5_B", which indicates that the message digest of xC is AMD5 and that the identical stored file of xC is file B; that is, xC is a file stored as a path. According to AMD5, the file name xC is deleted from the entries for AMD5 in the message digest list (as shown in Table 9).
  • According to AMD5, the link list corresponding to AMD5 is obtained, and the link "B_xC" is deleted from it (as shown in Table 10), which also shows that the source file xC is linked to is file B. The stored path xC is then deleted from the server, which deletes the file xC. At this point the link list corresponding to AMD5 holds no other link to source file B, so the storage space occupied by B is released on the server. In the message digest list, the only file name remaining for AMD5 is B, and file B has been deleted, so the message digest AMD5 is deleted from the message digest list (as shown in Table 11).
  • Another example follows. After a deletion instruction for file D is received, the locator file D.links is read from the storage directory of D. Its content is "DMD5_D", which indicates that the message digest of D is DMD5 and that the path pointing to the storage address of D is D; that is, D is not a file stored as a path.
  • According to DMD5, the file name D is deleted from the entries for DMD5 in the message digest list (as shown in Table 12).
  • According to DMD5, the link list corresponding to DMD5 is obtained, and the link "D_D" is deleted from it (as shown in Table 13), which also shows that the source file D is linked to is D itself.
  • The link list corresponding to DMD5 now holds no other link to D, so the storage space occupied by D is released on the server, which deletes the file D.
  • In the message digest list, the file names D1 and D2 still correspond to DMD5, so the message digest DMD5 is kept.
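The way the examples above read a locator file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the content is taken to be `<digest>_<name>` split at the first underscore, and a file is treated as stored as a path when the recorded name differs from its own name (as in "AMD5_B" for xA versus "DMD5_D" for D). The names `Locator` and `parse_locator` are hypothetical and do not come from the patent.

```python
from typing import NamedTuple


class Locator(NamedTuple):
    digest: str           # message digest recorded in the .links file
    target: str           # own name, or name of the identical stored file
    stored_as_path: bool  # True when the file is only a path to `target`


def parse_locator(file_name: str, content: str) -> Locator:
    """Split '<digest>_<name>' at the first underscore (assumed format)."""
    digest, target = content.split("_", 1)
    # Assumption: a locator that names another file marks a path-stored file.
    return Locator(digest, target, target != file_name)
```

With this reading, `parse_locator("xA", "AMD5_B")` marks xA as a path to B, while `parse_locator("D", "DMD5_D")` marks D as a regular stored file. File names that themselves contain an underscore would need a different separator.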
  • Compared with the prior art, in this embodiment the server stores a message digest list and one link list for each message digest in that list, which provides a way of storing the message digest and link of every stored file. The locator file, message digest and link of a stored file are tied together, so the links of a stored file can be found and deleted quickly.
  • A concrete implementation of file deletion is provided, which makes this embodiment more practicable. During deletion, the method also checks whether the message digest and related data need to be deleted, which reduces the server space occupied by useless data.
  • The fourth embodiment of the present invention relates to a server which, as shown in FIG. 4, includes at least one processor 402 and a memory 401 communicatively connected to the at least one processor 402. The memory 401 stores instructions executable by the at least one processor 402; the instructions are executed by the at least one processor 402 so that the at least one processor 402 can perform the above file storage method or the above file deletion method.
  • the memory 401 and the processor 402 are connected by a bus.
  • the bus may include any number of interconnected buses and bridges.
  • the bus connects one or more processors 402 and various circuits of the memory 401 together.
  • the bus can also connect various other circuits such as peripheral devices, voltage regulators, and power management circuits, etc., which are well known in the art, and therefore, they will not be described further herein.
  • the bus interface provides an interface between the bus and the transceiver.
  • the transceiver can be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other devices on the transmission medium.
  • the data processed by the processor 402 is transmitted on the wireless medium through the antenna. Further, the antenna also receives the data and transmits the data to the processor 402.
  • the processor 402 is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions.
  • the memory 401 can be used to store data used by the processor 402 when performing operations.
  • It is worth noting that the modules involved in this embodiment are all logical modules. In practice, a logical unit may be a physical unit, part of a physical unit, or a combination of several physical units.
  • In addition, to highlight the innovative part of the present invention, this embodiment does not introduce units that are not closely related to solving the technical problem addressed by the invention; this does not mean that no other units exist in this embodiment.
  • The sixth embodiment of the present invention relates to a computer-readable storage medium that stores a computer program. When the computer program is executed by the processor 402, the foregoing file storage method embodiment or the foregoing file deletion method embodiment is implemented.
  • The aforementioned storage media include: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

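Embodiments of the present invention relate to the field of storage technology and disclose a file storage method, a deletion method, a server and a storage medium. A file storage method includes: receiving a file to be stored; detecting whether, among the stored files, there is a stored file identical to the file to be stored; and, when such an identical stored file exists, generating a path pointing to the storage address of the identical stored file and saving the generated path as the file to be stored. With the embodiments of the present invention, each distinct file is stored only once during file storage, which optimizes the use of storage space.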

Description

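File storage method, deletion method, server and storage medium
Technical Field
The present invention relates to the field of storage technology, and in particular to a file storage technique.
Background
With the rapid development of the Internet, many new kinds of storage devices have emerged, such as file resource servers. Typically several users share one storage space, and each user has an area of their own for storing files; as time passes, the consumption of storage space keeps growing.
However, the inventor found the following problem in the prior art: because user behavior cannot be controlled, when several users each store the same file, identical files are stored repeatedly, which occupies storage space unnecessarily and wastes limited storage resources. Solving the problem by adding more file resource servers would incur great cost.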
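Summary of the Invention
The purpose of the embodiments of the present invention is to provide a file storage method, a deletion method, a server and a storage medium that solve the problem of storage space being occupied when identical files are stored repeatedly, so that each distinct file is stored only once and the use of storage space is optimized.
To solve the above technical problem, an embodiment of the present invention provides a file storage method comprising the following steps: receiving a file to be stored; detecting whether, among the stored files, there is a stored file identical to the file to be stored; and, when such an identical stored file exists, generating a path pointing to the storage address of the identical stored file and saving the generated path as the file to be stored.
An embodiment of the present invention further provides a file deletion method comprising the following steps: receiving a deletion instruction for a file; and, if the file to be deleted is a file stored as a path, deleting the stored path.
An embodiment of the present invention further provides a server comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above file storage method or the above file deletion method.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above file storage method or the above file deletion method.
Compared with the prior art, embodiments of the present invention receive a file to be stored and first detect whether, among the stored files, there is a stored file identical to it. When an identical stored file exists, a path pointing to the storage address of the identical stored file is generated and saved as the file to be stored. In other words, for a file that has already been stored, only a path pointing to the storage address of the identical stored file is saved, and the user can still reach the file through the saved path. This greatly reduces the storage space occupied and improves its utilization. The change from storing files to storing paths requires no extra action from the user, is simple to implement, is highly practical, and does not incur excessive cost.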
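In addition, detecting whether, among the stored files, there is a stored file identical to the file to be stored specifically comprises: computing the message digest of the file to be stored; detecting whether any stored file has the same message digest as the file to be stored; if no stored file has the same message digest, determining that no identical stored file exists; if a stored file has the same message digest, comparing the content of the file to be stored with the content of that stored file, and determining that an identical stored file exists if the contents are the same, or that none exists if they differ. This provides a concrete way of detecting identical stored files. The message digest is checked first to see whether an identical stored file may exist; since a fixed message digest can be computed for every file, this check detects the great majority of identical stored files. Because message digests can collide, that is, several files can have the same message digest, the file contents are then compared to decide whether a file identical to the file to be stored really exists, which improves the accuracy of the detection.
In addition, computing the message digest of the file to be stored specifically comprises: when the size of the file to be stored is less than a preset threshold, computing the message digest of the file directly; when the size is greater than or equal to the preset threshold, dividing the file by a preset size and computing the message digest from the divided data. This provides a method for computing the message digest of the file to be stored. Computing a message digest means computing a characteristic string that can represent the file itself; when the file is at or above the preset threshold, computing the digest from the divided data keeps the digest computation accurate while reducing the computational load on the server.
In addition, comparing the content of the file to be stored with the content of the stored file that has the same message digest specifically comprises: comparing whether the two files have the same length; if the lengths differ, determining that the contents differ; if the lengths are the same, dividing both files using a bisection (binary search) scheme and comparing the content of each divided part in turn, until a comparison finds a difference or all content has been compared. Comparing lengths first reduces the server's workload, because files of different lengths necessarily have different contents. When the lengths are the same, the contents are compared part by part using the bisection scheme, which improves the efficiency and accuracy of the comparison. Together these measures avoid misjudging files as identical when message digests collide and improve the accuracy of detecting identical stored files.
In addition, generating a path pointing to the storage address of the identical stored file specifically comprises: generating a soft link or shortcut for the file to be stored, and linking the generated soft link or shortcut to the storage address of the identical stored file. Saving the generated path as the file to be stored specifically means saving the soft link or shortcut of the file to be stored. A soft link or shortcut is an ordinary file that stores the path it points to; it is far smaller than the file to be stored and does not affect the content or attributes of the identical stored file it points to. When the user accesses the path, the access is redirected to the file identical to the file to be stored, so the soft link or shortcut does not affect the user's access to the file, while it reduces the storage space occupied, improves its utilization, and optimizes the use of storage space. Generating a soft link or shortcut is also simple and does not incur excessive cost.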
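In addition, when no stored file identical to the file to be stored exists, the file to be stored is stored and a locator file is generated for it; the locator file includes the message digest of the file to be stored and the path pointing to the storage address. After the generated path has been saved as the file to be stored, the method further comprises generating a locator file for the file to be stored, which includes the message digest of the file to be stored and the file name of the identical stored file. In this way the file to be stored is still saved normally when no identical stored file exists. The locator file generated after saving, which includes the message digest together with the path or storage address, makes the relevant information about the file quick to look up and helps with later operations on the file such as deletion.
In addition, the file deletion method is applied to a server that stores a locator file for each stored file; the locator file holds the message digest of the stored file, together with either the path pointing to the storage address or the file name of the identical stored file. The file deletion method further comprises: after the deletion instruction is received, reading the locator file of the file to be deleted; determining whether the locator file holds the file name of an identical stored file; if it does, determining that the file to be deleted is a file stored as a path; and deleting the locator file of the file to be deleted. This makes it easy to delete the message digest and link of the file to be deleted, and avoids occupying server storage unnecessarily.
In addition, the server also stores a message digest list, which holds each message digest and the file names of the stored files that correspond to it. The server also stores one link list for each message digest in the message digest list; a link list holds the links of at least one stored file, and a link comprises the source file that the stored file is linked to, and the path pointing to the storage address of the source file or the storage address itself. The file deletion method further comprises: after the locator file of the file to be deleted has been deleted, deleting from the message digest list, according to the message digest of the file to be deleted, the file name of the file to be deleted that corresponds to that digest; and, according to the same message digest, obtaining the corresponding link list and deleting the link of the file to be deleted from it. This provides both a way of storing the message digests and links of stored files and a way of deleting them.
In addition, after the link of the file to be deleted has been deleted, the method further comprises: determining whether the link list corresponding to the message digest still holds a link to the same source file as the file to be deleted; if it does not, releasing the storage space occupied by the source file that the file to be deleted is linked to. This avoids occupying server storage unnecessarily.
In addition, after the storage space occupied by the source file has been released, the method further comprises: determining whether the message digest list still holds the file name of a file with the same message digest as the file to be deleted; if not, deleting the message digest from the message digest list. This avoids occupying server storage unnecessarily.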
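Brief Description of the Drawings
One or more embodiments are illustrated by the figures in the corresponding drawings; these illustrations do not limit the embodiments.
FIG. 1 is a flowchart of the file storage method according to the first embodiment of the present invention;
FIG. 2 is a flowchart of the file storage method according to the second embodiment of the present invention;
FIG. 3 is a flowchart of the file deletion method according to the third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the server according to the fourth embodiment of the present invention.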
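Detailed Description
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the drawings. Those of ordinary skill in the art will understand that many technical details are given in the embodiments so that the reader can better understand the present application, without limiting its specific form of implementation. The technical solutions claimed in the present application can be implemented even without these technical details and with various changes and modifications based on the following embodiments.
The first embodiment of the present invention relates to a file storage method; the specific flow is shown in FIG. 1. In this embodiment, each distinct file is stored only once during file storage, which optimizes the use of storage space. The flow of FIG. 1 is described below.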
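Step 101: receive the file to be stored.
Specifically, the user uploads the file to be stored; the server receives it and places it temporarily in a space on the server dedicated to temporary files. The temporarily stored file can be accessed by the user as usual. In this way the user's normal access to the file is not affected, and because the file is not saved directly into the storage space, less storage space is occupied.
Step 102: detect whether, among the stored files, there is a stored file identical to the file to be stored; if so, go to step 103; if not, go to step 105.
Specifically, if an identical stored file exists, the file to be stored is not stored again, which saves storage space; if no identical stored file exists, the file to be stored is saved normally.
More specifically, after the file to be stored is received, a message queue may be used to notify the server to run the duplicate-detection task; that is, detection of an identical stored file starts only once a file to be stored has been received. This makes more effective use of the server's resources.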
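Step 103: generate a path pointing to the storage address of the identical stored file.
Specifically, when a stored file identical to the file to be stored exists, a path pointing to the storage address of the identical stored file is generated. This gives the user a way to reach the identical stored file and does not affect the user's later access to the file. Generating the path requires no extra action from the user, is simple to implement, is highly practical, and does not incur excessive cost.
More specifically, generating the path comprises generating a soft link or shortcut for the file to be stored: for example, a shortcut on a Windows system or a soft link on a Linux system. A soft link or shortcut is an ordinary file that stores the path it points to; its size is typically 6B, far smaller than the file to be stored, and it does not affect the content or attributes of the identical stored file it points to.
Step 104: save the generated path as the file to be stored.
Specifically, a file that has already been stored is not saved again; only the path pointing to the storage address of the identical stored file is saved, which greatly reduces the storage space occupied and improves its utilization. The larger the file to be stored, the greater the saving. For example, since the generated path (soft link or shortcut) is typically 6B, when the file to be stored is 5M, saving the path instead of the file reduces the space occupied by a factor of 5*1024*1024/6, about 874,000.
More specifically, saving the generated path as the file to be stored means saving the soft link or shortcut of the file to be stored. When the user accesses the path, the access is redirected to the file identical to the file to be stored, so the user's access is not affected, while the storage space occupied is reduced and its utilization improved.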
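Steps 103 and 104 can be illustrated with Python's standard `os.symlink`. The sketch below assumes a Linux-like system where soft links can be created without special privileges; the helper names `save_as_path` and `demo` are hypothetical, and the file names B and xA are borrowed from the worked example later in the description.

```python
import os
import tempfile


def save_as_path(existing_file: str, new_name: str) -> None:
    """Save `new_name` as a soft link pointing at the already stored file."""
    os.symlink(os.path.abspath(existing_file), new_name)


def demo():
    with tempfile.TemporaryDirectory() as d:
        stored = os.path.join(d, "B")            # the identical stored file
        with open(stored, "wb") as f:
            f.write(b"x" * 4096)
        link = os.path.join(d, "xA")             # the file to be stored
        save_as_path(stored, link)
        with open(link, "rb") as f:              # access goes through the link
            readable = f.read() == b"x" * 4096
        # lstat reports the size of the link itself, not of the target file
        return os.path.islink(link), readable, os.lstat(link).st_size < 4096
```

The size of a real soft link is the length of the target path it stores, so it depends on where the identical file lives; the 6B figure in the text is the patent's own typical value, not something this sketch reproduces.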
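Step 105: store the file to be stored.
Specifically, when step 102 determines that no stored file identical to the file to be stored exists, the file to be stored is saved to the storage space normally.
Step 106: generate the locator file of the file to be stored.
Specifically, when the file to be stored is stored as a path, the generated locator file includes the message digest of the file and the generated path; when the file is not stored as a path, the locator file includes the message digest of the file and its storage address. Opening the locator file therefore gives quick access to the relevant information about the file, which helps with later operations such as deletion.
Compared with the prior art, this embodiment does not save a file that has already been stored; it saves only the path pointing to the storage address of the identical stored file, which optimizes the use of storage space. In the prior art, because user behavior cannot be controlled, identical files are easily stored repeatedly. For example, user A uploads material 1 of size 5M, and users B, C and others upload the same material 1. With N users uploading material 1, the storage space holds N copies of the 5M material, of which N-1 copies are redundant; that is, (N-1)*5M of storage space is wasted. In this embodiment, when N users such as B and C upload material 1, only paths pointing to the storage address of the already stored material 1 are saved. This greatly reduces the storage space occupied and improves its utilization. The change from storing files to storing paths requires no extra action from the user, is simple to implement, is highly practical, and does not incur excessive cost. The locator file generated after the file has been saved makes the relevant information about the file quick to look up and helps with later operations such as deletion.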
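The second embodiment of the present invention relates to a file storage method; the specific flow is shown in FIG. 2. The second embodiment is largely the same as the first. The main difference is that the second embodiment further refines how to detect whether, among the stored files, there is a file identical to the file to be stored. The flow of FIG. 2 is described below.
Step 201: receive the file to be stored. This step is the same as step 101 and is not repeated here.
Step 202: compute the message digest of the file to be stored.
Specifically, every file has a fixed message digest. A message digest is essentially a characteristic string made up of a few bytes, which can be computed from a file made up of many bytes. This embodiment provides a method for computing the message digest of the file to be stored: when the size of the file is less than a preset threshold, the file has few bytes and its message digest can be computed directly; when the size is greater than or equal to the preset threshold, the file is divided into several groups of data and the message digest is computed from the divided groups. This keeps the digest computation accurate while reducing the computational load on the server.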
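Step 203: detect whether, among the stored files, there is a stored file with the same message digest as the file to be stored; if so, go to step 204; if not, determine that no identical stored file exists and go to step 207.
Specifically, this step provides a concrete way of detecting identical stored files. The message digest is checked first to see whether an identical stored file may exist: since a fixed message digest can be computed for every file, checking for a stored file with the same message digest detects the great majority of identical stored files. When no file has the same message digest as the file to be stored, no file has the same content, and it is determined that no identical stored file exists.
Step 204: compare whether the content of the file to be stored is the same as that of the stored file with the same message digest; if so, determine that an identical stored file exists and go to step 205; if not, determine that the contents differ and go to step 207.
Specifically, when a stored file with the same message digest exists, message digest collisions must still be considered, that is, the case where several files have the same message digest. A message digest algorithm computes a characteristic string of a few bytes from a file of many bytes, so for files above a certain size there are fewer possible characteristic strings than possible files, and two or more different files must share the same characteristic string. Comparing the content of the file to be stored with that of the stored file with the same message digest therefore improves the accuracy of detecting identical stored files.
More specifically, the lengths of the two files are compared first: files of different lengths necessarily have different contents, so comparing lengths in advance reduces the server's workload. When the lengths are the same, the contents are compared as follows: both files are divided using a bisection (binary search) scheme, and the content of each divided part is compared in turn. In this way the contents are compared in an orderly, systematic manner, and the size of each part compared stays within a bounded range, which improves the efficiency and accuracy of the comparison. This comparison avoids misjudging files as identical when message digests collide and improves the accuracy of detecting identical stored files.
More specifically, the comparison runs in the background, does not block the main thread, and does not affect normal use of the server, which makes effective use of the server's resources.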
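The length-first comparison of step 204 can be sketched as follows. This is a simplified illustration rather than the patent's exact procedure: it compares fixed-size parts sequentially instead of in the bisecting order described above, and the 256K part size is borrowed from the worked example. The names `same_content` and `demo` are hypothetical.

```python
import os
import tempfile


def same_content(path_a: str, path_b: str, chunk_size: int = 256 * 1024) -> bool:
    """Length check first, then part-by-part comparison with early exit."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False                      # different length: content must differ
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            part_a = fa.read(chunk_size)
            part_b = fb.read(chunk_size)
            if part_a != part_b:
                return False              # stop at the first differing part
            if not part_a:
                return True               # both files exhausted, no difference


def demo():
    with tempfile.TemporaryDirectory() as d:
        def write(name, data):
            path = os.path.join(d, name)
            with open(path, "wb") as f:
                f.write(data)
            return path
        a = write("a", b"x" * 1000)
        b = write("b", b"x" * 1000)
        c = write("c", b"x" * 999 + b"y")   # same length, last byte differs
        e = write("e", b"x" * 10)           # different length
        return (same_content(a, b, 64), same_content(a, c, 64),
                same_content(a, e, 64))
```

Whatever order the parts are visited in, the result is the same; the order only changes how soon a difference is found. Running the check in a worker thread or process would keep it off the main thread, as the text requires.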
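Step 205: generate a path pointing to the storage address of the identical stored file. This step is the same as step 103 and is not repeated here.
Step 206: save the generated path as the file to be stored. This step is the same as step 104 and is not repeated here.
Step 207: store the file to be stored. This step is the same as step 105 and is not repeated here.
Step 208: generate the locator file of the file to be stored. This step is the same as step 106 and is not repeated here.
In this embodiment, if a stored file has the same message digest, the same length and the same content as the file to be stored, an identical stored file exists among the stored files and the file to be stored need not be stored again. If no stored file has the same content as the file to be stored, no identical stored file exists and the file to be stored is saved normally. In this way the identical stored file is determined accurately, so that the way of storing the file to be stored can be decided.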
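A specific example follows.
The server stores a message digest list, which holds the message digest and the file name of each stored file; each message digest corresponds to file names. The server also stores one link list for each message digest in the message digest list; a link list holds the links of at least one stored file, and a link comprises the source file that the stored file is linked to, and the path pointing to the storage address of the source file or the storage address itself.
The server receives file A to be stored and renames it xA to ensure that file names are unique; a combination of a timestamp and a random value may also be used as the new file name. The file xA is placed temporarily in a space on the server dedicated to temporary files.
Next, the message digest of xA is computed, for example by using the MD5 (Message Digest Algorithm 5) algorithm to compute the MD5 value of xA (hereinafter the message digest of xA is referred to as AMD5). If the size of xA is less than the preset threshold of 5M, AMD5 is computed directly. If the size of xA is greater than or equal to the preset threshold of 5M, xA is divided into n equal parts of 256K each. The 1st, (n/2)-th and n-th parts are taken, three parts in all; then, with the 2nd part and the ((n/2)-1)-th part as the start and end, three more parts are taken; then, with the ((n/2)+1)-th part and the (n-1)-th part as the start and end, three more parts are taken; and so on until 20 parts have been obtained. The 20 parts are merged in order and the message digest of the merged data is computed as AMD5. As a concrete example, xA is divided into 1000 equal parts of 256K each. The 1st, 500th and 1000th parts are taken; then the 2nd, 250th and 499th parts; then the 501st, 750th and 999th parts; and so on until 20 parts have been taken. These 20 parts are merged in order of part number, and the message digest computed over the merged bytes is used as the MD5 value of xA. Sampling parts spread across the file in this way gives a better result than sampling contiguous parts.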
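The sampled digest described above can be sketched with Python's `hashlib`. The text spells out only the first three rounds of sampling (parts 1, 500, 1000; then 2, 250, 499; then 501, 750, 999); how the later rounds continue is an assumption of this sketch, which keeps splitting each range around its midpoint in breadth-first order. The function names `sample_indices` and `file_digest` are hypothetical, the data is held in memory for brevity, and a last part shorter than 256K is tolerated.

```python
import hashlib
from collections import deque


def sample_indices(n: int, limit: int = 20) -> list:
    """Pick up to `limit` 1-based part numbers: ends and midpoint of each range."""
    picked, seen = [], set()
    ranges = deque([(1, n)])
    while ranges and len(picked) < limit:
        lo, hi = ranges.popleft()
        if lo > hi:
            continue
        mid = (lo + hi) // 2
        for i in (lo, mid, hi):
            if i not in seen and len(picked) < limit:
                seen.add(i)
                picked.append(i)
        ranges.append((lo + 1, mid - 1))   # left sub-range, ends already taken
        ranges.append((mid + 1, hi - 1))   # right sub-range
    return sorted(picked)                  # merge in order of part number


def file_digest(data: bytes, threshold: int = 5 * 1024 * 1024,
                chunk: int = 256 * 1024, limit: int = 20) -> str:
    if len(data) < threshold:
        return hashlib.md5(data).hexdigest()      # small file: hash everything
    n = -(-len(data) // chunk)                    # number of parts, rounded up
    h = hashlib.md5()
    for i in sample_indices(n, limit):
        h.update(data[(i - 1) * chunk:i * chunk])
    return h.hexdigest()
```

Because only sampled parts are hashed, two large files that differ outside the sampled parts receive the same digest; this is one reason the later content comparison is needed before a file is replaced by a path.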
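Next, the message digest list is searched for AMD5.
If AMD5 is not in the message digest list, no stored file identical to xA exists among the stored files. The file xA is stored on the server; a new message digest AMD5 with the corresponding file name xA is added to the message digest list (as shown in Table 1); and a link list corresponding to AMD5 is created (Table 2 below), in which the link of xA is stored in the form "xA_xA" (as shown in Table 2). Here "xA_xA" means that the source file of xA is xA itself and that the storage address of xA is xA. At the same time, the locator file xA.links of xA is generated in the storage directory of xA, and AMD5 and the path pointing to the storage address of xA are written into it in the form "AMD5_xA".
Table 1 Message digest list 1-1
Message digest    File name
AMD5              xA
CMD5              xC
……                ……
Table 2 Link list corresponding to AMD5, 1-1
xA_xA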
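If AMD5 is in the message digest list, the file name xA is added to the file names corresponding to AMD5 (as in Tables 3 and 5). The file names corresponding to AMD5 show that file B has the same message digest AMD5 (as in Table 3); the stored file B is fetched from the server, and the contents of xA and file B are compared. First, the lengths of xA and file B are matched. If the lengths differ, the contents of xA and file B are determined to be inconsistent. If the lengths are the same, xA and file B are each divided into n equal parts of 256K each. The 1st, (n/2)-th and n-th parts of xA and file B are compared in turn; then, with the 2nd part and the ((n/2)-1)-th part of each file as the start and end, three more parts are taken and compared in turn; then, with the ((n/2)+1)-th part and the (n-1)-th part of each file as the start and end, three more parts are taken and compared in turn; and so on. If all content has been compared and found consistent, the contents of xA and file B are determined to be consistent. If any comparison along the way finds a difference, the contents of xA and file B are determined to be inconsistent.
If the contents of xA and file B are determined to be consistent, the temporarily stored file xA is deleted from the server, a path pointing to the storage address of file B is generated and named xA, and the generated path xA is saved to the server as the file xA. The link of xA is added to the link list corresponding to AMD5 in the form "B_xA" (as shown in Table 4), where "B_xA" means that the source file xA is linked to is file B and that the path pointing to the storage address of file B is xA. At the same time, the locator file xA.links of xA is generated in the storage directory of xA, and AMD5 and the file name of the stored file identical to xA are written into it in the form "AMD5_B".
Table 3 Message digest list 1-2
Message digest    File name
AMD5              xA, B
CMD5              xC
……                ……
Table 4 Link list corresponding to AMD5, 1-2
B_xA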
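If the contents of xA and file B are determined to be inconsistent, xA is stored on the server. The storage address of xA is added to the link list corresponding to AMD5 in the form "xA_xA" (as shown in Table 6), where "xA_xA" means that the source file xA is linked to is xA itself and that the storage address of xA is xA. At the same time, the locator file xA.links of xA is generated in the storage directory of xA, and AMD5 and the path pointing to the storage address of xA are written into it in the form "AMD5_xA".
Table 5 Message digest list 1-3
Message digest    File name
AMD5              xA, B
CMD5              xC
……                ……
Table 6 Link list corresponding to AMD5, 1-3
xA_xA
B_B
B_X
B_Y
….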
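The bookkeeping in Tables 1 to 6 can be modelled with a few dictionaries. The sketch below keeps everything in memory and takes the digest as an argument so that a digest collision can be simulated; the class `Server`, its field names and the function `store` are hypothetical, and a real server would keep files on disk and write actual .links files.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    names: dict = field(default_factory=dict)     # digest -> [file names]
    links: dict = field(default_factory=dict)     # digest -> ["source_name", ...]
    blobs: dict = field(default_factory=dict)     # name -> bytes actually stored
    paths: dict = field(default_factory=dict)     # name -> file the path points to
    locators: dict = field(default_factory=dict)  # name -> content of name.links


def store(server: Server, name: str, data: bytes, digest: str) -> str:
    """Store `name`; return the name of the source file it ends up linked to."""
    known = list(server.names.get(digest, []))
    server.names.setdefault(digest, []).append(name)
    # Same digest is not enough: the content must match as well.
    source = next((n for n in known if server.blobs.get(n) == data), None)
    if source is None:
        server.blobs[name] = data          # no identical file: store normally
        source = name
    else:
        server.paths[name] = source        # identical file: keep only a path
    server.links.setdefault(digest, []).append(f"{source}_{name}")
    server.locators[name] = f"{digest}_{source}"
    return source
```

Storing B, then an identical xA, then a different file with the same digest yields the link entries B_B, B_xA and one self-link for the third file, matching the "source_name" form used in the tables.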
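Compared with the prior art, this embodiment computes the message digest of the file to be stored by a method chosen according to the file's size, which keeps the digest computation accurate while reducing the computational load on the server. It decides whether an identical stored file exists by comparing the length and content of the file to be stored with those of the stored file that has the same message digest, which avoids misjudging files as identical when message digests collide, improves detection accuracy, and does not affect normal use of the server.
The third embodiment of the present invention relates to a file deletion method; the specific flow is shown in FIG. 3. In this embodiment, the server stores a locator file for each stored file; the locator file holds the message digest of the stored file, together with either the path pointing to the storage address or the file name of the identical stored file. The server also stores a message digest list, which holds each message digest and the file names of the stored files that correspond to it. The server also stores one link list for each message digest in the message digest list; a link list holds the links of at least one stored file, and a link comprises the source file that the stored file is linked to, and the path pointing to the storage address of the source file or the storage address itself. This embodiment provides a method for deleting files that are stored as paths. It also provides a way of storing the message digests and links of stored files, so that the links of a stored file can be found and deleted quickly. During deletion, the method checks whether the message digest should be deleted, which reduces the server space occupied by useless data. The flow of FIG. 3 is described below.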
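Step 301: receive a file deletion instruction.
Specifically, the server receives a file deletion instruction issued by a user; the files a user may delete are the files that this user uploaded and stored.
Step 302: read the locator file of the file to be deleted.
Specifically, after the deletion instruction is received, the .links locator file of the file to be deleted is read from that file's storage directory. The content of the locator file yields the message digest of the file to be deleted, together with either the path pointing to the storage address or the file name of the identical stored file, which makes it easy to delete the file's message digest and link. If the locator file holds the file name of the identical stored file, the file to be deleted is determined to be a file stored as a path; if it holds the path pointing to the storage address, the file to be deleted is determined not to be a file stored as a path. The locator file of the file to be deleted is then deleted, to avoid occupying server storage unnecessarily.
Step 303: delete the file name of the file to be deleted from the message digest list.
Specifically, because each message digest in the message digest list corresponds to file names, the file name of the file to be deleted is removed from the message digest list by looking up the message digest obtained from the locator file.
Step 304: delete the link of the file to be deleted from the link list.
Specifically, the link list corresponding to the message digest obtained from the locator file is retrieved, and the link of the file to be deleted is removed from that list.
Step 305: determine whether the file to be deleted is a file stored as a path; if so, go to step 306; if not, go to step 307.
Specifically, the locator file read earlier holds either the path pointing to the storage address or the file name of the identical stored file. If it holds the file name of the identical stored file, the file to be deleted is a file stored as a path, and step 306 is performed; if it holds the path pointing to the storage address, the file to be deleted is not a file stored as a path, and step 307 is performed.
Step 306: delete the stored path.
Specifically, if the file to be deleted is a file stored as a path, the stored path is deleted, which deletes the file to be deleted.
Step 307: determine whether the link list still holds a link to the same source file as the file to be deleted; if so, end; if not, go to step 308.
Specifically, the link list corresponding to the message digest of the file to be deleted is checked for another link to the same source file. If one exists, several files from different sources are linked to that source file, so the source file is useful data and must be kept. If none exists, the link that was written to the link list when the source file itself was stored has also been deleted, so the source file is useless data and must be deleted.
Step 308: release the storage space occupied by the source file that the file to be deleted is linked to.
Specifically, since the link list holds no link to that source file, no other file needs to link to it. The source file is therefore deleted from the server and the storage space it occupied is released, which avoids occupying server storage unnecessarily.
Step 309: determine whether the message digest list still holds a file name with the same message digest as the file to be deleted; if so, end; if not, go to step 310.
Specifically, the message digest list holds the message digest and the file name of each stored file, and each message digest corresponds to file names. If the list still holds the file name of another file with the same message digest as the file to be deleted, several different files share this message digest, so the digest is useful data and must be kept. If no other file name with the same message digest remains, the digest is useless data and must be deleted.
Step 310: delete the message digest from the message digest list.
Specifically, since no other file name with the same message digest as the file to be deleted remains, the message digest is useless data and must be deleted. This reduces the server space occupied by useless data.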
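Steps 302 to 310 can be walked through with plain dictionaries. This is a minimal in-memory sketch of one reading of the flow, not the patent's implementation: locator contents and links are assumed to use the "digest_name" and "source_name" forms from the examples, the function name `delete_file` and all parameter names are hypothetical, and file names are assumed not to contain underscores.

```python
def delete_file(name, locators, names, links, paths, blobs):
    """Delete `name`, keeping a source file while other links still use it."""
    # Step 302: read and remove the locator file ("<digest>_<target>").
    digest, target = locators.pop(name).split("_", 1)
    # Step 303: remove the file name from the message digest list.
    names[digest].remove(name)
    # Step 304: remove this file's link ("<source>_<name>") from the link list.
    link_list = links[digest]
    link_list.remove(f"{target}_{name}")
    # Steps 305-306: a file stored as a path only has its path deleted.
    if target != name:
        del paths[name]
    # Step 307: does any remaining link still point at the same source file?
    if any(link.split("_", 1)[0] == target for link in link_list):
        return
    # Step 308: nobody links to the source any more, release its storage.
    blobs.pop(target, None)
    # Steps 309-310: drop the digest once no file name uses it.
    if not names[digest]:
        del names[digest]
        links.pop(digest, None)
```

Deleting a real file such as B while a path xC still links to it removes B's name and its own link but keeps the stored bytes; the space is released only when the last link to B is deleted.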
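A specific example follows. A deletion instruction for file xA is received, and the locator file xA.links is read from the storage directory of xA. Its content is "AMD5_B", which indicates that the message digest of xA is AMD5 and that the identical stored file of xA is file B; that is, xA is a file stored as a path. According to AMD5, the file name xA is deleted from the entries for AMD5 in the message digest list (as shown in Table 7). According to AMD5, the link list corresponding to AMD5 is obtained, and the link "B_xA" is deleted from it (as shown in Table 8). The stored path xA is then deleted from the server, which deletes the file xA. At this point the link list corresponding to AMD5 still holds the items B_B and B_xB, so the identical stored file B is kept.
Table 7 Message digest list 1-4
Figure PCTCN2018119594-appb-000001
Table 8 Link list corresponding to AMD5, 1-4
Figure PCTCN2018119594-appb-000002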
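Another example follows. A deletion instruction for file xC is received, and the locator file xC.links is read from the storage directory of xC. Its content is "AMD5_B", which indicates that the message digest of xC is AMD5 and that the identical stored file of xC is file B; that is, xC is a file stored as a path. According to AMD5, the file name xC is deleted from the entries for AMD5 in the message digest list (as shown in Table 9). According to AMD5, the link list corresponding to AMD5 is obtained, and the link "B_xC" is deleted from it (as shown in Table 10), which also shows that the source file xC is linked to is file B. The stored path xC is then deleted from the server, which deletes the file xC. At this point the link list corresponding to AMD5 holds no other link to source file B, so the storage space occupied by B is released on the server. In the message digest list, the only file name remaining for AMD5 is B, and file B has been deleted, so the message digest AMD5 is deleted from the message digest list (as shown in Table 11).
Table 9 Message digest list 1-5-1
Figure PCTCN2018119594-appb-000003
Table 10 Link list corresponding to AMD5, 1-5
Figure PCTCN2018119594-appb-000004
Table 11 Message digest list 1-5-2
Figure PCTCN2018119594-appb-000005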
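Another example follows. A deletion instruction for file D is received, and the locator file D.links is read from the storage directory of D. Its content is "DMD5_D", which indicates that the message digest of D is DMD5 and that the path pointing to the storage address of D is D; that is, D is not a file stored as a path. According to DMD5, the file name D is deleted from the entries for DMD5 in the message digest list (as shown in Table 12). According to DMD5, the link list corresponding to DMD5 is obtained, and the link "D_D" is deleted from it (as shown in Table 13), which also shows that the source file D is linked to is D itself. The link list corresponding to DMD5 now holds no other link to D, so the storage space occupied by D is released on the server, which deletes the file D. In the message digest list, the file names D1 and D2 still correspond to DMD5, so the message digest DMD5 is kept.
Table 12 Message digest list 1-6
Figure PCTCN2018119594-appb-000006
Table 13 Link list corresponding to DMD5, 1-6
Figure PCTCN2018119594-appb-000007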
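Compared with the prior art, in this embodiment the server stores a message digest list and one link list for each message digest in that list, which provides a way of storing the message digest and link of every stored file. The locator file, message digest and link of a stored file are tied together, so the links of a stored file can be found and deleted quickly. A concrete implementation of file deletion is provided, which makes this embodiment more practicable. During deletion, the method also checks whether the message digest and related data need to be deleted, which reduces the server space occupied by useless data.
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one, or a step may be split into several; as long as the same logical relationship is included, all such variants fall within the protection scope of this patent. Adding insignificant modifications to an algorithm or flow, or introducing insignificant designs into it, without changing its core design also falls within the protection scope of this patent.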
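The fourth embodiment of the present invention relates to a server which, as shown in FIG. 4, includes at least one processor 402 and a memory 401 communicatively connected to the at least one processor 402. The memory 401 stores instructions executable by the at least one processor 402; the instructions are executed by the at least one processor 402 so that the at least one processor 402 can perform the above file storage method or the above file deletion method.
The memory 401 and the processor 402 are connected by a bus. The bus may include any number of interconnected buses and bridges, and connects the various circuits of the one or more processors 402 and the memory 401 together. The bus may also connect various other circuits, such as peripheral devices, voltage regulators and power management circuits; these are well known in the art and are not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or several elements, such as multiple receivers and transmitters, and provides a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 402 is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor 402.
The processor 402 is responsible for managing the bus and for general processing, and may also provide various functions, including timing, peripheral interfaces, voltage regulation, power management and other control functions. The memory 401 may be used to store data used by the processor 402 when performing operations.
It is worth noting that the modules involved in this embodiment are all logical modules. In practice, a logical unit may be a physical unit, part of a physical unit, or a combination of several physical units. In addition, to highlight the innovative part of the present invention, this embodiment does not introduce units that are not closely related to solving the technical problem addressed by the invention; this does not mean that no other units exist in this embodiment.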
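The sixth embodiment of the present invention relates to a computer-readable storage medium that stores a computer program. When the computer program is executed by the processor 402, the above file storage method embodiment or the above file deletion method embodiment is implemented.
That is, those skilled in the art will understand that all or some of the steps of the above file storage method embodiment or file deletion method embodiment can be carried out by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions that cause a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program code.
Those of ordinary skill in the art will understand that the above embodiments are specific examples of implementing the present invention, and that in practice various changes in form and detail may be made to them without departing from the spirit and scope of the present invention.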

Claims (13)

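  1. A file storage method, characterized by comprising:
    receiving a file to be stored;
    detecting whether, among the stored files, there is a stored file identical to the file to be stored;
    when a stored file identical to the file to be stored exists, generating a path pointing to the storage address of the identical stored file, and saving the generated path as the file to be stored.
  2. The file storage method according to claim 1, characterized in that detecting whether, among the stored files, there is a stored file identical to the file to be stored specifically comprises:
    computing the message digest of the file to be stored;
    detecting whether, among the stored files, there is a stored file with the same message digest as the file to be stored;
    if no stored file has the same message digest, determining that no stored file identical to the file to be stored exists;
    if a stored file has the same message digest, comparing the content of the file to be stored with the content of the stored file with the same message digest; if the comparison finds them the same, determining that a stored file identical to the file to be stored exists; if the comparison finds them different, determining that no stored file identical to the file to be stored exists.
  3. The file storage method according to claim 2, characterized in that computing the message digest of the file to be stored specifically comprises:
    when the size of the file to be stored is less than a preset threshold, computing the message digest of the file to be stored directly;
    when the size of the file to be stored is greater than or equal to the preset threshold, dividing the file to be stored by a preset size, and computing the message digest of the file to be stored from the divided data.
  4. The file storage method according to claim 2, characterized in that comparing the content of the file to be stored with the content of the stored file with the same message digest specifically comprises:
    comparing whether the file to be stored and the stored file with the same message digest have the same length;
    if the lengths differ, determining that the content of the file to be stored differs from the content of the stored file with the same message digest;
    if the lengths are the same, dividing the file to be stored and the stored file with the same message digest using a bisection (binary search) scheme, and comparing the content of each divided part in turn, until a comparison finds a difference or all content has been compared.
  5. The file storage method according to any one of claims 1 to 4, characterized in that generating a path pointing to the storage address of the identical stored file specifically comprises:
    generating a soft link or shortcut for the file to be stored;
    linking the generated soft link or shortcut to the storage address of the identical stored file;
    and saving the generated path as the file to be stored specifically comprises: saving the soft link or shortcut of the file to be stored.
  6. The file storage method according to any one of claims 1 to 4, characterized by further comprising:
    when no stored file identical to the file to be stored exists, storing the file to be stored and generating a locator file for the file to be stored, the locator file including the message digest of the file to be stored and the path pointing to the storage address;
    and, after the generated path has been saved as the file to be stored, further comprising:
    generating a locator file for the file to be stored, the locator file including the message digest of the file to be stored and the file name of the identical stored file.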
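  7. A file deletion method, characterized by comprising:
    receiving a deletion instruction for a file;
    if the file to be deleted is a file stored as a path, deleting the stored path.
  8. The file deletion method according to claim 7, characterized in that the file deletion method is applied to a server, the server stores a locator file for each stored file, and the locator file holds the message digest of the stored file, together with the path pointing to the storage address or the file name of the identical stored file; the file deletion method further comprises:
    after the deletion instruction for the file is received, reading the locator file of the file to be deleted;
    determining whether the locator file of the file to be deleted holds the file name of an identical stored file;
    if it holds the file name of an identical stored file, determining that the file to be deleted is a file stored as a path;
    deleting the locator file of the file to be deleted.
  9. The file deletion method according to claim 8, characterized in that the server also stores a message digest list, the message digest list holds each message digest and the file names of the stored files corresponding to it; the server also stores one link list for each message digest in the message digest list, the link list holds the links of at least one stored file, and a link comprises the source file that the stored file is linked to, and the path pointing to the storage address of the source file or the storage address; the file deletion method further comprises:
    after the locator file of the file to be deleted has been deleted, deleting from the message digest list, according to the message digest of the file to be deleted, the file name of the file to be deleted that corresponds to that message digest;
    according to the message digest of the file to be deleted, obtaining the link list corresponding to the message digest, and deleting the link of the file to be deleted from the corresponding link list.
  10. The file deletion method according to claim 9, characterized in that, after the stored path has been deleted, the method further comprises:
    determining whether the link list corresponding to the message digest still holds a link to the same source file as the file to be deleted;
    if no link to the same source file as the file to be deleted is held, releasing the storage space occupied by the source file that the file to be deleted is linked to.
  11. The file deletion method according to claim 10, characterized in that, after the storage space occupied by the source file that the file to be deleted is linked to has been released, the method further comprises:
    determining whether the message digest list still holds the file name of a file with the same message digest as the file to be deleted;
    if not, deleting the message digest from the message digest list.
  12. A server, characterized by comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the file storage method according to any one of claims 1 to 6, or the file deletion method according to any one of claims 7 to 11.
  13. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the file storage method according to any one of claims 1 to 6, or the file deletion method according to any one of claims 7 to 11.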
PCT/CN2018/119594 2018-11-08 2018-12-06 File storage method, deletion method, server and storage medium WO2020093501A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/958,670 US20200349113A1 (en) 2018-11-08 2018-12-06 File storage method, deletion method, server and storage medium
EP18939218.6A EP3876106A4 (en) 2018-11-08 2018-12-06 File storage method and deletion method, server, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811323051.2A CN109582642A (zh) 2018-11-08 2018-11-08 File storage method, deletion method, server and storage medium
CN201811323051.2 2018-11-08

Publications (1)

Publication Number Publication Date
WO2020093501A1 true WO2020093501A1 (zh) 2020-05-14

Family

ID=65921816

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/119594 WO2020093501A1 (zh) 2018-11-08 2018-12-06 File storage method, deletion method, server and storage medium

Country Status (4)

Country Link
US (1) US20200349113A1 (zh)
EP (1) EP3876106A4 (zh)
CN (1) CN109582642A (zh)
WO (1) WO2020093501A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
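CN110535835A (zh) * 2019-08-09 2019-12-03 西藏宁算科技集团有限公司 Shared cloud storage method and system supporting multiple clouds based on a message digest algorithm
CN110825693A (zh) * 2019-10-25 2020-02-21 武汉联影医疗科技有限公司 Medical data storage method, apparatus and readable storage medium
CN111159434A (zh) * 2019-12-29 2020-05-15 赵娜 Method and system for storing multimedia files in an Internet storage cluster
CN111787070B (zh) * 2020-06-10 2022-07-12 俞力奇 Device-side resource management method
CN112131194A (zh) * 2020-09-24 2020-12-25 上海摩勤智能技术有限公司 File storage control method and apparatus for a read-only file system, and storage medium
CN112817923B (zh) * 2021-02-20 2024-03-26 北京奇艺世纪科技有限公司 Application data processing method and apparatus
CN113051226A (zh) * 2021-06-02 2021-06-29 芯华章科技股份有限公司 System-level compilation method, electronic device and storage medium
CN113703886B (zh) * 2021-07-21 2023-06-20 青岛海尔科技有限公司 User system behavior monitoring method and system, electronic device and storage medium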

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868765A (zh) * 2012-10-09 2013-01-09 乐视网信息技术(北京)股份有限公司 File upload method and system
CN106294627A (zh) * 2016-07-28 2017-01-04 五八同城信息技术有限公司 Data management method and data server
CN107577423A (zh) * 2017-08-15 2018-01-12 上海斐讯数据通信技术有限公司 Method and system for optimizing storage space

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6965903B1 (en) * 2002-05-07 2005-11-15 Oracle International Corporation Techniques for managing hierarchical data with link attributes in a relational database
EP1420349B1 (en) * 2002-11-14 2007-05-30 Alcatel Lucent Method and server for system synchronization
GB2406742B (en) * 2003-10-03 2006-03-22 3Com Corp Switching fabrics and control protocols for them
WO2007113533A1 (en) * 2006-03-31 2007-10-11 British Telecommunications Public Limited Company Xml-based transfer and a local storage of java objects
US7904450B2 (en) * 2008-04-25 2011-03-08 Wilson Kelce S Public electronic document dating list
US8239403B2 (en) * 2009-12-30 2012-08-07 International Business Machines Corporation Enhancing soft file system links
GB2483300A (en) * 2010-09-06 2012-03-07 Fonleap Ltd Transferring virtual machine state between host systems with common portions using a portable device
US20120278371A1 (en) * 2011-04-28 2012-11-01 Luis Montalvo Method for uploading a file in an on-line storage system and corresponding on-line storage system
CN103384256A (zh) * 2012-05-02 2013-11-06 天津书生投资有限公司 一种云存储方法及装置
US8468138B1 (en) * 2011-12-02 2013-06-18 International Business Machines Corporation Managing redundant immutable files using deduplication in storage clouds
US9235589B2 (en) * 2011-12-13 2016-01-12 International Business Machines Corporation Optimizing storage allocation in a virtual desktop environment
US20130290383A1 (en) * 2012-04-30 2013-10-31 Jain Nitin Mapping long names in a filesystem
US9747297B2 (en) * 2014-09-23 2017-08-29 Amazon Technologies, Inc. Synchronization of shared folders and files
US11036394B2 (en) * 2016-01-15 2021-06-15 Falconstor, Inc. Data deduplication cache comprising solid state drive storage and the like
US10740039B2 (en) * 2017-06-20 2020-08-11 Vmware, Inc. Supporting file system clones in any ordered key-value store

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3876106A4 *

Also Published As

Publication number Publication date
CN109582642A (zh) 2019-04-05
EP3876106A1 (en) 2021-09-08
EP3876106A4 (en) 2021-12-29
US20200349113A1 (en) 2020-11-05

Similar Documents

Publication Publication Date Title
WO2020093501A1 (zh) File storage method, deletion method, server and storage medium
CN110018998B (zh) File management method and system, electronic device and storage medium
US8112463B2 (en) File management method and storage system
US8285690B2 (en) Storage system for eliminating duplicated data
US20160179581A1 (en) Content-aware task assignment in distributed computing systems using de-duplicating cache
US20180285014A1 (en) Data storage method and apparatus
WO2021068567A1 (zh) Blockchain block distribution method and apparatus, computer device and storage medium
US10891074B2 (en) Key-value storage device supporting snapshot function and operating method thereof
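WO2017041654A1 (zh) Method and device for writing data and acquiring data in a distributed storage system
WO2022134128A1 (zh) Multi-version data storage method and apparatus, computer device and storage medium
WO2017020576A1 (zh) Method and apparatus for file compaction in a key-value storage system
WO2019001521A1 (zh) Data storage method, storage device, client and system
WO2021139431A1 (zh) Data synchronization method and apparatus for microservices, electronic device and storage medium
CN110908589A (zh) Data file processing method, apparatus, system and storage medium
WO2020119709A1 (zh) Data merging implementation method, apparatus, system and storage medium
WO2023197404A1 (zh) Object storage method and apparatus based on a distributed database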
US20170300255A1 (en) Method and Apparatus for Detecting Transaction Conflict and Computer System
CN115964002B (zh) Electric energy meter terminal file management method, apparatus, device and medium
US10073657B2 (en) Data processing apparatus, data processing method, and computer program product, and entry processing apparatus
US11256434B2 (en) Data de-duplication
CN110121874B (zh) Memory data replacement method, server node and data storage system
EP3264254B1 (en) System and method for a simulation of a block storage system on an object storage system
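WO2023071043A1 (zh) File aggregation compatibility method and apparatus, computer device and storage medium
WO2022252322A1 (zh) Feature-tag-based method for synchronizing the in-memory database and relational database of a power grid monitoring system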
US20170090803A1 (en) Method and device for checking false sharing in data block deletion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18939218

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018939218

Country of ref document: EP

Effective date: 20210604