CN102571709A - Method for uploading file, client, server and system - Google Patents

Method for uploading file, client, server and system Download PDF

Info

Publication number
CN102571709A
CN102571709A CN2010106067558A CN201010606755A CN102571709A CN 102571709 A CN102571709 A CN 102571709A CN 2010106067558 A CN2010106067558 A CN 2010106067558A CN 201010606755 A CN201010606755 A CN 201010606755A CN 102571709 A CN102571709 A CN 102571709A
Authority
CN
China
Prior art keywords
file
uploaded
client
server
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010106067558A
Other languages
Chinese (zh)
Inventor
李星
徐盎
徐伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Beijing Co Ltd filed Critical Tencent Technology Beijing Co Ltd
Priority to CN2010106067558A priority Critical patent/CN102571709A/en
Publication of CN102571709A publication Critical patent/CN102571709A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for uploading a file, a client, a server and a system, belonging to the fields of file uploading. The method comprises the following steps of: receiving a file uploading request transmitted by the client; and judging whether a file identical with a message digest algorithm 5 (MD5) value of the file to be uploaded exists or not, if yes, taking the file identical with the MD5 value as the file to be uploaded, and otherwise judging whether a file of which a file name is identical with that of the file to be uploaded exists in the previously-uploaded file by the client or not, if yes, calculating a part of the file to be uploaded, which is different from the file of which the file name is identical with that of the file to be uploaded and enabling the client to upload the different part, and otherwise, enabling the client to upload the files to be uploaded at full amount. According to the method, the client, the server and the system, a file similar to the file to be uploaded is uploaded by the server and the different part of the file to be uploaded is uploaded to the server through a network, so that the network transmission quantity is greatly reduced and the waiting time of file uploading by the user is shortened.

Description

File uploading method, client, server and system
Technical Field
The present invention relates to the field of file uploading, and in particular, to a file uploading method, a client, a server, and a system.
Background
People in today's society often use various file uploading systems to upload files at work, study, and life, for example, a user uploads an attachment to an email, uploads a photo to an album, or uploads data to a network disk, etc., all of which require the use of a file uploading system.
Currently, when processing a file uploaded by a user, most file uploading systems firstly find whether an MD5 value identical to an MD5 value of a file to be uploaded exists in an MD5(Message Digest Algorithm, fifth edition) value of the file on their own server: if the file exists, the file is indicated to exist on the server, the file does not need to be uploaded, the existing file on the server is directly used, and a user is informed that the file is successfully uploaded; if the file does not exist, the server does not have the same file as the file to be uploaded, the file of the user is uploaded in full and stored on the server. The file uploading system can optimize aiming at the identical files (namely, the files with the same MD5 value) by adopting the mechanism, and when the user uploads the identical files, the network transmission amount and the user waiting time can be reduced.
In the process of implementing the invention, the inventor finds that the prior art has at least the following disadvantages:
in real life, a large number of files uploaded by users often have great similarity, for example, nearly ten filed graduates are modified, the content of each modification is little, for example, when a user uploads videos on a video website, a highlight part of the same video may be edited by different users in multiple versions, and the similarity between video segments of the different versions is also great. When a user uploads a file similar to a previously uploaded file to a server, the files are not completely identical and need to be uploaded in full quantity every time, so that the defects of large network transmission quantity and long user waiting time are caused.
Disclosure of Invention
In order to realize efficient uploading of files and reduce network transmission amount and waiting time when a user uploads the files, the embodiment of the invention provides a method, a client, a server and a system for uploading the files. The technical scheme is as follows:
in one aspect, a method for uploading a file is provided, and the method includes:
receiving a file uploading request sent by a client;
judging whether a file with the same value as the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded exists, if so, using the file with the same value as the MD5 value as the file to be uploaded, otherwise,
judging whether a file with the same name as the file to be uploaded exists in the files uploaded by the client side before,
if so, calculating different parts of the file to be uploaded, which has the same file name as the file name, and enabling the client to upload the different parts, otherwise,
and enabling the client to upload the file to be uploaded in full.
The calculating different parts of the file with the same file name as the file to be uploaded specifically comprises:
and calculating different parts of the file to be uploaded, which has the same file name as the file name, by adopting a remote synchronous Rsync algorithm.
Correspondingly, the calculating of different parts of the file to be uploaded and the file with the same file name by adopting a remote synchronous Rsync algorithm specifically includes:
the file to be uploaded is divided into a group of non-overlapping data blocks with fixed sizes by the client, 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check are executed on each data block, and a weak check code and a strong check code of each data block are obtained;
and acquiring the weak check code and the strong check code of each data block, scanning the files with the same file names, and finding out the data blocks different from at least one of the weak check code and the strong check code of the file to be uploaded.
Further, after receiving the file uploading request sent by the client, the method further includes:
and checking whether the parameters and the safety of the file to be uploaded meet the requirements, if so, executing the step of judging whether the file with the value identical to that of the fifth version MD5 of the message digest algorithm of the file to be uploaded exists, and if not, returning error information to the client.
Further, before the determining whether there is a file with the same value as MD5 of the fifth version of message digest algorithm of the file to be uploaded, the method further includes:
and calculating the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded.
In another aspect, a client for uploading a file is provided, where the client includes:
the sending module is used for sending a file uploading request to the server;
the partial uploading module is used for uploading different parts of the file with the same file name as the file to be uploaded to the server when the server judges that the file with the same file name as the file to be uploaded exists in the file uploaded by the client before after the sending module sends the file uploading request to the server;
and the full uploading module is used for uploading the file to be uploaded to the server in full when the server judges that the file with the same file name as the file to be uploaded does not exist in the files uploaded by the client before after the sending module sends the file uploading request to the server.
Further, the client further comprises:
the dividing module is used for dividing the file to be uploaded into a group of non-overlapping data blocks with fixed sizes before the part uploading module uploads different parts of the file with the same file name as the file to be uploaded to the server;
and the checking module is used for executing 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check on each data block divided by the dividing module to obtain a weak check code and a strong check code of each data block.
In another aspect, a server for uploading a file is provided, where the server includes:
the receiving module is used for receiving a file uploading request sent by a client;
the first judgment module is used for judging whether a file with the same value as the message digest algorithm fifth version MD5 of the file to be uploaded exists or not after the receiving module receives the file uploading request sent by the client;
a using module, configured to use, when the first determining module determines that there is a file with a value identical to that of MD5 of the file to be uploaded, the file with the value identical to that of MD5 as the file to be uploaded;
a second judging module, configured to, when the first judging module judges that there is no file with the same MD5 value as the file to be uploaded, judge whether a file with the same file name as the file to be uploaded exists in files uploaded by the client before;
the first calculation module is used for calculating different parts of the file to be uploaded, which has the same file name as the file to be uploaded, when the second judgment module judges that the file to be uploaded has the same file name as the file to be uploaded in the file uploaded by the client before;
a part uploading module for uploading the different parts calculated by the first calculating module to the client;
and the full uploading module is used for enabling the client to upload the files to be uploaded in full when the second judging module judges that the files with the same file names as the files to be uploaded do not exist in the files uploaded by the client before.
The first computing module is specifically configured to compute different parts of the file to be uploaded, which has the same file name as the file name, by using a remote synchronous Rsync algorithm.
Correspondingly, the first computing module specifically includes:
the acquisition unit is used for dividing the file to be uploaded into a group of non-overlapping data blocks with fixed sizes by the client, executing 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check on each data block, and acquiring the weak check code and the strong check code of each data block after obtaining the weak check code and the strong check code of each data block;
and the scanning unit is used for scanning the files with the same file names and finding out data blocks different from at least one of the weak check codes and the strong check codes of the files to be uploaded, which are acquired by the acquisition unit.
Further, the server further includes:
and the checking module is used for checking whether the parameters and the safety of the file to be uploaded meet requirements or not after the receiving module receives the file uploading request sent by the client, if so, executing the first judging module, and otherwise, returning error information to the client.
Still further, the server further comprises:
and the second calculating module is used for calculating the MD5 value of the file to be uploaded before the first judging module judges whether the file has the same value as the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded.
In another aspect, a system for uploading a file is provided, the system comprising: a client and a server; wherein,
the client is the client;
the server is the server described above.
The technical scheme provided by the embodiment of the invention has the beneficial effects that:
by using the file similar to the file to be uploaded on the server, only different parts of the file to be uploaded are uploaded to the server through the network, so that the network transmission amount is greatly reduced, the waiting time for a user to upload the file is reduced, and meanwhile, the Rsync algorithm only needs to scan the file once, so that the time delay caused by calculating the file difference can be reduced.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a file uploading method according to an embodiment of the present invention;
fig. 2 is a flowchart of a file uploading method according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a client for uploading a file according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of another file uploading client according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a first file uploading server according to a fourth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a first computing module according to a fourth embodiment of the present invention;
fig. 7 is a schematic structural diagram of a second file uploading server according to a fourth embodiment of the present invention;
fig. 8 is a schematic structural diagram of a third file upload server according to a fourth embodiment of the present invention;
fig. 9 is a schematic structural diagram of a file uploading system according to a fifth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Example one
The embodiment of the invention provides a file uploading method, and with reference to fig. 1, the method specifically comprises the following steps:
101: receiving a file uploading request sent by a client;
102: judging whether a file with the same value as the MD5 value of the file to be uploaded exists, if so, executing step 103, otherwise, executing step 104;
103: if the file with the same MD5 value as the file to be uploaded exists, using the file with the same MD5 value as the file to be uploaded;
104: if no file with the same value as the MD5 value of the file to be uploaded exists, determining whether a file with the same name as the file to be uploaded exists in the file uploaded by the client, if so, executing step 105, otherwise, executing step 106;
105: if the file with the same file name as the file to be uploaded exists, calculating different parts of the file with the same file name as the file to be uploaded, and enabling the client to upload the different parts;
106: and if the file with the same file name as the file to be uploaded does not exist, enabling the client to upload the file to be uploaded in full.
According to the method provided by the embodiment of the invention, the file similar to the file to be uploaded on the server is utilized, and only different parts of the file to be uploaded are uploaded to the server through the network, so that the network transmission amount is greatly reduced, the waiting time for uploading the file by a user is reduced, and meanwhile, the Rsync algorithm only needs to scan the file once, so that the time delay caused by calculating the file difference can be reduced.
Example two
The embodiment of the invention provides a file uploading method, and referring to fig. 2, the method specifically comprises the following steps:
201: receiving a file uploading request sent by a client;
specifically, the server receives a file uploading request sent by a user through a client, wherein the request comprises an address of a file to be uploaded by the user at the local of the user. For example, a user requests to upload an attachment in a mail, requests to upload a photo to an album, or requests to upload a video, etc., are transmitted to the server by the corresponding client.
After the client receives a file uploading request of a user and before the file uploading request is sent to the server, whether the format, the size, the specification, the safety and the like of a file to be uploaded meet requirements or not can be checked, if the format, the size, the specification, the safety and the like of the file to be uploaded meet the requirements, the file uploading request is sent to the server, the server carries out subsequent processing, and if at least one item does not meet the requirements, error information is returned to the user.
202: checking whether the parameters and the safety of the file to be uploaded meet the requirements, if so, executing a step 204, otherwise, executing a step 203;
the parameters of the file may include format, size, and specification of the file. The server finds the file to be uploaded through the local address in the file uploading request, and checks whether the parameters of the file to be uploaded meet the requirements and whether the parameters of the file to be uploaded are safe or not so as to determine whether the file uploading request is continuously processed or not. In the step, after the client checks the file to be uploaded, the server checks the file again to further ensure the security of the file to be uploaded.
203: if the request does not meet the requirement, returning error information to the client, and ending the process;
specifically, if the format of the file to be uploaded is incorrect, or the size and the specification are not qualified, or the file to be uploaded is unsafe, corresponding error information is returned to the client, and a user is reminded that the file transmission fails or the file to be uploaded is reselected.
It should be noted that, in the method provided in this embodiment, the steps 202 and 203 may also be omitted, that is, the step 204 is directly executed after the step 201 is executed, which is not specifically limited in this embodiment of the present invention.
204: if so, judging whether a file with the same MD5 value as the file to be uploaded exists, if so, executing the step 205, otherwise, executing the step 206;
for this step, if there is no MD5 value of the file to be uploaded in the file upload request, calculating MD5 value of the file to be uploaded before performing this step; and if the MD5 value of the file to be uploaded exists in the file uploading request, directly acquiring the MD5 value.
The server judges whether a file with the same MD5 value as the file to be uploaded exists or not, and is used for judging whether a file completely identical to the file to be uploaded exists or not. Since each file has a unique MD5 value, whether the file completely identical to the file to be uploaded exists on the server can be judged by searching whether the MD5 value identical to the MD5 value of the file to be uploaded exists in the MD5 values of the files on the server.
205: if the file with the same MD5 value as the file to be uploaded exists, the file with the same MD5 value is used as the file to be uploaded, and the process is ended;
specifically, the client does not need to upload the file to be uploaded, but directly uses the file with the same MD5 value on the server as the file to be uploaded, and notifies the client that the file is uploaded successfully.
206: if there is no file with the same MD5 value as the file to be uploaded, determining whether there is a file with the same file name as the file to be uploaded in the file uploaded by the client before, if so, executing step 207, otherwise, executing step 208;
specifically, the file name of the file to be uploaded is matched with the file name of the file uploaded by the client, and if the file with the same file name as the file to be uploaded exists, it is indicated that the file to be uploaded may be a similar file with the same file name as the file on the server.
It should be noted that, in this step, it is intended to determine whether a file similar to the file to be uploaded exists in the file previously uploaded by the client, and the embodiment of the present invention adopts a manner of determining whether a file with the same file name as the file to be uploaded exists. In practical application, if there is no file with the same file name, a file with a very similar file name may be used as the file similar to the file to be uploaded, which is not specifically limited in the embodiment of the present invention. When a file with a very similar file name is used as a file similar to a file to be uploaded, whether a file with a similarity to the file name of the file to be uploaded within a preset range exists is searched for, and if the file with the file name similarity meeting the condition exists, the file with the highest similarity in the files with the file name similarity meeting the condition is used as the file similar to the file to be uploaded, wherein the preset range can be set to be greater than or equal to 80%, or can be set to other reasonable values.
207: if the file with the same file name as the file to be uploaded exists, calculating different parts of the file with the same file name as the file to be uploaded, enabling the client to upload the different parts, and ending the process;
different parts of the file with the same file name as the file to be uploaded are calculated, and specifically, different parts of the file with the same file name as the file to be uploaded are calculated by adopting a remote synchronous Rsync algorithm.
Specifically, the process of calculating different parts of the file to be uploaded and the file with the same file name by adopting a remote synchronous Rsync algorithm comprises the following steps:
the client divides the file to be uploaded into a group of non-overlapping data blocks with fixed sizes, and executes 32-bit rolling weak check and 128-bit MD4(Message Digest Algorithm4, fourth edition) strong check on each data block to obtain a weak check code and a strong check code of each data block;
and acquiring the weak check code and the strong check code of each data block, scanning the file with the same file name, and finding out the data block different from at least one of the weak check code and the strong check code of the file to be uploaded.
The Rsync algorithm can calculate different data of a file to be uploaded and a similar file (i.e., a file having the same file name) in a short time. For convenience of explanation, assuming that the file to be uploaded is file a and the similar file on the server is file B, the application process of the Rsync algorithm may be as follows:
firstly, the client finds a file A through a local address, and divides the file A into a group of non-overlapping data blocks with fixed size of S bytes, and the last block may be smaller than S; two checks are then performed on each partitioned data block: and carrying out strong check on the MD4 of the 32-bit rolling weak check sum 128 bits to obtain a weak check code and a strong check code of each data block. The size of S is not specifically limited in the embodiment of the present invention, and may be set according to actual needs.
Then, the server obtains these check codes, and finds out data blocks different from at least one of the weak check code and the strong check code of the file a by scanning all data blocks of the file B with size S (the offset may be optional, and is not necessarily a multiple of S), and these different data blocks are different parts of the file a and the file B. This work can be quickly completed by the characteristics of rolling weak check, which is a fast Hash method, assuming that a continuous file is given, for example, from k to 1, the method for calculating the Hash value of the file is as follows:
<math> <mrow> <mi>a</mi> <mrow> <mo>(</mo> <mi>k</mi> <mo>,</mo> <mi>l</mi> <mo>)</mo> </mrow> <mo>=</mo> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mi>k</mi> </mrow> <mi>l</mi> </munderover> <msub> <mi>X</mi> <mi>i</mi> </msub> <mo>)</mo> </mrow> <mi>mod</mi> <mi>M</mi> </mrow> </math>
<math> <mrow> <mi>b</mi> <mrow> <mo>(</mo> <mi>k</mi> <mo>,</mo> <mi>l</mi> <mo>)</mo> </mrow> <mo>=</mo> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mi>k</mi> </mrow> <mi>l</mi> </munderover> <mrow> <mo>(</mo> <mi>l</mi> <mo>-</mo> <mi>i</mi> <mo>+</mo> <mn>1</mn> <mo>)</mo> </mrow> <msub> <mi>X</mi> <mi>i</mi> </msub> <mo>)</mo> </mrow> <mi>mod</mi> <mi>M</mi> </mrow> </math>
then the Hash value is s (k, l) ═ a (k, l) +216 xb (k, l)
Calculating the next continuous file k +1 to 1+1 can easily obtain a recurrence formula according to the calculation result of the last time:
a(k+1,l+1)=(a(k,l)-Xk+Xl+1)modM
b(k+1,l+1)=(b(k,l)-(l-k+1)Xk+a(k+1,l+1))modM
therefore, the rolling weak check value of each file can be obtained by recursion only by scanning the file B once.
And finally, after finding out the different data blocks of the file A and the file B, the server enables the client to upload the different data blocks according to the offset of the data blocks.
208: if the file with the same file name as the file to be uploaded does not exist, the client uploads the file to be uploaded in full, and the process is finished.
Specifically, the server enables the client to upload all the files to be uploaded to the server, and notifies the client that the file uploading is successful.
According to the method provided by the embodiment of the invention, the file similar to the file to be uploaded on the server is utilized, and only different parts of the file to be uploaded are uploaded to the server through the network, so that the network transmission amount is greatly reduced, the waiting time for uploading the file by a user is reduced, and meanwhile, the Rsync algorithm only needs to scan the file once, so that the time delay caused by calculating the file difference can be reduced.
EXAMPLE III
An embodiment of the present invention provides a file uploading client, configured to execute the method steps executed by the client in the first embodiment and the second embodiment, and referring to fig. 3, the client includes:
a sending module 301, configured to send a file upload request to a server;
a partial upload module 302, configured to, after the sending module 301 sends a file upload request to a server, upload, to the server, a different part of a file with a file name that is the same as that of a file to be uploaded when the server determines that the file with the file name that is the same as that of the file to be uploaded exists in a file previously uploaded by the client;
a full upload module 303, configured to upload the file to be uploaded to the server in full when the server determines that, after the sending module 301 sends the file upload request to the server, the file to be uploaded does not have a file with the same file name as the file to be uploaded in the file previously uploaded by the client.
Further, referring to fig. 4, the client further includes:
a dividing module 304, configured to divide the file to be uploaded into a set of non-overlapping data blocks with fixed sizes before the partial uploading module 302 uploads the different portions of the file with the same file name as the file to be uploaded to the server;
and the checking module 305 is configured to perform a 32-bit rolling weak check and a 128-bit MD4 strong check on each data block partitioned by the partitioning module 304 to obtain a weak check code and a strong check code of each data block.
According to the client provided by the embodiment of the invention, the file similar to the file to be uploaded on the server is utilized, and only different parts of the file to be uploaded are uploaded to the server through the network, so that the network transmission amount is greatly reduced, and the waiting time for the user to upload the file is reduced.
Example four
An embodiment of the present invention provides a file uploading server, configured to execute the method steps executed by the server in the first embodiment and the second embodiment, and referring to fig. 5, the server includes:
a receiving module 501, configured to receive a file upload request sent by a client;
a first determining module 502, configured to determine whether a file with a value equal to an MD5 value of a file to be uploaded exists after the receiving module 501 receives a file upload request sent by a client;
a using module 503, configured to use the file with the same MD5 value as the file to be uploaded when the first determining module 502 determines that the file has the same MD5 value as the file to be uploaded;
a second judging module 504, configured to, when the first judging module 502 judges that there is no file with the same MD5 value as the file to be uploaded, judge whether there is a file with the same file name as the file to be uploaded in the files uploaded by the client before;
a first calculating module 505, configured to calculate different portions of a file with a same file name as the file to be uploaded when the second determining module 504 determines that the file with the same file name as the file to be uploaded exists in the file previously uploaded by the client;
a part uploading module 506, configured to upload the different parts calculated by the first calculating module 505 to the client;
the total uploading module 507 is configured to, when the second determining module 504 determines that there is no file with the same file name as the file to be uploaded in the file previously uploaded by the client, enable the client to upload the file to be uploaded in total.
The first calculating module 505 is specifically configured to calculate, by using a remote synchronous Rsync algorithm, different portions of the file to be uploaded, which have the same file name as the file name.
Correspondingly, referring to fig. 6, the first calculating module 505 specifically includes:
an obtaining unit 505a, configured to divide the file to be uploaded into a group of non-overlapping data blocks with fixed sizes by the client, perform 32-bit rolling weak check and 128-bit MD4 strong check on each data block, obtain a weak check code and a strong check code of each data block, and obtain the weak check code and the strong check code of each data block;
the scanning unit 505b is configured to scan the file with the same file name, and find a data block that is different from at least one of the weak check code and the strong check code of the file to be uploaded, which are acquired by the acquiring unit 505 a.
Further, referring to fig. 7, the server further includes:
the checking module 508 is configured to check whether parameters and security of a file to be uploaded meet requirements after the receiving module 501 receives a file upload request sent by a client, if so, execute the first determining module 502, and otherwise, return an error message to the client.
Still further, referring to fig. 8, the server further includes:
a second calculating module 509, configured to calculate an MD5 value of the file to be uploaded before the first determining module 502 determines whether there is a file with the same MD5 value as the file to be uploaded.
According to the server provided by the embodiment of the invention, the file similar to the file to be uploaded on the server is utilized, and only different parts of the file to be uploaded are uploaded to the server by the client through the network, so that the network transmission amount is greatly reduced, the waiting time for uploading the file by a user is reduced, and meanwhile, the Rsync algorithm only needs to scan the file once, so that the time delay caused by calculating the file difference can be reduced.
EXAMPLE five
Referring to fig. 9, an embodiment of the present invention provides a system for uploading a file, where the system includes: a client 901 and a server 902; wherein,
the client 901 is the client provided in the third embodiment;
the server 902 is the server provided in the fourth embodiment described above.
In summary, in the embodiments of the present invention, the server is used to upload only different portions of the to-be-uploaded files to the server through the network by using the files similar to the to-be-uploaded files, so that the network transmission amount is greatly reduced, the waiting time for uploading the files by the user is reduced, and meanwhile, the Rsync algorithm only needs to scan the files once, so as to reduce the delay caused by calculating the file difference.
It should be noted that: in the above embodiment, when the client and the server upload the file, only the division of the functional modules is used for illustration, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structures of the client and the server are divided into different functional modules, so as to complete all or part of the functions described above. In addition, the client and the server for uploading the file and the method embodiment for uploading the file provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
All or part of the steps in the embodiments of the present invention may be implemented by software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (13)

1. A method for uploading a file, the method comprising:
receiving a file uploading request sent by a client;
judging whether a file with the same value as the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded exists, if so, using the file with the same value as the MD5 value as the file to be uploaded, otherwise,
judging whether a file with the same name as the file to be uploaded exists in the files uploaded by the client side before,
if so, calculating different parts of the file to be uploaded, which has the same file name as the file name, and enabling the client to upload the different parts, otherwise,
and enabling the client to upload the file to be uploaded in full.
2. The method according to claim 1, wherein the calculating different portions of the file to be uploaded that have the same file name as the file name specifically comprises:
and calculating different parts of the file to be uploaded, which has the same file name as the file name, by adopting a remote synchronous Rsync algorithm.
3. The method according to claim 2, wherein the calculating different parts of the file to be uploaded and the file with the same file name by using a remote synchronous Rsync algorithm specifically comprises:
the file to be uploaded is divided into a group of non-overlapping data blocks with fixed sizes by the client, 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check are executed on each data block, and a weak check code and a strong check code of each data block are obtained;
and acquiring the weak check code and the strong check code of each data block, scanning the files with the same file names, and finding out the data blocks different from at least one of the weak check code and the strong check code of the file to be uploaded.
4. The method according to any one of claims 1 to 3, wherein after receiving the file upload request sent by the client, the method further comprises:
and checking whether the parameters and the safety of the file to be uploaded meet the requirements, if so, executing the step of judging whether the file with the value identical to that of the fifth version MD5 of the message digest algorithm of the file to be uploaded exists, and if not, returning error information to the client.
5. The method according to any one of claims 1 to 3, wherein before the determining whether the file with the same value as the message digest algorithm version five MD5 of the file to be uploaded exists, the method further comprises:
and calculating the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded.
6. A client for uploading a file, the client comprising:
the sending module is used for sending a file uploading request to the server;
the partial uploading module is used for uploading different parts of the file with the same file name as the file to be uploaded to the server when the server judges that the file with the same file name as the file to be uploaded exists in the file uploaded by the client before after the sending module sends the file uploading request to the server;
and the full uploading module is used for uploading the file to be uploaded to the server in full when the server judges that the file with the same file name as the file to be uploaded does not exist in the files uploaded by the client before after the sending module sends the file uploading request to the server.
7. The client of claim 6, further comprising:
the dividing module is used for dividing the file to be uploaded into a group of non-overlapping data blocks with fixed sizes before the part uploading module uploads different parts of the file with the same file name as the file to be uploaded to the server;
and the checking module is used for executing 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check on each data block divided by the dividing module to obtain a weak check code and a strong check code of each data block.
8. A server for uploading files, the server comprising:
the receiving module is used for receiving a file uploading request sent by a client;
the first judgment module is used for judging whether a file with the same value as the message digest algorithm fifth version MD5 of the file to be uploaded exists or not after the receiving module receives the file uploading request sent by the client;
a using module, configured to use, when the first determining module determines that there is a file with a value identical to that of MD5 of the file to be uploaded, the file with the value identical to that of MD5 as the file to be uploaded;
a second judging module, configured to, when the first judging module judges that there is no file with the same MD5 value as the file to be uploaded, judge whether a file with the same file name as the file to be uploaded exists in files uploaded by the client before;
the first calculation module is used for calculating different parts of the file to be uploaded, which has the same file name as the file to be uploaded, when the second judgment module judges that the file to be uploaded has the same file name as the file to be uploaded in the file uploaded by the client before;
a part uploading module for uploading the different parts calculated by the first calculating module to the client;
and the full uploading module is used for enabling the client to upload the files to be uploaded in full when the second judging module judges that the files with the same file names as the files to be uploaded do not exist in the files uploaded by the client before.
9. The server according to claim 8, wherein the first computing module is specifically configured to compute different portions of the file to be uploaded that have the same file name as the file name by using a remote synchronous Rsync algorithm.
10. The server according to claim 9, wherein the first computing module specifically includes:
the acquisition unit is used for dividing the file to be uploaded into a group of non-overlapping data blocks with fixed sizes by the client, executing 32-bit rolling weak check and 128-bit message digest algorithm fourth version MD4 strong check on each data block, and acquiring the weak check code and the strong check code of each data block after obtaining the weak check code and the strong check code of each data block;
and the scanning unit is used for scanning the files with the same file names and finding out data blocks different from at least one of the weak check codes and the strong check codes of the files to be uploaded, which are acquired by the acquisition unit.
11. A server according to any of claims 8-10, wherein the server further comprises:
and the checking module is used for checking whether the parameters and the safety of the file to be uploaded meet requirements or not after the receiving module receives the file uploading request sent by the client, if so, executing the first judging module, and otherwise, returning error information to the client.
12. A server according to any of claims 8-10, wherein the server further comprises:
and the second calculating module is used for calculating the MD5 value of the file to be uploaded before the first judging module judges whether the file has the same value as the MD5 value of the fifth version of the message digest algorithm of the file to be uploaded.
13. A system for uploading a file, the system comprising: a client and a server; wherein,
the client is the client according to any one of the claims 6-7;
the server according to any of the preceding claims 8-12.
CN2010106067558A 2010-12-16 2010-12-16 Method for uploading file, client, server and system Pending CN102571709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010106067558A CN102571709A (en) 2010-12-16 2010-12-16 Method for uploading file, client, server and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010106067558A CN102571709A (en) 2010-12-16 2010-12-16 Method for uploading file, client, server and system

Publications (1)

Publication Number Publication Date
CN102571709A true CN102571709A (en) 2012-07-11

Family

ID=46416197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010106067558A Pending CN102571709A (en) 2010-12-16 2010-12-16 Method for uploading file, client, server and system

Country Status (1)

Country Link
CN (1) CN102571709A (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103002029A (en) * 2012-11-26 2013-03-27 北京百度网讯科技有限公司 Management method, system and client for uploaded files
CN103167159A (en) * 2012-09-25 2013-06-19 深圳市金立通信设备有限公司 Method of rapidly looking up for identical file contents in mobile phone
CN103401914A (en) * 2013-07-26 2013-11-20 浪潮电子信息产业股份有限公司 File uploading broken-point continuously transferring method
CN103795783A (en) * 2014-01-14 2014-05-14 上海上讯信息技术股份有限公司 Data synchronization method and system
CN103873522A (en) * 2012-12-14 2014-06-18 联想(北京)有限公司 Electronic equipment, and file partitioning method applied to same
CN104424225A (en) * 2013-08-26 2015-03-18 联想(北京)有限公司 File processing method and device based on file transfer process
CN104469069A (en) * 2014-12-15 2015-03-25 北京百度网讯科技有限公司 Picture synchronization method and device
CN104467941A (en) * 2014-11-04 2015-03-25 北京世纪东方国铁科技股份有限公司 Station relay and data transmitting method thereof
CN104462422A (en) * 2014-12-15 2015-03-25 北京百度网讯科技有限公司 Object processing method and device
CN104639606A (en) * 2014-12-29 2015-05-20 曙光信息产业(北京)有限公司 Optimization method for differentiated contrast of blocks
CN104811394A (en) * 2015-04-21 2015-07-29 深圳市出众网络有限公司 Method and system for saving traffic for accessing server
CN105007333A (en) * 2015-08-12 2015-10-28 阔地教育科技有限公司 Managing method and system for file transmitting
CN105100274A (en) * 2015-08-31 2015-11-25 北京奇虎科技有限公司 File uploading/downloading method and system in web environment, client and server
CN105208108A (en) * 2015-08-31 2015-12-30 北京奇虎科技有限公司 File uploading/downloading method and system in Web environment, server and client end
CN105635324A (en) * 2016-03-17 2016-06-01 新浪网技术(中国)有限公司 Big file uploading and continuous uploading method and device for browser or server
CN106202456A (en) * 2016-07-13 2016-12-07 广东欧珀移动通信有限公司 Send the method and device of picture
CN106341480A (en) * 2016-09-20 2017-01-18 北京奇虎科技有限公司 Method and device for uploading data packet
CN106446138A (en) * 2016-09-20 2017-02-22 北京奇虎科技有限公司 Storage method and device of data packet
CN106487795A (en) * 2016-10-31 2017-03-08 努比亚技术有限公司 A kind of device and method of adnexa upload, server
CN106817391A (en) * 2015-12-01 2017-06-09 百度在线网络技术(北京)有限公司 Document breakpoint transmission method and apparatus
CN107707599A (en) * 2017-05-26 2018-02-16 语祯物联科技(上海)有限公司 A kind of method and device of Internet of Things communication equipment transmission file
CN107770273A (en) * 2017-10-23 2018-03-06 上海斐讯数据通信技术有限公司 A kind of big file cloud synchronous method and system
CN108449607A (en) * 2018-01-18 2018-08-24 上海宝信软件股份有限公司 File compliance inspection method and system
CN109257405A (en) * 2017-07-14 2019-01-22 中兴通讯股份有限公司 Processing method, device and the server that file uploads
CN109542988A (en) * 2018-10-19 2019-03-29 深圳点猫科技有限公司 A kind of update method and electronic equipment of big data
CN110300151A (en) * 2019-05-22 2019-10-01 深圳壹账通智能科技有限公司 Method for uploading data file and system
CN111083145A (en) * 2019-12-18 2020-04-28 北京华宇信息技术有限公司 Message sending method and device and electronic equipment
CN112738249A (en) * 2020-12-30 2021-04-30 平安证券股份有限公司 File uploading method, device, equipment and storage medium based on quantitative transaction
CN113014476A (en) * 2021-03-17 2021-06-22 维沃移动通信有限公司 Group creation method and device
CN113542422A (en) * 2021-07-19 2021-10-22 星辰天合(北京)数据科技有限公司 Data storage method, data storage device, storage medium and electronic device
CN115361377A (en) * 2022-08-19 2022-11-18 中国联合网络通信集团有限公司 File uploading method, user terminal, network disk server, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101174954A (en) * 2006-10-31 2008-05-07 上海高勤通信科技有限公司 Document breaking point uploading method based on internet hypertext transfer protocol
CN101552669A (en) * 2008-04-02 2009-10-07 林兆祥 Method and system of data transmission
CN101699822A (en) * 2009-08-06 2010-04-28 腾讯科技(深圳)有限公司 File uploading method and device, and mass storage system
CN101788976A (en) * 2010-02-10 2010-07-28 北京播思软件技术有限公司 File splitting method based on contents

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101174954A (en) * 2006-10-31 2008-05-07 上海高勤通信科技有限公司 Document breaking point uploading method based on internet hypertext transfer protocol
CN101552669A (en) * 2008-04-02 2009-10-07 林兆祥 Method and system of data transmission
CN101699822A (en) * 2009-08-06 2010-04-28 腾讯科技(深圳)有限公司 File uploading method and device, and mass storage system
CN101788976A (en) * 2010-02-10 2010-07-28 北京播思软件技术有限公司 File splitting method based on contents

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103167159A (en) * 2012-09-25 2013-06-19 深圳市金立通信设备有限公司 Method of rapidly looking up for identical file contents in mobile phone
CN103167159B (en) * 2012-09-25 2015-02-11 深圳市金立通信设备有限公司 Method of rapidly looking up for identical file contents in mobile phone
CN103002029B (en) * 2012-11-26 2016-12-21 北京百度网讯科技有限公司 The management method of upper transmitting file, system and client
CN103002029A (en) * 2012-11-26 2013-03-27 北京百度网讯科技有限公司 Management method, system and client for uploaded files
CN103873522A (en) * 2012-12-14 2014-06-18 联想(北京)有限公司 Electronic equipment, and file partitioning method applied to same
CN103401914A (en) * 2013-07-26 2013-11-20 浪潮电子信息产业股份有限公司 File uploading broken-point continuously transferring method
CN104424225A (en) * 2013-08-26 2015-03-18 联想(北京)有限公司 File processing method and device based on file transfer process
CN103795783A (en) * 2014-01-14 2014-05-14 上海上讯信息技术股份有限公司 Data synchronization method and system
CN104467941A (en) * 2014-11-04 2015-03-25 北京世纪东方国铁科技股份有限公司 Station relay and data transmitting method thereof
CN104467941B (en) * 2014-11-04 2018-09-25 北京世纪东方通讯设备有限公司 Station repeater and its data transmission method
CN104462422A (en) * 2014-12-15 2015-03-25 北京百度网讯科技有限公司 Object processing method and device
CN104469069B (en) * 2014-12-15 2019-07-30 北京百度网讯科技有限公司 Photo synchronous method and device
CN104469069A (en) * 2014-12-15 2015-03-25 北京百度网讯科技有限公司 Picture synchronization method and device
CN104639606A (en) * 2014-12-29 2015-05-20 曙光信息产业(北京)有限公司 Optimization method for differentiated contrast of blocks
CN104639606B (en) * 2014-12-29 2018-03-16 曙光信息产业(北京)有限公司 A kind of optimization method of differentiation contrast piecemeal
CN104811394A (en) * 2015-04-21 2015-07-29 深圳市出众网络有限公司 Method and system for saving traffic for accessing server
CN105007333A (en) * 2015-08-12 2015-10-28 阔地教育科技有限公司 Managing method and system for file transmitting
CN105208108A (en) * 2015-08-31 2015-12-30 北京奇虎科技有限公司 File uploading/downloading method and system in Web environment, server and client end
CN105100274A (en) * 2015-08-31 2015-11-25 北京奇虎科技有限公司 File uploading/downloading method and system in web environment, client and server
CN106817391A (en) * 2015-12-01 2017-06-09 百度在线网络技术(北京)有限公司 Document breakpoint transmission method and apparatus
CN105635324A (en) * 2016-03-17 2016-06-01 新浪网技术(中国)有限公司 Big file uploading and continuous uploading method and device for browser or server
CN106202456B (en) * 2016-07-13 2019-08-09 Oppo广东移动通信有限公司 Send the method and device of picture
CN106202456A (en) * 2016-07-13 2016-12-07 广东欧珀移动通信有限公司 Send the method and device of picture
CN106341480A (en) * 2016-09-20 2017-01-18 北京奇虎科技有限公司 Method and device for uploading data packet
CN106446138B (en) * 2016-09-20 2020-11-20 北京奇虎科技有限公司 Data packet storage method and device
CN106341480B (en) * 2016-09-20 2019-12-20 北京奇虎科技有限公司 Data packet uploading method and device
CN106446138A (en) * 2016-09-20 2017-02-22 北京奇虎科技有限公司 Storage method and device of data packet
CN106487795A (en) * 2016-10-31 2017-03-08 努比亚技术有限公司 A kind of device and method of adnexa upload, server
CN107707599A (en) * 2017-05-26 2018-02-16 语祯物联科技(上海)有限公司 A kind of method and device of Internet of Things communication equipment transmission file
CN109257405A (en) * 2017-07-14 2019-01-22 中兴通讯股份有限公司 Processing method, device and the server that file uploads
CN107770273A (en) * 2017-10-23 2018-03-06 上海斐讯数据通信技术有限公司 A kind of big file cloud synchronous method and system
CN108449607A (en) * 2018-01-18 2018-08-24 上海宝信软件股份有限公司 File compliance inspection method and system
CN108449607B (en) * 2018-01-18 2020-06-12 上海宝信软件股份有限公司 File compliance checking method and system
CN109542988A (en) * 2018-10-19 2019-03-29 深圳点猫科技有限公司 A kind of update method and electronic equipment of big data
CN110300151B (en) * 2019-05-22 2022-02-11 深圳壹账通智能科技有限公司 Data file uploading method and system
CN110300151A (en) * 2019-05-22 2019-10-01 深圳壹账通智能科技有限公司 Method for uploading data file and system
CN111083145A (en) * 2019-12-18 2020-04-28 北京华宇信息技术有限公司 Message sending method and device and electronic equipment
CN112738249A (en) * 2020-12-30 2021-04-30 平安证券股份有限公司 File uploading method, device, equipment and storage medium based on quantitative transaction
CN112738249B (en) * 2020-12-30 2023-11-21 平安证券股份有限公司 File uploading method, device, equipment and storage medium based on quantitative transaction
CN113014476A (en) * 2021-03-17 2021-06-22 维沃移动通信有限公司 Group creation method and device
CN113542422A (en) * 2021-07-19 2021-10-22 星辰天合(北京)数据科技有限公司 Data storage method, data storage device, storage medium and electronic device
CN113542422B (en) * 2021-07-19 2023-10-17 北京星辰天合科技股份有限公司 Data storage method and device, storage medium and electronic device
CN115361377A (en) * 2022-08-19 2022-11-18 中国联合网络通信集团有限公司 File uploading method, user terminal, network disk server, equipment and medium

Similar Documents

Publication Publication Date Title
CN102571709A (en) Method for uploading file, client, server and system
CN106933854B (en) Short link processing method and device and server
US8452106B2 (en) Partition min-hash for partial-duplicate image determination
US20230342403A1 (en) Method and system for document similarity analysis
EP2626819A1 (en) Method and system for documentation of digital archives
CN110413595B (en) Data migration method applied to distributed database and related device
US11190576B2 (en) File distribution and download method, distribution server, client terminal and system
CN103119551B (en) The recovery optimized
CN106649360B (en) Data repeatability checking method and device
US20210191911A1 (en) Systems and methods for sketch computation
CN111970357A (en) Video uploading method, device and system
US9069681B1 (en) Real-time log joining on a continuous stream of events that are approximately ordered
CN114564446A (en) File storage method, device, system and storage medium
CN104809256A (en) Data deduplication method and data deduplication method
CN113672616B (en) Data indexing method, device, terminal and storage medium
CN112035405A (en) Document transcoding method and device, scheduling server and storage medium
CN113300875B (en) Method, server, system and storage medium for checking back source data
Du et al. Deduplicated disk image evidence acquisition and forensically-sound reconstruction
CN106658034A (en) File storage and reading method and device
US20110289194A1 (en) Cloud data storage system
CN111427917A (en) Search data processing method and related product
CN116069725A (en) File migration method, device, apparatus, medium and program product
CN109977295A (en) A kind of black and white lists matching process and device
CN112799872B (en) Erasure code encoding method and device based on key value pair storage system
CN115801765A (en) File transmission method, device, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120711