Summary of the invention
A first aspect of the present invention provides a data processing method, comprising:
receiving a data backup request message sent by a client, the data backup request message comprising a user ID and fingerprint information of backup data;
querying, according to the fingerprint information of the backup data, the folder corresponding to the user ID, and judging whether data identical to the backup data exists there; if no data identical to the backup data exists there, querying, according to the fingerprint information of the backup data, the folders corresponding to other user IDs, and judging whether data identical to the backup data exists there;
if data identical to the backup data exists in the folders corresponding to the other user IDs, the data identical to the backup data in those folders being second historical data, judging whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, wherein the random number is greater than or equal to a predetermined threshold;
if the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, sending a backup message to the client, receiving the backup data sent by the client, and generating reference data of the second historical data.
Another aspect of the present invention provides a data processing device, comprising:
a transceiver module, configured to receive a data backup request message sent by a client, the data backup request message comprising a user ID and fingerprint information of backup data;
a judging module, configured to query, according to the fingerprint information of the backup data, the folder corresponding to the user ID, and judge whether data identical to the backup data exists there; if no data identical to the backup data exists there, to query, according to the fingerprint information of the backup data, the folders corresponding to other user IDs, and judge whether data identical to the backup data exists there; and, if data identical to the backup data exists in the folders corresponding to the other user IDs, the data identical to the backup data in those folders being second historical data, to judge whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, wherein the random number is greater than or equal to a predetermined threshold;
the transceiver module being further configured to, if the judging module judges that the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, send a backup message to the client and receive the backup data; and
a reference data generation module, configured to generate reference data of the second historical data.
Another aspect of the present invention provides a data processing system, comprising a client and the data processing device described above.
In the embodiments of the present invention, the folder corresponding to the user ID and the folders of other users are queried, according to the fingerprint information of the backup data, for identical stored data. If identical data is found in the folder of another user, that data being the second historical data, it is judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. Consequently, when another user backs up to the cloud backup data whose content is a guess, then even if the cloud already holds data with identical content, the client must still transmit the backup data to the cloud as long as the quantity of reference data corresponding to the second historical data is less than its random number. Other users therefore cannot detect whether identical data has already been backed up in the database, which effectively prevents leakage of user data.
Embodiment
Fig. 1 is a flow chart of one embodiment of the data processing method of the present invention. As shown in Fig. 1, the method of this embodiment is executed by a data processing device located in cloud storage, and comprises:
Step 101: receive a data backup request message sent by a client, the data backup request message comprising a user ID and the fingerprint information corresponding to the backup data.
Cloud storage, also referred to simply as the cloud, is a concept extended and developed from cloud computing. It refers to a system in which application software, by means of cluster applications, grid technology, distributed file systems and similar functions, brings a large number of storage devices of various types in a network together to work cooperatively, jointly providing data storage and service access functions to the outside. The fingerprint information may be the hash (HASH) value of the backup data; any other value that uniquely characterizes the data may also be used as its fingerprint information.
In this embodiment, the client computes the fingerprint information in units of backup data, then carries the fingerprint information of the backup data and the user ID in a data backup request message and sends it to the data processing device.
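As a minimal sketch of the client side of Step 101, the following Python computes a fingerprint and builds the request message. The choice of SHA-256, the JSON encoding, and the names `fingerprint`, `build_backup_request`, `user_id` are assumptions made for illustration only; the embodiment requires merely a value that uniquely characterizes the backup data.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Fingerprint of one unit of backup data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def build_backup_request(user_id: str, data: bytes) -> str:
    """Build the data backup request message: user ID plus fingerprint.

    The backup data itself is NOT included; it is sent later only if the
    data processing device asks for it.
    """
    return json.dumps({"user_id": user_id, "fingerprint": fingerprint(data)})


request = build_backup_request("alice", b"hello")
```

Because only the fingerprint travels in the request, the device can decide whether the data needs to be transferred before any payload is sent.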
Step 102: according to the fingerprint information of the backup data, query the folder corresponding to the user ID and judge whether data identical to the backup data exists there; if no data identical to the backup data exists there, query, according to the fingerprint information of the backup data, the folders corresponding to other user IDs and judge whether data identical to the backup data exists there.
Step 103: if data identical to the backup data exists in the folders corresponding to the other user IDs, the data identical to the backup data in those folders being the second historical data, judge whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, wherein the random number is greater than or equal to a predetermined threshold.
Step 104: if the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, send a backup message to the client, receive the backup data sent by the client, and generate reference data of the second historical data.
In this embodiment, the reference data is very small. Its content is a pointer to the historical data with identical content: when the client reads the data, it follows the pointer to locate the historical data with identical content and reads that historical data out.
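The pointer behaviour of reference data can be sketched as follows. This is an illustrative model only: the `Reference` type, the `read` helper and the fingerprint-keyed dictionary are hypothetical names and structures not given in the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Reference:
    """Small record pointing at historical data with identical content."""
    target_fingerprint: str


def read(entry: Union[bytes, Reference], store: Dict[str, bytes]) -> bytes:
    """Return the content for a folder entry, following a reference if needed."""
    if isinstance(entry, Reference):
        # Resolve the pointer to the historical data it refers to.
        return store[entry.target_fingerprint]
    return entry


store = {"fp-1": b"historical payload"}   # assumed fingerprint -> data map
content = read(Reference("fp-1"), store)
```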
In this embodiment, the random number corresponding to each item of historical data stored in the data processing device is randomly generated, so the random numbers of any two items of historical data may or may not be equal. With the random number greater than or equal to the predetermined threshold, once data identical to the backup data is found in the folder of another user ID, that data is designated the second historical data, and it is judged whether the quantity of reference data of the second historical data is less than the random number of the second historical data. The technical problem mainly addressed by the present invention is how to prevent the content of confidential historical data from being stolen. In general, only one copy of confidential historical data is stored, and its quantity of reference data is generally also 1. When the HASH value of the backup data equals the HASH value of the confidential data, the content of the backup data is identical to that of the confidential historical data; but as long as the quantity of reference data is less than the random number, the user is still prompted to save the backup data, so the user cannot learn that the cloud stores confidential historical data identical to the backup data. If the user then backs up the same data to the cloud again, the user may infer from the data traffic during storage that identical data is already held in the database, that is, that the cloud has applied source-side deduplication. However, because the user has already stored one copy of the backup data and does not know the random number, the user still cannot determine whether the cloud stores confidential historical data identical to the backup data.
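The argument above can be checked with a small worked example. The function name `client_must_upload`, the threshold of 2 and the upper bound of 10 are assumptions chosen for illustration.

```python
def client_must_upload(ref_count: int, random_number: int) -> bool:
    """True while the device should still ask the client to send the data."""
    return ref_count < random_number


THRESHOLD = 2   # assumed predetermined threshold
UPPER = 10      # assumed largest possible random number

# A confidential item stored once has a reference count of 1. For every
# random number at or above the threshold, the first probe by another user
# is therefore still asked to upload, exactly as for data the cloud lacks.
first_probe_always_uploads = all(
    client_must_upload(1, r) for r in range(THRESHOLD, UPPER + 1)
)
```

Only once the reference count reaches the hidden random number does the device skip the transfer, and by then the probing user has already contributed copies of their own.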
In this embodiment, a data backup request message carrying the user ID and the HASH value of the backup data is received from the client, and the folder corresponding to the user ID and the folders of other users are queried, according to the fingerprint information of the backup data, for identical stored data. If identical data is found in the folder of another user, that data being the second historical data, it is judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. Consequently, when another user backs up to the cloud backup data whose content is a guess, then even if the cloud already holds data with identical content, the client must still transmit the backup data to the cloud as long as the quantity of reference data corresponding to the second historical data is less than its random number, the random number being greater than or equal to the predetermined threshold. Other users therefore cannot detect whether identical data has already been backed up in the database, which effectively prevents leakage of user data.
Fig. 2 is a flow chart of another embodiment of the data processing method of the present invention. In this embodiment, the method is executed by a data processing device located in cloud storage, and the technical solution is described in detail taking a HASH value as an example of the fingerprint information. As shown in Fig. 2, the method comprises:
Step 201: receive a data backup request message sent by a client, the data backup request message comprising a user ID and the HASH value of the backup data.
In this embodiment, the client computes the HASH value in units of backup data, then carries the HASH value of the backup data and the user ID in a data backup request message and sends it to the data processing device.
Step 202: according to the HASH value of the backup data, query the folder corresponding to the user ID and judge whether data identical to the backup data exists there. If not, perform Step 203; if so, perform Step 207.
Step 203: according to the HASH value of the backup data, query the folders corresponding to other user IDs and judge whether data identical to the backup data exists there, any such data being the second historical data. If it exists, perform Step 204; if not, perform Step 209.
Step 204: judge whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. If it is less, perform Step 205; if it is greater than or equal, perform Step 208. The random number is greater than or equal to a predetermined threshold.
In this embodiment, the random number corresponding to each item of historical data stored in the data processing device is randomly generated, so the random numbers of any two items of historical data may or may not be equal. The threshold for the random number may be set according to a statistical analysis of how users habitually back up confidential data. For example, users usually save confidential information only once, so the quantity of reference data for a user's confidential information would usually not exceed 2, and the threshold for the random number can be set to 2. The random number is then greater than the quantity of reference data, so source-side deduplication does not occur when other users save the data to the cloud. If the statistical analysis shows instead that users habitually back up confidential data one more time, the quantity of reference data for a user's confidential information would usually not exceed 3, and the threshold for the random number can be set to 3.
Step 205: send a backup message to the client, receive the backup data sent by the client, and generate reference data of the second historical data.
Step 206: add 1 to the quantity of reference data of the second historical data. The process ends.
Step 207: generate reference data of the first historical data, add 1 to the quantity of reference data of the first historical data, and then send a backup success message to the client. The process ends.
Step 208: generate the reference data corresponding to the second historical data, send a backup success message to the client, and perform Step 206.
Step 209: send a data backup request acknowledgment message to the client, receive and save the backup data sent by the client, and then generate the random number corresponding to the backup data.
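Steps 202 to 209 can be sketched as a small in-memory model. This is an illustrative sketch under stated assumptions, not the device's actual implementation: the class and method names, the global fingerprint index, and the injected `new_random_number` callable are all hypothetical. The sketch also assumes that the replies of Step 205 and Step 209 look the same to the client, and models both as `"SEND_DATA"`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Stored:
    data: bytes
    random_number: int
    ref_count: int = 1   # a newly saved item starts with one reference


class BackupDevice:
    def __init__(self, new_random_number: Callable[[], int]):
        self.new_random_number = new_random_number
        self.items: Dict[str, Stored] = {}       # fingerprint -> stored item
        self.folders: Dict[str, Set[str]] = {}   # user ID -> fingerprints

    def handle_request(self, user_id: str, fp: str) -> str:
        folder = self.folders.setdefault(user_id, set())
        if fp in folder:                          # Step 202 -> Step 207
            self.items[fp].ref_count += 1
            return "BACKUP_SUCCESS"
        item = self.items.get(fp)
        if item is None:                          # Step 203 -> Step 209
            return "SEND_DATA"
        if item.ref_count < item.random_number:   # Step 204 -> Step 205
            return "SEND_DATA"
        folder.add(fp)                            # Step 208, then Step 206
        item.ref_count += 1
        return "BACKUP_SUCCESS"

    def receive_data(self, user_id: str, fp: str, data: bytes) -> None:
        item = self.items.get(fp)
        if item is None:                          # Step 209: save, draw number
            self.items[fp] = Stored(data, self.new_random_number())
        else:                                     # Steps 205 and 206: the
            item.ref_count += 1                   # upload becomes a reference
        self.folders.setdefault(user_id, set()).add(fp)


device = BackupDevice(new_random_number=lambda: 3)  # fixed value for the demo
```

With the random number fixed at 3 for demonstration, the first owner and the next two users are all asked to upload; only the fourth user receives an immediate success reply.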
In this embodiment, for example, when a user backs up an item of backup data for the first time, the client first computes the HASH value in units of the backup data, then carries the HASH value of the backup data and the user ID in a data backup request message and sends it to the data processing device. On receiving the data backup request message, the data processing device queries the folder corresponding to the user ID according to the HASH value of the backup data and judges whether data identical to the backup data exists there. Because this is the user's first backup of the backup data, no identical data exists in that folder. The device therefore queries, according to the HASH value of the backup data, whether data identical to the backup data exists in the folders corresponding to other user IDs. If not, it sends a data backup request acknowledgment message to the client, receives and saves the backup data sent by the client, and then generates the random number corresponding to the backup data. The random number may take values in the range [2, N], where N is an integer; N may be 10. In addition, when the data processing device saves the backup data, the quantity of reference data corresponding to the backup data is 1.
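Generating a random number in the range [2, N] can be sketched as below. The embodiment states only that the number is randomly generated; the uniform distribution, the use of Python's `secrets` module, and the name `generate_random_number` are assumptions for illustration.

```python
import secrets


def generate_random_number(threshold: int = 2, n_max: int = 10) -> int:
    """Draw an integer uniformly from the closed range [threshold, n_max]."""
    if n_max < threshold:
        raise ValueError("n_max must be greater than or equal to threshold")
    # secrets.randbelow(k) returns an integer in [0, k).
    return threshold + secrets.randbelow(n_max - threshold + 1)


random_number = generate_random_number()   # some value from 2 to 10
```

A cryptographically strong source is used here because the scheme relies on the user not being able to predict the random number.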
If data identical to the backup data exists in the folders corresponding to the other user IDs, that data being the second historical data, it must be judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. If it is less, a backup message is sent to the client, the backup data is received, reference data of the second historical data is then generated, and the quantity of reference data corresponding to the second historical data is increased by 1.
When the user backs up the same backup data a second time, the client first computes the HASH value in units of the backup data; this HASH value equals the one computed when the user first backed up the backup data. The client carries the HASH value of the backup data and the user ID in a data backup request message and sends it to the data processing device. Because this is the user's second backup, the data processing device judges that data identical to the backup data exists in the folder corresponding to the user ID. It designates that data as the first historical data, generates reference data of the first historical data, adds 1 to the quantity of reference data of the first historical data, and then sends a backup success message to the client.
In this embodiment, a data backup request message carrying the user ID and the HASH value of the backup data is received from the client, and the folder corresponding to the user ID and the folders of other users are queried, according to the HASH value of the backup data, for identical stored data. If identical data is found in the folder of another user, that data being the second historical data, it is judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. Consequently, when another user backs up to the cloud backup data whose content is a guess, then even if the cloud already holds data with identical content, the client must still transmit the backup data to the cloud as long as the quantity of reference data corresponding to the second historical data is less than its random number, the random number being greater than or equal to the predetermined threshold. Other users therefore cannot detect whether identical data has already been backed up in the database, which effectively prevents leakage of user data. In addition, when the quantity of reference data corresponding to the second historical data is greater than or equal to the random number corresponding to the second historical data, the reference data corresponding to the second historical data is generated directly, which effectively improves client backup performance.
Fig. 3 is a structural diagram of one embodiment of the data processing device of the present invention. As shown in Fig. 3, the device of this embodiment comprises a transceiver module 11, a judging module 12 and a reference data generation module 13. The transceiver module 11 is configured to receive a data backup request message sent by a client, the data backup request message comprising a user ID and fingerprint information of backup data. The judging module 12 is configured to query, according to the fingerprint information of the backup data, the folder corresponding to the user ID and judge whether data identical to the backup data exists there; if no data identical to the backup data exists there, to query, according to the fingerprint information of the backup data, the folders corresponding to other user IDs and judge whether data identical to the backup data exists there; and, if data identical to the backup data exists in the folders corresponding to the other user IDs, that data being the second historical data, to judge whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, wherein the random number is greater than or equal to a predetermined threshold. The transceiver module 11 is further configured to, if the judging module 12 judges that the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data, send a backup message to the client and receive the backup data. The reference data generation module 13 is configured to generate reference data of the second historical data.
The data processing device of this embodiment can execute the technical solution of the method embodiment shown in Fig. 1; the principle is similar and is not repeated here.
In this embodiment, a data backup request message carrying the user ID and the fingerprint information of the backup data is received from the client, and the folder corresponding to the user ID and the folders of other users are queried, according to the fingerprint information of the backup data, for identical stored data. If identical data is found in the folder of another user, that data being the second historical data, it is judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. Consequently, when another user backs up to the cloud backup data whose content is a guess, then even if the cloud already holds data with identical content, the client must still transmit the backup data to the cloud as long as the quantity of reference data corresponding to the second historical data is less than its random number, the random number being greater than or equal to the predetermined threshold. Other users therefore cannot detect whether source-side deduplication has occurred, which effectively prevents leakage of user data.
Fig. 4 is a structural diagram of another embodiment of the data processing device of the present invention. As shown in Fig. 4, on the basis of the embodiment shown in Fig. 3, the judging module 12 is further configured to judge that data identical to the backup data exists in the folder corresponding to the user ID, that data being the first historical data; the reference data generation module 13 is further configured to generate reference data of the first historical data; and the transceiver module 11 is further configured to send a backup success message to the client.
Further, the reference data generation module 13 is also configured to generate the reference data corresponding to the second historical data if the judging module 12 judges that the quantity of reference data corresponding to the second historical data is greater than or equal to the random number corresponding to the second historical data; the transceiver module 11 is also configured to send a backup success message to the client.
Further, the device also comprises a reference data quantity recording module 14, configured to add 1 to the quantity of reference data corresponding to the first historical data, or to add 1 to the quantity of reference data corresponding to the second historical data.
Further, the transceiver module 11 is also configured to, if the judging module 12 judges that no data identical to the backup data exists in the folders corresponding to the other user IDs, send a backup message to the client and receive the backup data sent by the client. The device then further comprises a data storage module 15 and a random number generation module 16, where the data storage module 15 is configured to save the backup data and the random number generation module 16 is configured to generate the random number corresponding to the backup data.
The data processing device of this embodiment can execute the technical solution of the method embodiment shown in Fig. 2; the principle is similar and is not repeated here.
In this embodiment, a data backup request message carrying the user ID and the fingerprint information of the backup data is received from the client, and the folder corresponding to the user ID and the folders of other users are queried, according to the fingerprint information of the backup data, for identical stored data. If identical data is found in the folder of another user, that data being the second historical data, it is judged whether the quantity of reference data corresponding to the second historical data is less than the random number corresponding to the second historical data. Consequently, when another user backs up to the cloud backup data whose content is a guess, then even if the cloud already holds data with identical content, the client must still transmit the backup data to the cloud as long as the quantity of reference data corresponding to the second historical data is less than its random number, the random number being greater than or equal to the predetermined threshold. Other users therefore cannot detect whether source-side deduplication has occurred, which effectively prevents leakage of user data. In addition, when the quantity of reference data corresponding to the second historical data is greater than or equal to the random number corresponding to the second historical data, the reference data corresponding to the second historical data is generated directly, which effectively improves client backup performance.
Fig. 5 is a structural diagram of one embodiment of the data processing system of the present invention. As shown in Fig. 5, the system comprises a client 21 and a data processing device 22. The data processing device 22 may be the device shown in Fig. 3 or Fig. 4 and can execute the technical solution of the method embodiment shown in Fig. 1 or Fig. 2; the principle is similar and is not repeated here.
Persons of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware operating under program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disk or optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.