CN105791434A - Distributed data processing method and data center - Google Patents



Publication number
CN105791434A
Authority
CN
China
Prior art keywords
data
slice
random number
distributed
data node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610271917.4A
Other languages
Chinese (zh)
Inventor
张锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Longrise Technology Co Ltd
Original Assignee
Shenzhen Longrise Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Longrise Technology Co Ltd
Priority to CN201610271917.4A
Publication of CN105791434A


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a distributed data processing method and a data center. The distributed data processing method comprises: when the data center receives original data, slicing the original data to obtain multiple pieces of slice data; generating, by the data center, a random number based on a preset random number generator and using it as the current random number; sending, by the data center, the current random number to each data node in the distributed data system; encrypting, by the data center, each piece of slice data according to the current random number, a prestored key and a preset encryption algorithm to obtain ciphertext slice data; and distributing, by the data center, the ciphertext slice data to the corresponding data nodes for storage according to the current random number. The distributed data processing method and the data center prevent an unauthorized party from arbitrarily intercepting and cracking the ciphertext slice data and ensure the information security of the slice data of the distributed data system.

Description

Distributed data processing method and data center
Technical field
The present invention relates to the technical field of distributed data systems, and in particular to a distributed data processing method and a data center.
Background art
With the deepening of informatization, the business systems of all industries produce more and more data. A traditional centralized data system (one in which a specific access area is designated for the data to be stored, and all data is stored in and retrieved from that area) keeps all data on the data center host, from which it is extracted centrally. If the data center host fails, the whole system can neither store nor extract data; that is, a centralized data system offers lower data stability than a distributed data system. Distributed data systems emerged for this reason.
In existing distributed systems, the general flow for storing big data is as follows: the data center slices the original data and, once slicing is complete, synchronizes the slice data from the data center to each of the other data nodes of the distributed data system (that is, the nodes other than the data center). However, because a distributed data system is highly open, slice data is easily intercepted or cracked while it is being synchronized from the data center to the data nodes. An unauthorized party can extract multiple pieces of slice data and restore the corresponding original data, which creates an information security problem for the slice data of the distributed data system.
Summary of the invention
The main object of the present invention is to provide a distributed data processing method and a data center, aiming to solve the information security problem of slice data in existing distributed data systems.
To achieve the above object, the present invention provides a distributed data processing method, which includes:
When the data center receives original data, performing data slicing on the original data to obtain multiple pieces of slice data;
The data center generates a random number based on a preset random number generator and uses the random number as the current random number;
The data center sends the current random number to each data node in the distributed data system;
The data center encrypts each piece of slice data according to the current random number, a prestored key and a preset encryption algorithm, to obtain ciphertext slice data;
The data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
Preferably, the step of performing data slicing on the original data to obtain multiple pieces of slice data when the data center receives original data includes:
When the data center receives original data, the data center parses the received original data to obtain the data parameters of the original data;
The data center divides the task of slicing the original data into multiple sub-slicing tasks according to the data parameters and the performance of the data center;
The data center allocates a preset proportion of the sub-slicing tasks, as outsourced slicing tasks, to data nodes of the distributed data system other than the data center;
The data center executes the unallocated, self-retained sub-slicing tasks and saves the corresponding first slice data, and obtains the second slice data produced by the data nodes executing the outsourced slicing tasks, wherein the slice data includes the first slice data and the second slice data.
Preferably, after the step in which the data center executes the unallocated, self-retained sub-slicing tasks, saves the corresponding first slice data, and obtains the second slice data produced by the data nodes executing the outsourced slicing tasks, the method further includes:
The data center publishes the first slice data and the second slice data in real time for access by clients of the distributed data system.
Preferably, after the step in which the data center publishes the first slice data and the second slice data in real time for access by clients of the distributed data system, the method further includes:
When a data request sent by a client is received, the data center decrypts and restores the target data corresponding to the data request according to the first slice data and the current random number;
If restoring the target data fails, the data center extracts the second slice data and decrypts and restores the target data according to the first slice data, the second slice data and the current random number.
Preferably, the step in which the data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number includes:
The data center matches the current random number against the storage identification code of each data node, wherein a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification code;
The data center distributes the ciphertext slice data to the data nodes whose storage identification codes have a matching degree with the current random number greater than a preset matching degree.
The present invention also provides a data center, which includes:
A slicing module, configured to perform data slicing on original data to obtain multiple pieces of slice data when the data center receives the original data;
A random number module, configured to generate a random number based on a preset random number generator and use the random number as the current random number;
A first sending module, configured to send the current random number to each data node in the distributed data system;
An encryption module, configured to encrypt each piece of slice data according to the current random number, a prestored key and a preset encryption algorithm, to obtain ciphertext slice data;
A second sending module, configured to distribute the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
Preferably, the slicing module includes:
A parsing unit, configured to parse the received original data when the data center receives original data, to obtain the data parameters of the original data;
A task division unit, configured to divide the task of slicing the original data into multiple sub-slicing tasks according to the data parameters and the performance of the data center;
A task allocation unit, configured to allocate a preset proportion of the sub-slicing tasks, as outsourced slicing tasks, to data nodes of the distributed data system other than the data center;
A task execution unit, configured to execute the unallocated, self-retained sub-slicing tasks, save the corresponding first slice data, and obtain the second slice data produced by the data nodes executing the outsourced slicing tasks, wherein the slice data includes the first slice data and the second slice data.
Preferably, the data center further includes:
A publishing module, configured to publish the first slice data and the second slice data in real time for access by clients of the distributed data system.
Preferably, the data center further includes:
A first restoration module, configured to, when a data request sent by a client is received, decrypt and restore the target data corresponding to the data request according to the first slice data and the current random number;
A second restoration module, configured to, if restoring the target data fails, extract the second slice data and decrypt and restore the target data according to the first slice data, the second slice data and the current random number.
Preferably, the second sending module includes:
A matching unit, configured to match the current random number against the storage identification code of each data node, wherein a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification code;
A distribution unit, configured to distribute the ciphertext slice data to the data nodes whose storage identification codes have a matching degree with the current random number greater than a preset matching degree.
The present invention slices the original data into multiple pieces of slice data and then encrypts each piece according to the generated current random number, the prestored key and the preset encryption algorithm to obtain ciphertext slice data, so that slice data is transmitted between the data center and the data nodes in ciphertext form. This prevents an unauthorized party from arbitrarily intercepting and cracking the ciphertext slice data and ensures the information security of the slice data of the distributed data system. Meanwhile, the data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number, that is, at random, which improves the security of the slice data with respect to its storage location.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the first embodiment of the distributed data processing method of the present invention;
Fig. 2 is a detailed schematic flowchart, in the second embodiment of the distributed data processing method of the present invention, of performing data slicing on the original data to obtain multiple pieces of slice data when the data center receives the original data;
Fig. 3 is a detailed schematic flowchart, in the fourth embodiment of the distributed data processing method of the present invention, of the data center distributing the ciphertext slice data to the corresponding data nodes for storage according to the current random number;
Fig. 4 is a functional block diagram of the first embodiment of the data center of the present invention;
Fig. 5 is a detailed functional block diagram of the slicing module in the second embodiment of the data center of the present invention;
Fig. 6 is a functional block diagram of the third embodiment of the data center of the present invention;
Fig. 7 is a detailed functional block diagram of the second sending module in the fourth embodiment of the data center of the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the invention
It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
The present invention provides a distributed data processing method. In the first embodiment of the distributed data processing method of the present invention, with reference to Fig. 1, the method includes:
Step S10, when the data center receives original data, data slicing is performed on the original data to obtain multiple pieces of slice data;
The data center may include at least one server, through which it obtains original data. The original data may be files, data packets or the like uploaded by a user over a network connection or a local connection. When the data center receives original data that needs to be stored, it slices the original data to obtain multiple pieces of slice data.
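As an illustration of step S10, the following is a minimal sketch that cuts a byte string into consecutive fixed-size slices. The patent does not specify how slice boundaries are chosen, so the fixed `slice_size` and the function name `slice_data` are assumptions made only for this example.

```python
def slice_data(raw: bytes, slice_size: int) -> list[bytes]:
    """Split raw into consecutive slices of at most slice_size bytes.

    Fixed-size slicing is an assumption; the source only says the
    original data is cut into multiple pieces of slice data.
    """
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [raw[i:i + slice_size] for i in range(0, len(raw), slice_size)]


if __name__ == "__main__":
    slices = slice_data(b"abcdefghij", 4)
    # Joining the slices in order gives back the original data.
    print(slices, b"".join(slices) == b"abcdefghij")
```

Keeping the slices in order is what allows a later step to rebuild part or all of the original data by simple concatenation.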
Step S20, the data center generates a random number based on a preset random number generator and uses the random number as the current random number;
A random number generator is provided inside the data center. After the original data has been sliced, the random number generator is controlled to generate a random number, which serves as the current random number; that is, each time the data center receives new original data, a corresponding random number is generated. Optionally, the preset random number generator is a true random number generator, that is, one that derives random numbers from physical events rather than from a software algorithm.
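Step S20 could be sketched as below, assuming a six-digit decimal format (the format used in the fourth embodiment's example) and Python's `secrets` module as the generator. `secrets` draws from the operating system's cryptographic random source, which is not necessarily the hardware true random number generator the text mentions as an option; the function name is hypothetical.

```python
import secrets


def new_current_random_number(digits: int = 6) -> str:
    """Return a fresh random number as a zero-padded decimal string.

    The six-digit default is an assumption taken from the example in the
    fourth embodiment; the OS random source stands in for the preset
    random number generator.
    """
    if digits <= 0:
        raise ValueError("digits must be positive")
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


if __name__ == "__main__":
    # One new value would be generated for each newly received original data.
    print(new_current_random_number())
```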
Step S30, the data center sends the current random number to each data node in the distributed data system;
The data center distributes the generated current random number to each data node in the distributed data system, so that each data node can decrypt and save the ciphertext slice data encrypted based on the current random number.
Step S40, the data center encrypts each piece of slice data according to the current random number, a prestored key and a preset encryption algorithm, to obtain ciphertext slice data;
Step S50, the data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
The prestored key and the preset encryption algorithm are stored uniformly in the data center and in each data node. The data center encrypts each piece of slice data according to the current random number, which is generated in real time, the prestored key and the preset encryption algorithm (for example, a symmetric encryption algorithm such as DES or 3DES, or the asymmetric RSA (Rivest-Shamir-Adleman) algorithm) to obtain ciphertext slice data, ensuring that slice data is transmitted between the data center and the data nodes in ciphertext form. Meanwhile, the data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number; that is, the storage of the ciphertext slice data has a randomness corresponding to the current random number, which further improves the security of the slice data with respect to its storage location.
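To make the roles of the three inputs in step S40 concrete, the sketch below combines a prestored key and the current random number into a per-slice keystream. It is a teaching stand-in built from the standard library (HMAC-SHA256 in counter mode, XORed with the data), not the DES, 3DES or RSA algorithms named in the text and not a vetted cipher; mixing the slice index into the nonce is also an assumption added here so that two slices never share a keystream.

```python
import hashlib
import hmac


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_slice(data: bytes, key: bytes, current_random: str, index: int) -> bytes:
    """XOR the slice with a keystream bound to the key, random number and index."""
    nonce = f"{current_random}:{index}".encode()
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


# XOR with the same keystream is its own inverse, so decryption reuses the function.
decrypt_slice = encrypt_slice


if __name__ == "__main__":
    key = b"prestored-key"  # hypothetical key shared by data center and data nodes
    cipher = encrypt_slice(b"hello slice", key, "123456", 0)
    print(decrypt_slice(cipher, key, "123456", 0))
```

A data node that holds the same key and has received the same current random number can recover the slice, while a party missing either value cannot derive the keystream. A real deployment would use an established cipher from a maintained cryptography library.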
In the present embodiment, the original data is sliced into multiple pieces of slice data, and each piece is then encrypted according to the generated current random number, the prestored key and the preset encryption algorithm to obtain ciphertext slice data, so that slice data is transmitted between the data center and the data nodes in ciphertext form. This prevents an unauthorized party from arbitrarily intercepting and cracking the ciphertext slice data and ensures the information security of the slice data of the distributed data system. Meanwhile, the data center distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number, that is, at random, which improves the security of the slice data with respect to its storage location.
Further, on the basis of the first embodiment of the distributed data processing method of the present invention, a second embodiment of the distributed data processing method is proposed. With reference to Fig. 2, in the second embodiment, step S10 includes:
Step S11, when the data center receives original data, the data center parses the received original data to obtain the data parameters of the original data;
The data parameters include the data volume of the original data, its data type, its version information, whether slicing is supported, and similar parameters. The data volume represents the size of the original data; data types include picture data, text data, video data and so on; and the version information indicates the data slicing modes and versions that the original data supports.
Step S12, the data center divides the task of slicing the original data into multiple sub-slicing tasks according to the data parameters and the performance of the data center;
The data center judges, according to the data parameters of the original data, whether the original data supports data slicing. If it does, the data center goes on to obtain its own data slicing performance (that is, how much data it can slice per unit time) and the data volume of the original data, and then divides the slicing task of the original data into multiple sub-slicing tasks according to the data center performance and the data volume of the original data. For example, if the data volume of the original data is 100 units and the data center can slice 10 units of original data per unit time, the slicing task of the original data can be divided into 10 sub-slicing tasks. Of course, multiple factors such as the data parameters of the original data, the data center performance and the performance of each data node may also be considered together to determine how many sub-slicing tasks the slicing task is divided into. Preferably, every sub-slicing task has the same task amount, which facilitates the scheduling and management of the sub-slicing tasks.
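The arithmetic in the example above (100 units of data, 10 units sliced per unit time, hence 10 sub-slicing tasks) can be sketched as follows. Representing each sub-slicing task as a half-open `(start, end)` range over the data is an assumption made for illustration; the function name is hypothetical.

```python
def divide_slicing_task(data_volume: int, units_per_time: int) -> list[tuple[int, int]]:
    """Divide a slicing task into equal-sized sub-slicing tasks.

    Each task is a half-open (start, end) range in data units. Only the
    last task may be smaller when the volume is not an exact multiple.
    """
    if data_volume < 0 or units_per_time <= 0:
        raise ValueError("data_volume must be >= 0 and units_per_time > 0")
    return [
        (start, min(start + units_per_time, data_volume))
        for start in range(0, data_volume, units_per_time)
    ]


if __name__ == "__main__":
    tasks = divide_slicing_task(100, 10)
    print(len(tasks), tasks[0], tasks[-1])
```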
Step S13, the data center allocates a preset proportion of the sub-slicing tasks, as outsourced slicing tasks, to data nodes of the distributed data system other than the data center;
The preset proportion may be determined according to the amount of slicing work the data center can undertake (that is, the performance of the data center). The data center allocates to the distributed data system the sub-slicing tasks that it does not have time to process locally (the outsourced slicing tasks) and keeps the sub-slicing tasks that it can process locally (the self-retained sub-slicing tasks), so that the data center and the data nodes slice the original data simultaneously.
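One possible reading of step S13 is sketched below: a preset proportion of the sub-slicing tasks is handed to the data nodes and the remainder is kept by the data center. The round-robin assignment to nodes and the names `allocate_tasks` and `outsourced_ratio` are assumptions; the text only says that a preset proportion is outsourced.

```python
def allocate_tasks(
    sub_tasks: list[int], outsourced_ratio: float, node_ids: list[str]
) -> tuple[list[int], dict[str, list[int]]]:
    """Split sub-slicing tasks into self-retained and outsourced sets.

    Returns (retained, assignment), where assignment maps each data node
    to the outsourced tasks it receives (round-robin, an assumption).
    """
    if not 0.0 <= outsourced_ratio <= 1.0:
        raise ValueError("outsourced_ratio must be between 0 and 1")
    n_out = round(len(sub_tasks) * outsourced_ratio)
    if n_out > 0 and not node_ids:
        raise ValueError("no data nodes available for outsourced tasks")
    split = len(sub_tasks) - n_out
    retained, outsourced = sub_tasks[:split], sub_tasks[split:]
    assignment: dict[str, list[int]] = {node: [] for node in node_ids}
    for i, task in enumerate(outsourced):
        assignment[node_ids[i % len(node_ids)]].append(task)
    return retained, assignment


if __name__ == "__main__":
    retained, assignment = allocate_tasks(list(range(10)), 0.6, ["A", "B"])
    print(retained, assignment)
```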
Step S14, the data center executes the unallocated, self-retained sub-slicing tasks and saves the corresponding first slice data, and obtains the second slice data produced by the data nodes executing the outsourced slicing tasks, wherein the slice data includes the first slice data and the second slice data.
The data center executes the self-retained sub-slicing tasks, slicing the part of the original data covered by those tasks to obtain multiple pieces of first slice data. At the same time, the data center controls the data nodes to execute the outsourced slicing tasks they have each received, obtaining and saving the corresponding pieces of second slice data. That is, slice data saved by the data center is called first slice data and slice data saved by a data node is called second slice data, thereby achieving data slicing based on distributed storage.
In the present embodiment, the data parameters of the original data are obtained first; the slicing task of the original data is then divided into multiple sub-slicing tasks according to the data parameters and the data center performance, and a preset proportion of the sub-slicing tasks is allocated to other data nodes as outsourced slicing tasks. The data center and the data nodes then each execute their own sub-slicing tasks, so that the data center obtains and saves the first slice data and the data nodes obtain and save the second slice data. The data center and the data nodes thus slice the original data in parallel, which avoids having the data center alone slice very large or massive original data. This shortens the time needed to slice the original data, prevents slicing from delaying the normal operation of other processes of the distributed data system, and ensures the working efficiency of the distributed data system.
Further, on the basis of the second embodiment of the distributed data processing method of the present invention, a third embodiment of the distributed data processing method is proposed. In the third embodiment, the method further includes, after step S14:
Step S15, the data center publishes the first slice data and the second slice data in real time for access by clients of the distributed data system.
The data center publishes in real time the first slice data saved by the data center and the second slice data saved by the data nodes, for clients of the distributed data system to access and download. For example, suppose the original data is map cache data. The whole map cache is very large, but each request needs only a part of it. When the map cache data has been sliced into multiple pieces of first slice data and second slice data saved in the data center and in different data nodes, the data center publishes each piece of first and second slice data in real time; that is, it publishes each map slice of the map cache data in real time, so that users can access and download the map slices they need through a client of the distributed data system. The data center holds the correspondence between each map slice and its storage address.
In the present embodiment, by publishing the first slice data and the second slice data in real time, the data center makes it easy for a user to select the one or more pieces of slice data needed (first slice data and second slice data) through a client of the distributed data system. After the user has determined the required slice data, the data center restores the slice data chosen by the user to the corresponding part of the original data, based on the division rule of the sub-slicing tasks, for the user to download, thereby simplifying the interaction between the distributed data system and the user.
Preferably, the method further includes, after step S15:
Step S16, when a data request sent by a client is received, the data center decrypts and restores the target data corresponding to the data request according to the first slice data and the current random number;
Step S17, if restoring the target data fails, the data center extracts the second slice data and decrypts and restores the target data according to the first slice data, the second slice data and the current random number.
When the data center receives a data request sent by a client of the distributed data system, that is, when a user asks the data center for the slice data they need, the data center first uses the first slice data saved locally and, based on the division rule of the sub-slicing tasks, restores the first slice data to the target data corresponding to the data request (the target data is a part of the original data). If the restoration succeeds, the data center sends the generated target data to the client that issued the data request. If the restoration fails, the data center extracts the second slice data saved by the data nodes and restores the corresponding target data in the original data according to the first slice data, the second slice data and the division rule of the sub-slicing tasks.
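The local-first restoration with fallback described in steps S16 and S17 might look like the sketch below. Slices are keyed by their index under the division rule, restoration is plain concatenation, and decryption is omitted to keep the example short; `fetch_second` is a hypothetical callback standing in for the extraction of second slice data from the data nodes.

```python
from typing import Callable


def restore_target(
    wanted: list[int],
    first_slices: dict[int, bytes],
    fetch_second: Callable[[list[int]], dict[int, bytes]],
) -> bytes:
    """Restore target data from local slices, falling back to data nodes.

    `wanted` lists the slice indexes that make up the target data, in order.
    """
    try:
        # S16: try to restore from the first slice data held locally.
        return b"".join(first_slices[i] for i in wanted)
    except KeyError:
        # S17: local restoration failed, so extract the missing second slice data.
        missing = [i for i in wanted if i not in first_slices]
        merged = {**first_slices, **fetch_second(missing)}
        return b"".join(merged[i] for i in wanted)


if __name__ == "__main__":
    local = {0: b"ab", 1: b"cd"}
    remote = {2: b"ef", 3: b"gh"}
    print(restore_target([1, 2], local, lambda ids: {i: remote[i] for i in ids}))
```

Trying the local slices first means that the data nodes are contacted only when the request actually touches slices the data center does not hold.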
In the present embodiment, when slice data needs to be restored to obtain target data, the required slice data is first obtained locally from the data center. If the first slice data held locally by the data center cannot restore the target data, the required slice data (the second slice data) can be extracted from the data nodes of the distributed data system and the target data restored from it. On the one hand, obtaining the slice data for restoring the target data from the data center first ensures the efficiency of restoration; on the other hand, storing the first slice data and the second slice data that can be restored to the target data separately in the data center and in the data nodes ensures the security of slice data storage.
In addition, the method further includes, after step S15:
Step S18, when the real-time bandwidth of the distributed data system is greater than a preset bandwidth, slice data is transmitted between the data center and the data nodes.
When the real-time bandwidth of the distributed data system is greater than the preset bandwidth, that is, when the distributed data system is not busy, slice data can be transmitted between the data center and the data nodes. For example, a data node may send second slice data that it will no longer keep to the data center for archiving; or, for second slice data that clients of the distributed data system extract frequently, the data node holding it may send it to the data center for storage so that clients can extract it readily.
Preferably, the step of transmitting slice data between the data center and the data nodes is:
When the frequency of extracting second slice data is greater than a set frequency, the data center receives the second slice data sent by the data node.
When clients of the distributed data system request the second slice data of a certain data node (for example, data node A) from the data center at a frequency greater than the set frequency, that is, when the data center extracts the second slice data of data node A at a frequency greater than the set frequency, the data center controls the data node to send the frequently extracted second slice data to the data center. The data center receives and saves the second slice data sent by data node A, so that it can extract this second slice data more quickly to generate the corresponding target data.
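The two conditions described for step S18 (bandwidth above a preset value, and extraction frequency above a set value) can be combined in a small policy object, sketched below. Counting extractions with a simple counter, treating the real-time bandwidth as the currently available bandwidth, and the class and method names are all assumptions for illustration.

```python
from collections import Counter


class SliceMigrationPolicy:
    """Decide which second slice data the data center should pull in."""

    def __init__(self, set_frequency: int, preset_bandwidth: float) -> None:
        self.set_frequency = set_frequency
        self.preset_bandwidth = preset_bandwidth
        self.fetch_counts = Counter()  # slice id -> number of extractions

    def record_fetch(self, slice_id: str) -> None:
        """Note that the data center extracted this slice from a data node."""
        self.fetch_counts[slice_id] += 1

    def slices_to_pull(self, real_time_bandwidth: float) -> list[str]:
        """Return frequently extracted slices, but only when bandwidth allows."""
        if real_time_bandwidth <= self.preset_bandwidth:
            return []  # system is busy, so defer any transfer
        return sorted(
            slice_id
            for slice_id, count in self.fetch_counts.items()
            if count > self.set_frequency
        )


if __name__ == "__main__":
    policy = SliceMigrationPolicy(set_frequency=2, preset_bandwidth=100.0)
    for _ in range(3):
        policy.record_fetch("A/slice-7")
    policy.record_fetch("B/slice-1")
    print(policy.slices_to_pull(150.0))
```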
In the present embodiment, slice data is transmitted between the data center and the data nodes only when the bandwidth of the distributed data system is idle, which makes effective use of the bandwidth of the distributed data system and avoids affecting its other normal services. At the same time, exchanging slice data between the data center and the data nodes achieves a reasonable distribution of slice data across the data center and the data nodes.
Further, on the basis of the first embodiment of the distributed data processing method of the present invention, a fourth embodiment of the distributed data processing method is proposed. With reference to Fig. 3, in the fourth embodiment, step S50 includes:
Step S51, the data center matches the current random number against the storage identification code of each data node, wherein a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification code;
Step S52, the data center distributes the ciphertext slice data to the data nodes whose storage identification codes have a matching degree with the current random number greater than a preset matching degree.
After the current random number is generated, it is matched against the storage identification code of each data node to obtain the matching degree between each storage identification code and the current random number. A preset matching degree is set in advance, and the storage identification codes whose matching degree with the current random number is greater than the preset matching degree are filtered out. The addresses of the corresponding data nodes are then determined from the filtered storage identification codes, and the ciphertext slice data is distributed, according to the determined addresses, to the data nodes corresponding to those codes. For example, if the storage identification code is a six-digit natural number, the current random number is also a six-digit natural number; the smaller the difference between the numerical value of the storage identification code and that of the current random number, the higher the matching degree between them.
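The six-digit example above fixes only the ordering (a smaller numerical difference means a higher matching degree), not a formula. The sketch below assumes one simple formula that respects that ordering, namely one minus the difference divided by the largest possible difference, and then filters the storage identification codes against a preset matching degree. The function names and the formula itself are assumptions.

```python
def matching_degree(current_random: str, storage_id: str) -> float:
    """Return a value in [0, 1]; closer numeric values score higher.

    Assumed formula: 1 - |a - b| / (10**digits - 1). The source only
    states that a smaller difference means a higher matching degree.
    """
    if len(current_random) != len(storage_id):
        raise ValueError("random number and storage id must share a format")
    max_diff = 10 ** len(storage_id) - 1
    return 1.0 - abs(int(current_random) - int(storage_id)) / max_diff


def select_nodes(current_random: str, storage_ids: list[str], preset_degree: float) -> list[str]:
    """Keep the storage ids whose matching degree exceeds the preset degree."""
    return [
        sid for sid in storage_ids
        if matching_degree(current_random, sid) > preset_degree
    ]


if __name__ == "__main__":
    ids = ["499000", "900000", "510000"]
    print(select_nodes("500000", ids, 0.95))
```

Because the current random number changes with each newly received original data, the set of selected nodes changes as well, which is what spreads ciphertext slice data across different data nodes.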
In this embodiment, the current random number is matched against the storage identification code of each data node to obtain, for each data node, the matching degree between its storage identification code and the current random number. The data nodes whose matching degree exceeds the preset matching degree are selected, and the ciphertext slice data is distributed to them. The slice data is thus stored in different data nodes at random, according to the current random number. Randomizing the storage location makes it harder for an unauthorized party to restore the original data from the slice data and improves the security of the slice data in storage. At the same time, because slice data is stored at random based on the current random number, it is not concentrated on a single data node or a few data nodes, and the storage resources of the data nodes in the distributed system are fully used.
The present invention also provides a data center. In a first embodiment of the data center, referring to Fig. 4, the data center includes:
A slicing module 10, configured to perform data slicing on the original data when the data center receives the original data, so as to obtain multiple pieces of slice data;
The data center may include at least one server, through which it obtains original data. The original data may be files, data packets, or the like that a user uploads over a network connection or a local connection. When the data center receives original data that needs to be stored, the slicing module 10 slices the original data to obtain multiple pieces of slice data.
A random number module 20, configured to generate a random number using a preset random number generator and to take this random number as the current random number;
A random number generator is provided inside the data center. After the original data has been sliced, the random number module 20 controls the random number generator to generate a random number, which serves as the current random number. That is, each time the data center receives new original data, a corresponding random number is generated. Optionally, the preset random number generator is a true random number generator, i.e. one that derives random numbers from physical events rather than from a software algorithm.
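As a rough sketch of generating one current random number per piece of original data, the snippet below uses Python's standard `secrets` module. Note the assumptions: `secrets` draws from the operating system's cryptographically secure generator, which is not necessarily the physical-event true random number generator the text describes as optional, and the six-digit format and the name `generate_current_random` are chosen only for illustration.

```python
import secrets


def generate_current_random(digits: int = 6) -> str:
    """Return a zero-padded random natural number with the given number of digits."""
    if digits < 1:
        raise ValueError("digits must be positive")
    value = secrets.randbelow(10 ** digits)  # uniform over 0 .. 10**digits - 1
    return f"{value:0{digits}d}"


# One fresh random number would be generated for each newly received original data.
current_random = generate_current_random()
print(current_random)
```

Returning a fixed-width string keeps the random number in the same format as a fixed-width identification code, which matters if the two are later compared digit for digit.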
A first sending module 30, configured to send the current random number to each data node in the distributed data system;
The first sending module 30 distributes the generated current random number to each data node in the distributed data system, so that each data node can decrypt and save the ciphertext slice data encrypted on the basis of the current random number.
An encrypting module 40, configured to encrypt each piece of slice data according to the current random number, a pre-stored key, and a preset encryption algorithm, so as to obtain ciphertext slice data;
A second sending module 50, configured to distribute the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
The pre-stored key and the preset encryption algorithm are stored uniformly in the data center and in each data node. The encrypting module 40 encrypts each piece of slice data according to the current random number (which is generated in real time), the pre-stored key, and the preset encryption algorithm (for example a symmetric encryption algorithm such as DES or 3DES, or an asymmetric encryption algorithm) to obtain ciphertext slice data, ensuring that slice data travels between the data center and the data nodes in ciphertext form. Meanwhile, the second sending module 50 distributes the ciphertext slice data to the corresponding data nodes for storage according to the current random number. The storage of the ciphertext slice data thus has a randomness tied to the current random number, which further improves the security of the slice data with respect to its storage location.
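The way a shared key and a per-upload random number can jointly determine the ciphertext may be sketched as below. This is a teaching stand-in only: instead of DES or 3DES it XORs each slice with an HMAC-SHA256 keystream so that the example runs with the Python standard library alone, it provides no integrity protection, and it should not be used as production cryptography. The names `encrypt_slice`, `decrypt_slice`, and the per-slice `index` parameter are assumptions introduced for the example.

```python
import hashlib
import hmac


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce (HMAC-SHA256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_slice(slice_data: bytes, prestored_key: bytes, current_random: str, index: int) -> bytes:
    """XOR one slice with a keystream bound to the key, the current random number, and the slice index."""
    nonce = current_random.encode("ascii") + index.to_bytes(4, "big")
    stream = _keystream(prestored_key, nonce, len(slice_data))
    return bytes(a ^ b for a, b in zip(slice_data, stream))


# XOR with the same keystream undoes itself, so decryption reuses the same function.
decrypt_slice = encrypt_slice

key = b"key shared by data center and data nodes"  # hypothetical pre-stored key
plain = b"slice payload that must not travel in clear text"
cipher = encrypt_slice(plain, key, "482913", index=0)
restored = decrypt_slice(cipher, key, "482913", index=0)
```

Because a data node holds the same pre-stored key and receives the current random number from the first sending module, it can recompute the keystream and recover the slice, while a party missing either value cannot.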
In this embodiment, the slicing module 10 slices the original data into multiple pieces of slice data, and the encrypting module 40 then encrypts each piece of slice data according to the generated current random number, the pre-stored key, and the preset encryption algorithm to obtain ciphertext slice data. Slice data therefore travels between the data center and the data nodes in ciphertext form, which prevents unauthorized parties from intercepting and cracking it at will and safeguards the slice data of the distributed data system. Meanwhile, the second sending module 50 distributes the ciphertext slice data at random to the corresponding data nodes for storage according to the current random number, improving the security of the slice data with respect to its storage location.
Further, on the basis of the first embodiment of the data center of the present invention, a second embodiment of the data center is proposed. Referring to Fig. 5, in the second embodiment, the slicing module 10 includes:
A parsing unit 11, configured to parse the received original data when the data center receives the original data, so as to obtain the data parameters of the original data;
The data parameters include the data volume of the original data, its data type, version information, whether slicing is supported, and similar parameters. The data volume indicates the size of the original data; the data type may be image data, text data, video data, and so on; and the version information indicates the data slicing modes and versions that the original data supports.
A task division unit 12, configured to divide the task of slicing the original data into multiple slicing subtasks according to the data parameters and the performance of the data center;
Based on the data parameters of the original data, the task division unit 12 determines whether the original data supports data slicing. If it does, the task division unit 12 goes on to obtain the data center's data slicing performance (i.e. the volume of data it can slice per unit of time) and the data volume of the original data, and then divides the slicing task for the original data into multiple slicing subtasks according to the data center's performance and the data volume of the original data. For example, if the data volume of the original data is 100 units and the data center can slice 10 units of original data per unit of time, the slicing task can be divided into 10 slicing subtasks. Of course, the number of slicing subtasks may also be determined by weighing several factors together, such as the data parameters of the original data, the performance of the data center, and the performance of each data node. Preferably, every slicing subtask has the same task volume, which simplifies the scheduling and management of the subtasks.
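The 100-unit example above can be worked through in a few lines. The sketch assumes the simplest rule consistent with the text, namely one subtask per unit of time of slicing work, with subtasks kept as equal in size as possible; the function name `divide_slicing_task` and the representation of a subtask as a `(start, end)` range are assumptions.

```python
import math


def divide_slicing_task(data_volume: int, units_per_time: int) -> list:
    """Split [0, data_volume) into near-equal (start, end) ranges, one per unit of slicing time."""
    if data_volume <= 0 or units_per_time <= 0:
        raise ValueError("data volume and performance must be positive")
    count = math.ceil(data_volume / units_per_time)  # number of slicing subtasks
    size = math.ceil(data_volume / count)            # task volume of each subtask
    return [(start, min(start + size, data_volume)) for start in range(0, data_volume, size)]


subtasks = divide_slicing_task(data_volume=100, units_per_time=10)
print(len(subtasks), subtasks[0], subtasks[-1])
```

With 100 units of data and a slicing performance of 10 units per unit of time, this yields the 10 equal subtasks of the example; a fuller rule could also weigh the performance of each data node, as the text notes.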
A task allocation unit 13, configured to allocate a preset proportion of the slicing subtasks, as outsourced slicing subtasks, to data nodes in the distributed data system other than the data center;
The preset proportion can be determined from the slicing workload that the data center can take on (i.e. the data center's performance). The task allocation unit 13 allocates the slicing subtasks that the data center does not have time to process (the outsourced slicing subtasks) to data nodes in the distributed data system, and keeps the slicing subtasks that can be processed locally (the retained slicing subtasks), so that the data center and the data nodes slice the original data concurrently.
A task execution unit 14, configured to execute the unallocated, retained slicing subtasks and save the corresponding first slice data, and to obtain the second slice data produced by the data nodes executing the outsourced slicing subtasks, where the slice data includes the first slice data and the second slice data.
The task execution unit 14 executes the retained slicing subtasks, slicing the part of the original data covered by those subtasks to obtain multiple pieces of first slice data. Meanwhile, the task execution unit 14 controls the data nodes to execute the outsourced slicing subtasks they have each received, obtaining and saving the corresponding pieces of second slice data. In other words, the slice data saved by the task execution unit 14 is called first slice data and the slice data saved by each data node is called second slice data, which realizes data slicing based on distributed storage.
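The split into retained and outsourced slicing subtasks, and the resulting first and second slice data, might look like the following sketch. It runs everything in one process for clarity, whereas the text has the data nodes execute the outsourced subtasks themselves. The names `allocate_subtasks` and `run_subtasks`, the byte-range representation of a subtask, and the 40% ratio are assumptions.

```python
def allocate_subtasks(subtasks: list, preset_ratio: float) -> tuple:
    """Keep the leading subtasks locally and outsource the preset proportion at the tail."""
    if not 0.0 <= preset_ratio <= 1.0:
        raise ValueError("preset_ratio must be between 0 and 1")
    outsourced_count = int(len(subtasks) * preset_ratio)
    split = len(subtasks) - outsourced_count
    return subtasks[:split], subtasks[split:]


def run_subtasks(original: bytes, subtasks: list) -> dict:
    """Execute slicing subtasks: cut the byte range named by each (start, end) subtask."""
    return {(start, end): original[start:end] for start, end in subtasks}


original = bytes(range(100))
subtasks = [(i, i + 10) for i in range(0, 100, 10)]
retained, outsourced = allocate_subtasks(subtasks, preset_ratio=0.4)

first_slices = run_subtasks(original, retained)     # saved by the data center
second_slices = run_subtasks(original, outsourced)  # would be saved by the data nodes

# Together the first and second slice data cover the original data exactly once.
merged = b"".join(chunk for _, chunk in sorted({**first_slices, **second_slices}.items()))
```

Keying each slice by its byte range records the division rule alongside the data, which is what later allows slices to be put back in order.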
In this embodiment, the parsing unit 11 first obtains the data parameters of the original data. The task division unit 12 then divides the slicing task for the original data into multiple slicing subtasks according to the data parameters and the performance of the data center. The task allocation unit 13 allocates a preset proportion of the slicing subtasks to other data nodes as outsourced slicing subtasks, and the task execution unit 14 controls the data center and the data nodes to each execute their own slicing subtasks. The data center obtains and saves the first slice data, and the data nodes obtain and save the second slice data, so the data center and the data nodes slice the original data in parallel. This avoids having the data center alone slice very large original data or massive data volumes, shortens the time needed to slice the original data, keeps slicing from delaying the normal operation of other processes in the distributed data system, and preserves the working efficiency of the distributed data system.
Further, on the basis of the second embodiment of the data center of the present invention, a third embodiment of the data center is proposed. Referring to Fig. 6, in the third embodiment, the data center also includes:
An announcement module 60, configured to announce the first slice data and the second slice data in real time for access by clients of the distributed data system.
The announcement module 60 announces in real time the first slice data saved by the data center and the second slice data saved by the data nodes, for clients of the distributed data system to access and download. For example, suppose the original data is cached map data. The whole map cache is very large, but the map data needed at any one time is only a part of it. Once the cached map data has been sliced into multiple pieces of first slice data and second slice data saved in the data center and in different data nodes, the announcement module 60 of the data center announces each piece of first slice data and second slice data in real time. That is, it announces each map slice of the cached map data in real time, so that users can access and download the map slices they need through a client of the distributed data system. The data center holds the correspondence between each map slice and the address at which it is stored.
In this embodiment, the data center announces the first slice data and the second slice data in real time through the announcement module 60, making it easy for a user to select the one or more pieces of slice data needed (first slice data and second slice data) through a client of the distributed data system. After the user has determined the required slice data, the data center restores the selected slice data to the corresponding part of the original data, based on the division rule of the slicing subtasks, for the user to download. This simplifies the interaction between the distributed data system and the user.
Preferably, the data center also includes:
A first recovery module 70, configured to, upon receiving a data request sent by a client, decrypt and restore the target data corresponding to the data request according to the first slice data and the current random number;
A second recovery module 80, configured to, if restoring the target data fails, extract the second slice data and decrypt and restore the target data according to the first slice data, the second slice data, and the current random number.
When the data center receives a data request sent by a client of the distributed data system, i.e. when a user asks the data center for the slice data it needs, the first recovery module 70 first gives priority to the first slice data saved locally and, based on the division rule of the slicing subtasks, restores the first slice data to the target data corresponding to the data request (the target data is a part of the original data). If the restoration succeeds, the first recovery module 70 sends the generated target data to the client that issued the data request. If the restoration fails, the second recovery module 80 extracts the second slice data saved by the data nodes and restores the corresponding target data in the original data according to the first slice data, the second slice data, and the division rule of the slicing subtasks.
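The local-first restoration with a fallback to the data nodes can be sketched as follows. Decryption is left out to keep the control flow visible, slices are identified by hypothetical integer indices that encode the division rule, and `restore_target` and `fetch_second_slices` are names invented for the example.

```python
from typing import Callable, Dict, List


def restore_target(
    requested: List[int],
    first_slices: Dict[int, bytes],
    fetch_second_slices: Callable[[List[int]], Dict[int, bytes]],
) -> bytes:
    """Restore the requested part of the original data, preferring locally saved slices."""
    available = {i: first_slices[i] for i in requested if i in first_slices}
    missing = [i for i in requested if i not in available]
    if missing:
        # Local restoration failed: extract the missing slices from the data nodes.
        available.update(fetch_second_slices(missing))
    unresolved = [i for i in requested if i not in available]
    if unresolved:
        raise KeyError(f"slices not found on any node: {unresolved}")
    # Reassemble in slice order, mirroring the division rule of the slicing subtasks.
    return b"".join(available[i] for i in sorted(requested))


local = {0: b"ab", 1: b"cd"}   # first slice data held by the data center
remote = {2: b"ef", 3: b"gh"}  # second slice data held by data nodes
fetch_calls = []


def fetch(ids: List[int]) -> Dict[int, bytes]:
    fetch_calls.append(list(ids))
    return {i: remote[i] for i in ids if i in remote}


local_only = restore_target([0, 1], local, fetch)
with_fallback = restore_target([1, 2], local, fetch)
```

The first request is served without touching the data nodes, and the second contacts them only for the single slice the data center lacks, which is the efficiency argument the embodiment makes.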
In this embodiment, when slice data needs to be restored to obtain target data, the first recovery module 70 first obtains the required slice data locally from the data center. If the data center's local first slice data cannot restore the target data, the second recovery module 80 extracts the required slice data (i.e. the second slice data) from the data nodes of the distributed data system and restores the target data. On the one hand, obtaining the slice data for restoration from the data center first keeps restoration efficient. On the other hand, storing the first slice data and the second slice data that can be restored to the target data separately, in the data center and in the data nodes, safeguards the stored slice data.
In addition, the data center also includes:
A slice transmission module 90, configured to control the transfer of slice data between the data center and the data nodes when the real-time bandwidth of the distributed data system is greater than a preset bandwidth.
When the real-time bandwidth of the distributed data system is greater than the preset bandwidth, i.e. when the system's bandwidth is not busy, the slice transmission module 90 allows slice data to be transferred between the data center and the data nodes. For example, a data node may send second slice data that it will no longer keep to the data center for archiving. As another example, for second slice data that clients of the distributed data system extract frequently, the data node holding that second slice data sends it to the data center for storage, so that clients can continue to extract it readily.
Preferably, the slice transmission module 90 is also configured such that:
When the frequency at which second slice data is extracted is greater than a set frequency, the data center receives the second slice data sent by the data node.
When the frequency at which clients of the distributed data system request the second slice data of a certain data node (for example data node A) from the data center is greater than the set frequency, i.e. when the frequency at which the data center extracts the second slice data of data node A is greater than the set frequency, the slice transmission module 90 controls the data node to send the frequently extracted second slice data to the data center. The slice transmission module 90 receives and saves the second slice data sent by data node A, so that the data center can extract this second slice data more quickly to generate the corresponding target data.
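The two conditions described for the slice transmission module, spare bandwidth and a high extraction frequency, can be combined in a small policy object. This is a sketch under assumptions: the text does not say how frequency is measured, so it is counted here as extractions within one observation window, and the class name `SliceTransferPolicy`, its methods, and the slice identifiers are hypothetical.

```python
from collections import Counter


class SliceTransferPolicy:
    """Decide which second slice data to move to the data center, and when."""

    def __init__(self, preset_bandwidth: float, set_frequency: int) -> None:
        self.preset_bandwidth = preset_bandwidth
        self.set_frequency = set_frequency
        self.extraction_counts = Counter()  # extractions per slice in the current window

    def record_extraction(self, slice_id: str) -> None:
        """Note that the data center extracted this slice from a data node."""
        self.extraction_counts[slice_id] += 1

    def slices_to_pull(self, realtime_bandwidth: float) -> list:
        """Return frequently extracted slices, but only while bandwidth is not busy."""
        if realtime_bandwidth <= self.preset_bandwidth:
            return []  # defer transfers so regular services are not affected
        return sorted(
            slice_id
            for slice_id, count in self.extraction_counts.items()
            if count > self.set_frequency
        )


policy = SliceTransferPolicy(preset_bandwidth=100.0, set_frequency=3)
for _ in range(5):
    policy.record_extraction("A-7")  # slice 7 on data node A, requested often
policy.record_extraction("B-2")      # slice 2 on data node B, requested once

busy = policy.slices_to_pull(realtime_bandwidth=50.0)
idle = policy.slices_to_pull(realtime_bandwidth=150.0)
```

Checking bandwidth before frequency means a hot slice simply waits for the next idle period rather than competing with regular traffic.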
In this embodiment, the slice transmission module 90 transfers slice data between the data center and the data nodes only when the bandwidth of the distributed data system is currently idle. This makes effective use of the system's bandwidth and avoids disrupting its other regular services. In addition, transferring slice data in both directions between the data center and the data nodes achieves a reasonable distribution of slice data across the data center and the data nodes.
Further, on the basis of the first embodiment of the data center of the present invention, a fourth embodiment of the data center is proposed. In the fourth embodiment, referring to Fig. 7, the second sending module 50 includes:
A matching unit 51, configured to match the current random number against the storage identification code of each data node, where a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification codes;
A distribution unit 52, configured to distribute the ciphertext slice data to the data nodes corresponding to the storage identification codes whose matching degree with the current random number is greater than a preset matching degree.
After the current random number is generated, the matching unit 51 matches it against the storage identification code of each data node to obtain the matching degree between each storage identification code and the current random number. A preset matching degree is set in advance, and the matching unit 51 filters out the storage identification codes whose matching degree with the current random number is greater than the preset matching degree. The matching unit 51 then determines the addresses of the corresponding data nodes from the filtered storage identification codes, and the distribution unit 52 distributes the ciphertext slice data to those data nodes at the determined addresses. For example, if the storage identification code is a six-digit natural number, the current random number is also a six-digit natural number, and the smaller the numerical difference between the storage identification code and the current random number, the higher the matching degree between them.
In this embodiment, the matching unit 51 matches the current random number against the storage identification code of each data node to obtain, for each data node, the matching degree between its storage identification code and the current random number. The matching unit 51 selects the data nodes whose matching degree exceeds the preset matching degree, and the distribution unit 52 then distributes the ciphertext slice data to them. The slice data is thus stored in different data nodes at random, according to the current random number. Randomizing the storage location makes it harder for an unauthorized party to restore the original data from the slice data and improves the security of the slice data in storage. At the same time, because slice data is stored at random based on the current random number, it is not concentrated on a single data node or a few data nodes, and the storage resources of the data nodes in the distributed system are fully used.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods of the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims (10)

1. A distributed data processing method, characterized in that the distributed data processing method includes:
when a data center receives original data, performing data slicing on the original data to obtain multiple pieces of slice data;
the data center generating a random number using a preset random number generator and taking this random number as a current random number;
the data center sending the current random number to each data node in a distributed data system;
the data center encrypting each piece of slice data according to the current random number, a pre-stored key, and a preset encryption algorithm, to obtain ciphertext slice data;
the data center distributing the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
2. The distributed data processing method as claimed in claim 1, characterized in that the step of performing data slicing on the original data to obtain multiple pieces of slice data when the data center receives the original data includes:
when the data center receives the original data, the data center parsing the received original data to obtain data parameters of the original data;
the data center dividing the task of slicing the original data into multiple slicing subtasks according to the data parameters and the performance of the data center;
the data center allocating a preset proportion of the slicing subtasks, as outsourced slicing subtasks, to data nodes in the distributed data system other than the data center;
the data center executing the unallocated, retained slicing subtasks and saving the corresponding first slice data, and obtaining second slice data produced by the data nodes executing the outsourced slicing subtasks, where the slice data includes the first slice data and the second slice data.
3. The distributed data processing method as claimed in claim 2, characterized in that, after the step of the data center executing the unallocated, retained slicing subtasks and saving the corresponding first slice data, and obtaining the second slice data produced by the data nodes executing the outsourced slicing subtasks, the method also includes:
the data center announcing the first slice data and the second slice data in real time for access by clients of the distributed data system.
4. The distributed data processing method as claimed in claim 3, characterized in that, after the step of the data center announcing the first slice data and the second slice data in real time for access by clients of the distributed data system, the method also includes:
upon receiving a data request sent by a client, the data center decrypting and restoring the target data corresponding to the data request according to the first slice data and the current random number;
if restoring the target data fails, the data center extracting the second slice data and decrypting and restoring the target data according to the first slice data, the second slice data, and the current random number.
5. The distributed data processing method as claimed in any one of claims 1 to 4, characterized in that the step of the data center distributing the ciphertext slice data to the corresponding data nodes for storage according to the current random number includes:
the data center matching the current random number against the storage identification code of each data node, where a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification codes;
the data center distributing the ciphertext slice data to the data nodes corresponding to the storage identification codes whose matching degree with the current random number is greater than a preset matching degree.
6. A data center, characterized in that the data center includes:
a slicing module, configured to perform data slicing on original data when the data center receives the original data, so as to obtain multiple pieces of slice data;
a random number module, configured to generate a random number using a preset random number generator and to take this random number as a current random number;
a first sending module, configured to send the current random number to each data node in a distributed data system;
an encrypting module, configured to encrypt each piece of slice data according to the current random number, a pre-stored key, and a preset encryption algorithm, so as to obtain ciphertext slice data;
a second sending module, configured to distribute the ciphertext slice data to the corresponding data nodes for storage according to the current random number.
7. The data center as claimed in claim 6, characterized in that the slicing module includes:
a parsing unit, configured to parse the received original data when the data center receives the original data, so as to obtain data parameters of the original data;
a task division unit, configured to divide the task of slicing the original data into multiple slicing subtasks according to the data parameters and the performance of the data center;
a task allocation unit, configured to allocate a preset proportion of the slicing subtasks, as outsourced slicing subtasks, to data nodes in the distributed data system other than the data center;
a task execution unit, configured to execute the unallocated, retained slicing subtasks and save the corresponding first slice data, and to obtain second slice data produced by the data nodes executing the outsourced slicing subtasks, where the slice data includes the first slice data and the second slice data.
8. The data center as claimed in claim 7, characterized in that the data center also includes:
an announcement module, configured to announce the first slice data and the second slice data in real time for access by clients of the distributed data system.
9. The data center as claimed in claim 8, characterized in that the data center also includes:
a first recovery module, configured to, upon receiving a data request sent by a client, decrypt and restore the target data corresponding to the data request according to the first slice data and the current random number;
a second recovery module, configured to, if restoring the target data fails, extract the second slice data and decrypt and restore the target data according to the first slice data, the second slice data, and the current random number.
10. The data center as claimed in any one of claims 6 to 9, characterized in that the second sending module includes:
a matching unit, configured to match the current random number against the storage identification code of each data node, where a storage identification code is a code that uniquely identifies a data node, and the current random number has the same format as the storage identification codes;
a distribution unit, configured to distribute the ciphertext slice data to the data nodes corresponding to the storage identification codes whose matching degree with the current random number is greater than a preset matching degree.
CN201610271917.4A 2016-04-27 2016-04-27 Distributed data processing method and data center Pending CN105791434A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610271917.4A CN105791434A (en) 2016-04-27 2016-04-27 Distributed data processing method and data center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610271917.4A CN105791434A (en) 2016-04-27 2016-04-27 Distributed data processing method and data center

Publications (1)

Publication Number Publication Date
CN105791434A true CN105791434A (en) 2016-07-20

Family

ID=56398851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610271917.4A Pending CN105791434A (en) 2016-04-27 2016-04-27 Distributed data processing method and data center

Country Status (1)

Country Link
CN (1) CN105791434A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704858A (en) * 2019-10-16 2020-01-17 长春银彩通信息科技有限公司 Data security storage method and system under distributed environment
CN112988764A (en) * 2021-05-14 2021-06-18 北京百度网讯科技有限公司 Data storage method, device, equipment and storage medium
CN113095781A (en) * 2021-04-12 2021-07-09 山东大卫国际建筑设计有限公司 Temperature control equipment control method, equipment and medium based on edge calculation
CN113268775A (en) * 2021-07-16 2021-08-17 深圳市永兴元科技股份有限公司 Photo processing method, device and system and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1592877A (en) * 2001-09-28 2005-03-09 高密度装置公司 Method and device for encryption/decryption of data on mass storage device
CN101682502A (en) * 2007-06-15 2010-03-24 国际商业机器公司 Method and system for encryption of blocks of data
US20130094650A1 (en) * 2011-10-18 2013-04-18 Broadcom Corporation Secure data transfer using random ordering and random block sizing
CN103327085A (en) * 2013-06-05 2013-09-25 深圳市中博科创信息技术有限公司 Distributed data processing method, data center and distributed data system
CN103583030A (en) * 2011-05-25 2014-02-12 阿尔卡特朗讯公司 Method and apparatus for achieving data security in a distributed cloud computing environment
CN104598320A (en) * 2015-01-30 2015-05-06 北京正奇联讯科技有限公司 Task execution method and system based on distributed system

Similar Documents

Publication Publication Date Title
CN107734021B (en) Block chain data uploading method and system, computer system and storage medium
CN105847279A (en) Distributed data processing method and data center
CN105791434A (en) Distributed data processing method and data center
CN104052742A (en) Internet of things communication protocol capable of being encrypted dynamically
CN102148798A (en) Method for efficiently, parallelly and safely encrypting and decrypting high-capacity data packets
CN103457727A (en) Method, device and system for processing media data
CN105120530B (en) Method and device for acquiring data and data acquisition system
CN101605108A (en) A kind of method, system and device of instant messaging
CN111372253A (en) Cell access method, device, system and computer readable storage medium
CN102055580A (en) Method for safely sending and receiving enterprise information in industrial internet and communication equipment
CN102571321A (en) Data encryption transmission method and device
CN102833257A (en) Operation request queuing method, associated equipment and system
CN117077123A (en) Service processing method and device for multiple password cards and electronic equipment
CN106452752B (en) Method, system and the client of Modify password, server and smart machine
CN102045343A (en) DC (Digital Certificate) based communication encrypting safety method, server and system
EP3166283A1 (en) Business access method, system and device
CN111552938B (en) File encryption method and device
CN105893135B (en) Distributed data processing method and data center
CN110598427B (en) Data processing method, system and storage medium
CN116094815B (en) Data encryption processing method and device based on flow self-adaptive control adjustment
CN104065479A (en) Key generation method and system and key distribution method and system based on group
CN105872013A (en) Cloud computing system
CN106878266B (en) A kind of unstructured data Transmission system
CN106454435B (en) Conditional access method and related equipment and system
CN111092866B (en) Key management method and device based on Hadoop

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518057 Shenzhen Software Park, Nanshan District High-tech Industrial Park, Guangdong, China, 6 401-402

Applicant after: Yongxing Shenzhen Polytron Technologies Inc

Address before: 518057 Shenzhen Software Park, Nanshan District High-tech Industrial Park, Guangdong, China, 6 401-402

Applicant before: Shenzhen Longrise Technology Co., Ltd.

COR Change of bibliographic data
RJ01 Rejection of invention patent application after publication

Application publication date: 20160720