CN116301668A - CDP-based data storage system and method - Google Patents
- Publication number
- CN116301668A (CN202310593509.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- cdp
- unit
- target
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1469—Backup restoration techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/062—Securing storage systems
- G06F3/0622—Securing storage systems in relation to access
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a CDP-based data storage system and a CDP-based data storage method, which belong to the technical field of data storage, realize bidirectional data backup, recover data of any version at any time, and acquire data of a corresponding version from a CDP backup center when client data is damaged, so that effective data backup is realized, the data is easier to recover, and network security protection is also performed when the data is recovered from the CDP backup center, thereby realizing safe data storage.
Description
Technical Field
The invention belongs to the technical field of data storage, and particularly relates to a CDP-based data storage system and method.
Background
At present, with the rapid development of computers and networks, data is closely tied to people's life and work, and data security is becoming increasingly important. Existing data storage usually keeps data on a local hard disk or relies on manual backups made at intervals. Data stored only on the local hard disk often cannot be recovered after the computer is infected by a virus. Manual backup cannot back up data in real time, so data that has not yet been backed up also cannot be recovered after a virus infection.
Disclosure of Invention
The invention provides a CDP-based data storage system and a CDP-based data storage method, which are used for solving the problems in the prior art.
In a first aspect, the present invention provides a CDP-based data storage system, including a CDP proxy module, a CDP backup center, a CDP recovery center, and a security protection center, where the CDP proxy module is disposed on a client;
the CDP proxy module is used for configuring a data area to be protected on the client and protecting data in the data area;
the CDP backup center is used for carrying out secondary backup on a data area to be protected on the client so as to realize bidirectional data backup of data;
the CDP recovery center is used for acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client and transmitting the target version data to the client;
the safety protection center is used for carrying out network safety protection when the CDP proxy module backs up the data of each version to the CDP backup center or the CDP recovery center responds to the data recovery request of the client.
Further, the CDP proxy module comprises a first configuration management unit, a file interception unit, a first inverse operation acquisition unit and a first delta data acquisition unit;
The first configuration management unit is configured to configure a data area to be protected on the client and an IP address of the CDP backup center, and send a data protection request to the CDP backup center according to the IP address of the CDP backup center, where the data protection request includes the data area to be protected on the client;
the file interception unit is used for intercepting data operation, target data before the operation and target data after the operation in a data area to be protected by the client, and synchronizing the data operation, the target data before the operation and the target data after the operation to the CDP backup center;
the first inverse operation obtaining unit is used for obtaining the inverse operation of the data operation when the data operation is not the writing operation, generating a version number of the target data, and storing the inverse operation and the version number in association with the target data after the operation;
the first delta data obtaining unit is used for determining that the inverse operation of the data operation is a deleting operation when the data operation is a writing operation, obtaining the first delta data according to the target data before the operation and the target data after the operation, generating a version number of the target data, and storing the deleting operation, the first delta data and the version number in association with the target data after the operation.
Further, the obtaining the first delta data according to the target data before the operation and the target data after the operation includes:
partitioning target data before operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N first data blocks, and sequentially numbering the first data blocks;
partitioning the target data after the operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N second data blocks, and sequentially numbering the second data blocks, wherein each first data block corresponds to one second data block;
acquiring a first weak check value and a first MD5 check value corresponding to a first data block, and acquiring a second weak check value and a second MD5 check value corresponding to a second data block;
judging whether the first weak check value of the first data block is the same as the second weak check value of the second data block corresponding to the first data block, if so, entering an MD5 check flow, otherwise, taking the second data block as first delta data;
judging whether the first MD5 check value of the first data block is the same as the second MD5 check value of the second data block corresponding to the first data block, if so, the two data blocks are identical, otherwise, taking the second data block as first delta data;
All the first data blocks and the second data blocks are traversed to acquire first delta data between target data before operation and target data after operation.
Further, the CDP backup center includes a second configuration management unit, an operation synchronization unit, a second delta data acquisition unit, a second inverse operation acquisition unit, a storage unit, and a version management unit;
the second configuration management unit is used for receiving the data protection request sent by the first configuration management unit and establishing connection between the CDP proxy module and the CDP backup center according to the data area to be protected on the client side in the data protection request;
the operation synchronization unit is used for receiving the data operation transmitted by the file interception unit, the target data before the operation and the target data after the operation;
the second delta data obtaining unit is used for determining that the inverse operation of the data operation is a deleting operation when the data operation is a writing operation, obtaining second delta data according to target data before the operation and target data after the operation, generating a version number of the target data, and storing the deleting operation, the second delta data and the version number in a storage unit in association with the target data after the operation;
The second inverse operation obtaining unit is configured to obtain an inverse operation of the data operation when the data operation is not a write operation, generate a version number of the target data, and store the inverse operation and the version number in association with the target data after the operation in the storage unit;
the storage unit is used for storing the data generated by the second delta data acquisition unit and the second inverse operation acquisition unit;
and the version management unit is used for judging whether the version number generated by the CDP proxy module is the same as the version number stored in the storage unit in the current backup, if so, ending the data backup, otherwise, synchronizing the version number.
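The version check performed by the version management unit can be sketched as follows. This is a minimal illustration only: the function name `reconcile_version` is hypothetical, and the direction of synchronization (the storage unit adopting the version number generated by the CDP proxy module) is an assumption, since the text only states that the version number is synchronized.

```python
def reconcile_version(agent_version: int, stored_version: int) -> tuple[int, bool]:
    """Compare the version number generated on the client side with the one
    held in the storage unit for the current backup.

    Returns (version kept in the storage unit, whether a sync was needed).
    Assumption: on mismatch, the storage unit adopts the agent's number.
    """
    if agent_version == stored_version:
        # Numbers agree: the backup for this change is finished.
        return stored_version, False
    # Numbers differ: synchronize the version number.
    return agent_version, True
```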
Further, the CDP recovery center comprises a request receiving unit, a request checking unit and a data recovery unit;
the request receiving unit is used for receiving a data recovery request transmitted by a client, wherein the data recovery request comprises a target version number of data to be recovered and a current version number of the data to be recovered;
the request checking unit is used for acquiring, from the client, the current data corresponding to the current version number, the inverse operations between the target version number and the current version number, and the first delta data; acquiring, from the storage unit, the current data corresponding to the current version number, the inverse operations between the target version number and the current version number, and the second delta data; and judging whether the inverse operations and the first delta data acquired from the client are the same as the inverse operations and the second delta data acquired from the storage unit; if so, generating a verification result indicating that the verification is passed and scheduling the data recovery unit to recover the data; otherwise, generating a verification result indicating that the verification fails and scheduling the data recovery unit to recover the data;
The data recovery unit is used for, when the verification result is that the verification is passed, directly and remotely scheduling the current data in the client, the inverse operations between the target version number and the current version number, and the first delta data to carry out data recovery; and, when the verification result is that the verification fails, directly and remotely scheduling the current data in the storage unit, the inverse operations between the target version number and the current version number, and the second delta data to carry out data recovery.
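The check-then-recover decision described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `RecoveryInputs` type, its field names, and the function `choose_recovery_source` are hypothetical, and the inverse operations and delta data are modeled as plain tuples so that they can be compared for equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryInputs:
    """Hypothetical container for what one side holds for a recovery request."""
    current_data: bytes
    inverse_ops: tuple  # inverse operations between target and current version
    delta: tuple        # delta data between target and current version


def choose_recovery_source(client: RecoveryInputs,
                           stored: RecoveryInputs) -> tuple[str, RecoveryInputs]:
    """Verification passes when the client's inverse operations and delta data
    match those in the storage unit; recovery then uses the client's copy.
    Otherwise recovery falls back to the copy in the storage unit."""
    passed = (client.inverse_ops == stored.inverse_ops
              and client.delta == stored.delta)
    return ("client", client) if passed else ("backup_center", stored)
```

Recovering from the client when the check passes avoids transferring the version history over the network, while a failed check treats the client copy as untrusted.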
Further, the safety protection center comprises a safety protection model training unit and a network safety protection unit;
the safety protection model training unit is used for acquiring historical flow data and abnormal recognition results corresponding to the historical flow data, wherein the historical flow data and the abnormal recognition results corresponding to the historical flow data are prestored data, constructing a safety protection model by adopting a convolutional neural network, and training the safety protection model by adopting the historical flow data and the abnormal recognition results corresponding to the historical flow data to obtain a trained safety protection model;
the network safety protection unit is used for performing abnormal recognition on the flow characteristics by adopting the trained safety protection model when the CDP proxy module backs up the data of each version to the CDP backup center or the CDP recovery center responds to the data recovery request of the client, so as to obtain an abnormal recognition result corresponding to the flow characteristics.
Further, training the safety protection model by adopting historical flow data and an abnormal recognition result corresponding to the historical flow data to obtain a trained safety protection model, including:
initializing network parameters of the safety protection model by adopting a chaos sequence strategy to obtain a plurality of parameter individuals;
taking the normalized historical flow data as input data of the safety protection model, acquiring actual output data of the safety protection model, and taking the abnormal recognition result corresponding to the historical flow data as expected output data, so as to acquire a fitness value of each parameter individual;
taking the parameter individual with the maximum fitness value as the optimal individual, and carrying out global search on all the parameter individuals on the basis of the optimal individual to obtain the parameter individual after primary updating;
further carrying out local search on the parameter individuals after primary updating to obtain parameter individuals after secondary updating;
determining a final updated value of each parameter individual based on the parameter individual after primary updating and the parameter individual after secondary updating;
based on the parameter individuals updated by the final updated values, judging whether the fitness value of a parameter individual is larger than a set threshold or the current number of updates t is larger than the maximum number of updates T; if so, taking the parameter individual with the largest fitness value as the final network parameters of the safety protection model to obtain the trained safety protection model, otherwise, entering the next training.
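The outer structure of this training procedure can be sketched as follows. This is only a skeleton under loudly stated assumptions: the chaos sequence is taken to be a logistic map, which the text does not specify; the global search, local search, and final update are collapsed into a caller-supplied `update` function because their formulas are not reproduced here; the fitness function is also supplied by the caller; and the best individual is retained across rounds. All names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def chaotic_population(count: int, dim: int, low: float, high: float,
                       seed: float = 0.37) -> List[Vector]:
    """Initialize parameter individuals from a logistic-map sequence
    x <- 4x(1 - x), scaled into [low, high] (assumed chaos strategy)."""
    x = seed
    population = []
    for _ in range(count):
        individual = []
        for _ in range(dim):
            x = 4.0 * x * (1.0 - x)
            individual.append(low + (high - low) * x)
        population.append(individual)
    return population


def train(fitness: Callable[[Vector], float],
          update: Callable[[Vector, Vector, int], Vector],
          population: List[Vector],
          threshold: float,
          max_updates: int) -> Vector:
    """Repeat the update until the best fitness exceeds the threshold or the
    maximum number of updates is reached, then return the best individual."""
    best = max(population, key=fitness)
    for t in range(1, max_updates + 1):
        # Placeholder for global search, local search and final update.
        population = [update(ind, best, t) for ind in population]
        best = max(population + [best], key=fitness)
        if fitness(best) > threshold:
            break
    return best
```

In use, `fitness` would score a parameter individual by how closely the model's actual outputs match the expected abnormal recognition results, with larger values being better.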
Further, the fitness value of each parameter individual is:
[Formula not reproduced in this text.] In the formula, the fitness value is indexed by i = 1, 2, …, I, where I is the total number of parameter individuals; p = 1, 2, …, P, where P is the total number of historical flow data; k = 1, 2, …, K, where K is the total number of outputs of the safety protection model; the remaining two terms are the k-th actual output when the p-th historical flow data is input and the k-th expected output when the p-th historical flow data is input;
the global search is specifically:
[Formula not reproduced in this text.] The symbols in the formula denote, in order: the update frequency of the i-th parameter individual; the upper limit of the update frequency; the lower limit of the update frequency; a random number in (0, 1); the update step size of the i-th parameter individual at the t-th training, which is reset after crossing the boundary (the reset value is not reproduced); the update step size of the i-th parameter individual at the (t-1)-th training; the inertia weight; the i-th parameter individual at the (t-1)-th training; the best individual at the (t-1)-th training; the preset maximum number of updates; and an updated quantity whose symbol is not reproduced.
Further, the parameter individuals after the secondary updating are:
[Formula not reproduced in this text.] The symbols in the formula denote, in order: an updated quantity whose symbol is not reproduced; the update coefficient, a constant in [-1, 1]; the average loudness of all parameter individuals in the training (the training index is not reproduced); a random number in (0, 1); and the pulse emission frequency corresponding to the i-th parameter individual in the (t-1)-th training;
the final updated value for each parameter individual is:
[Formula not reproduced in this text.] The symbols in the formula denote, in order: the final updated value of the i-th parameter individual in the current update; the loudness of the i-th parameter individual at the (t-1)-th training; the fitness value of a parameter individual (symbol not reproduced); the maximum fitness value corresponding to the parameter individuals after the primary update; an updated quantity (symbol not reproduced) whose initial value is 0.95; the attenuation coefficient, which lies in (0, 1); an updated quantity (symbol not reproduced); the increase factor, which is greater than 0; and the initial pulse emission frequency of the i-th parameter individual, subject to a condition that is not reproduced.
In a second aspect, the present invention provides a CDP-based data storage method, including:
establishing connection between the CDP proxy module and the CDP backup center, and completely backing up the data in the data area to be protected configured on the client side by the CDP proxy module to the CDP backup center;
when the CDP proxy module detects that the data in the data area to be protected has changed, the changed version data is stored on the client and, at the same time, in the CDP backup center, so that bidirectional backup is realized;
When the CDP recovery center receives a data recovery request sent by a client, acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client, and transmitting the target version data to the client;
when the CDP proxy module backs up the data of each version to the CDP backup center, or the CDP recovery center responds to the data recovery request of the client, the network security protection is carried out through the security protection center.
The CDP-based data storage system and the CDP-based data storage method realize bidirectional data backup, can recover data of any version at any time, can acquire data of a corresponding version from a CDP backup center when client data is damaged, realize effective data backup, and can recover data more easily, and simultaneously realize network safety protection when the data is recovered from the CDP backup center, thereby realizing safe storage of the data.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic structural diagram of a CDP-based data storage system according to an embodiment of the present invention.
Fig. 2 is a flowchart of a CDP-based data storage method according to an embodiment of the present invention.
Specific embodiments of the present invention have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Example 1
As shown in fig. 1, a CDP (Continuous Data Protection)-based data storage system includes a CDP proxy module, a CDP backup center, a CDP recovery center, and a security protection center, where the CDP proxy module is disposed on a client.
The CDP proxy module is used for configuring a data area to be protected on the client and protecting the data in the data area.
Optionally, before the connection is established, the CDP proxy module may apply to the CDP backup center for account information. After the CDP backup center assigns a unique ID (identity) and a password to the client, the user can recover data from the CDP backup center with the unique ID and the password even after the computer is damaged or infected by a virus. When the user's client runs normally, the data can instead be recovered from the client, which improves efficiency.
The CDP backup center is used for carrying out secondary backup on the data area to be protected on the client so as to realize bidirectional data backup of the data.
The CDP recovery center is used for acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client and transmitting the target version data to the client.
Optionally, the CDP backup center may record account information of the user and associated data, and when the user logs into the account at other devices, the data may also be restored directly from the CDP backup center.
The safety protection center is used for carrying out network safety protection when the CDP proxy module backs up the data of each version to the CDP backup center or the CDP recovery center responds to the data recovery request of the client.
For example, network security protection may take the form of network intrusion detection, which detects whether a data interaction is abnormal and sends an abnormality prompt to an administrator when it is, so that the security of the data is effectively ensured. In the prior art, a data backup system often runs on a local area network and lacks remote backup and remote recovery functions; here, network security protection ensures data security during remote data storage.
In one possible implementation, the CDP proxy module includes a first configuration management unit, a file interception unit, a first inverse operation acquisition unit, and a first delta data acquisition unit.
The first configuration management unit is used for configuring the data area to be protected on the client and the IP address of the CDP backup center, and sending a data protection request to the CDP backup center according to the IP address of the CDP backup center, wherein the data protection request comprises the data area to be protected on the client.
The connection between the CDP proxy module and the CDP backup center can be established according to the IP address of the CDP backup center, and when the connection is established for the first time, the data area to be protected on the client can be completely backed up so as to realize data synchronization.
The file interception unit is used for intercepting data operation, target data before the operation and target data after the operation in a data area to be protected by the client, and synchronizing the data operation, the target data before the operation and the target data after the operation to the CDP backup center.
Optionally, the data operations in the data area to be protected may include renaming, deleting, creating, modifying attributes, and the like, which are not enumerated here. The file interception unit monitors data operations in real time, so that each version of the data is backed up and a user can restore the data to any version.
The first inverse operation acquisition unit is used for acquiring the inverse operation of the data operation when the data operation is not a write operation, generating a version number of the target data, and storing the inverse operation and the version number in association with the target data after the operation.
Optionally, when the data operation is a deletion, the inverse operation is a creation and the target data after the operation is empty, so a copy of the deleted data can be saved for subsequent restoration. When the data operation is a write, only the changed part of the data is involved, so the delta can be computed directly, and two versions of the same data can be converted into each other simply by replacing the delta data between them. For any other data operation, the inverse operation can be recorded directly; starting from the current version, applying the inverse operations between versions restores the data to any version.
The first delta data acquisition unit is used for determining that the inverse operation of the data operation is a deletion operation when the data operation is a write operation, acquiring the first delta data according to the target data before the operation and the target data after the operation, generating a version number of the target data, and storing the deletion operation, the first delta data and the version number in association with the target data after the operation.
Optionally, the forward operation between every two versions can also be recorded, so that the user can restore the data to another version from the data of any intermediate version and the corresponding version number.
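The bookkeeping performed by the two acquisition units can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `VersionRecord` and `VersionLog` types, the operation names, and the use of dictionaries with a `name` key to describe the target data are all hypothetical, and version numbers are assumed to be consecutive integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    version: int
    inverse_op: str
    inverse_args: dict
    delta: Optional[dict] = None  # only set for write operations


class VersionLog:
    """Records, per intercepted operation, the inverse operation and version."""

    def __init__(self) -> None:
        self.records = []

    def record(self, op: str, before: Optional[dict], after: Optional[dict],
               delta: Optional[dict] = None) -> VersionRecord:
        if op == "write":
            # Per the text, a write is paired with a delete plus delta data.
            inverse_op, inverse_args = "delete", {}
        elif op == "rename":
            inverse_op = "rename"
            inverse_args = {"src": after["name"], "dst": before["name"]}
        elif op == "delete":
            # Target data after a delete is empty; keep the name to re-create.
            inverse_op, inverse_args = "create", {"name": before["name"]}
        elif op == "create":
            inverse_op, inverse_args = "delete", {"name": after["name"]}
        else:
            raise ValueError(f"unsupported operation: {op}")
        rec = VersionRecord(len(self.records) + 1, inverse_op, inverse_args,
                            delta if op == "write" else None)
        self.records.append(rec)
        return rec
```

Walking the records backwards from the current version and applying each inverse operation in turn is what would bring the data back to an earlier version.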
In one possible embodiment, acquiring the first delta data according to the target data before the operation and the target data after the operation includes:
Partitioning the target data before the operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N first data blocks, and sequentially numbering the first data blocks.
Partitioning the target data after the operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N second data blocks, and sequentially numbering the second data blocks, wherein each first data block corresponds to one second data block.
Acquiring a first weak check value and a first MD5 check value corresponding to the first data block, and acquiring a second weak check value and a second MD5 check value corresponding to the second data block.
Judging whether the first weak check value of the first data block is the same as the second weak check value of the second data block corresponding to the first data block, if so, entering an MD5 check flow, otherwise, taking the second data block as first delta data.
Judging whether the first MD5 check value of the first data block is the same as the second MD5 check value of the second data block corresponding to the first data block, if so, the two data blocks are identical, otherwise, taking the second data block as first delta data.
All the first data blocks and the second data blocks are traversed to acquire first delta data between target data before operation and target data after operation.
Optionally, besides the weak check value and the MD5 check value, other data block attributes may be used for checking, for example the length of the data or a hash value of the data.
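The block comparison steps above can be sketched as follows. Several choices here are assumptions made only for illustration: fixed-length partitioning with a tiny `BLOCK_SIZE`, Adler-32 as the weak check value (the text does not name one), and the hypothetical helpers `split_blocks`, `block_changed`, `compute_delta`, and `apply_delta`. Unlike the text, which assumes N blocks on both sides, the sketch also treats blocks that exist only after the operation as delta data.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Fixed-length partitioning; a block's number is its list index."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_changed(old: bytes, new: bytes) -> bool:
    # Weak check first: different weak values mean the blocks certainly differ.
    if zlib.adler32(old) != zlib.adler32(new):
        return True
    # Weak values match, so confirm with the MD5 check.
    return hashlib.md5(old).digest() != hashlib.md5(new).digest()


def compute_delta(before: bytes, after: bytes) -> dict:
    """Collect every block of `after` that differs from the same-numbered block of `before`."""
    old_blocks = split_blocks(before)
    new_blocks = split_blocks(after)
    changed = {}
    for idx, new in enumerate(new_blocks):
        if idx >= len(old_blocks) or block_changed(old_blocks[idx], new):
            changed[idx] = new
    return {"block_count": len(new_blocks), "blocks": changed}


def apply_delta(before: bytes, delta: dict) -> bytes:
    """Rebuild the post-operation data from the old data plus the delta."""
    old_blocks = split_blocks(before)
    out = []
    for idx in range(delta["block_count"]):
        if idx in delta["blocks"]:
            out.append(delta["blocks"][idx])
        else:
            out.append(old_blocks[idx])  # unchanged block, reuse the old copy
    return b"".join(out)
```

For example, `compute_delta(b"aaaabbbbcccc", b"aaaaXXXXccccdd")` keeps only blocks 1 and 3, and `apply_delta` applied to the old data with that delta yields the new data again.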
In one possible implementation, the CDP backup center includes a second configuration management unit, an operation synchronization unit, a second delta data acquisition unit, a second inverse operation acquisition unit, a storage unit, and a version management unit.
The second configuration management unit is configured to receive the data protection request sent by the first configuration management unit and to establish a connection between the CDP proxy module and the CDP backup center according to the client data area to be protected that is specified in the data protection request.
Optionally, the second configuration management unit may also allow an administrator to assign accounts and to review account applications sent by the CDP proxy module.
The operation synchronization unit is configured to receive the data operation, the target data before the operation, and the target data after the operation transmitted by the file interception unit.
When the connection is established for the first time, the data in the data area protected by the CDP proxy module can be fully synchronized to the CDP backup center, where it serves as the base for the data versions stored subsequently.
The second delta data acquisition unit is configured to, when the data operation is a write operation, determine that the inverse operation of the data operation is a deletion operation, acquire second delta data from the target data before the operation and the target data after the operation, generate a version number for the target data, and store the deletion operation, the second delta data, and the version number in the storage unit in association with the target data after the operation.
The second inverse operation acquisition unit is configured to, when the data operation is not a write operation, acquire the inverse operation of the data operation, generate a version number for the target data, and store the inverse operation and the version number in the storage unit in association with the target data after the operation.
The storage unit is configured to store the data generated by the second delta data acquisition unit and the second inverse operation acquisition unit.
The version management unit is configured to determine, during the current backup, whether the version number generated by the CDP proxy module is the same as the version number stored in the storage unit; if so, the data backup ends; otherwise, the version numbers are synchronized.
By recording the differences and operations between versions, the invention can convert the data of one version into the data of any other version, which greatly reduces the storage space occupied, realizes bidirectional data backup, and effectively ensures data security.
In one possible implementation, the CDP recovery center includes a request receiving unit, a request checking unit, and a data recovery unit.
The request receiving unit is used for receiving a data recovery request transmitted by the client, wherein the data recovery request comprises a target version number of data to be recovered and a current version number of the data to be recovered.
The request checking unit is configured to acquire, from the client, the current data corresponding to the current version number, the inverse operations between the target version number and the current version number, and the first delta data; to acquire, from the storage unit, the current data corresponding to the current version number, the inverse operations between the target version number and the current version number, and the second delta data; and to determine whether the inverse operations and first delta data acquired from the client are identical to the inverse operations and second delta data acquired from the storage unit. If so, it generates a check result indicating that the check passed; otherwise, it generates a check result indicating that the check failed. In either case it then schedules the data recovery unit to recover the data.
The data recovery unit is configured to, when the check passed, directly and remotely schedule the current data, the inverse operations between the target version number and the current version number, and the first delta data on the client to perform data recovery; and, when the check failed, directly and remotely schedule the current data, the inverse operations between the target version number and the current version number, and the second delta data in the storage unit to perform data recovery.
Optionally, apart from the administrator and synchronization operations, nothing is allowed to modify the data stored on the CDP backup center, so that data can be regarded as unchanged. The CDP recovery center and the CDP backup center may be deployed on the same server. When a data recovery request arrives from a client, the data can first be compared to verify whether the backup held on the client has changed. If it has changed, the data of the CDP backup center prevails, which ensures the accuracy of the data; if it has not changed, the CDP proxy module can be remotely controlled to recover the data directly on the client, which ensures accurate recovery while reducing network usage and speeding up data transfer.
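The check performed before recovery can be sketched as follows. This is an illustration under assumptions rather than the actual units: `Record`, `records_between`, and `choose_recovery_source` are hypothetical names, each store is modelled as a dictionary from version number to record, and a missing client record is treated as a failed check (a case the text does not discuss).

```python
from typing import NamedTuple


class Record(NamedTuple):
    inverse_op: str   # e.g. "delete" for a write operation
    delta: bytes      # delta data stored with this version (empty if none)


def records_between(store: dict, target: int, current: int) -> list:
    """Records needed to step back from `current` to `target`, newest first."""
    return [store[v] for v in range(current, target, -1)]


def choose_recovery_source(client_store: dict, backup_store: dict,
                           target: int, current: int) -> str:
    """Return "client" when the check passes, otherwise "backup"."""
    try:
        client_records = records_between(client_store, target, current)
    except KeyError:
        return "backup"   # assumption: an incomplete client copy fails the check
    backup_records = records_between(backup_store, target, current)
    # The backup copy is treated as authoritative; the client copy is used
    # only when it matches, which avoids transferring data over the network.
    return "client" if client_records == backup_records else "backup"
```

Comparing digests of the records instead of the records themselves would keep the check cheap when the delta data is large; the sketch compares them directly for brevity.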
In one possible implementation, the security center includes a security model training unit and a network security unit.
The safety protection model training unit is configured to acquire historical traffic data and the anomaly recognition results corresponding to the historical traffic data, both of which are pre-stored; to construct a safety protection model using a convolutional neural network; and to train the safety protection model with the historical traffic data and the corresponding anomaly recognition results to obtain a trained safety protection model.
The network safety protection unit is configured to use the trained safety protection model to perform anomaly recognition on the traffic features when the CDP proxy module backs up each version of the data to the CDP backup center, or when the CDP recovery center responds to a data recovery request from the client, so as to obtain the anomaly recognition result corresponding to the traffic features.
Optionally, other neural network models may be used as the safety protection model to recognize network traffic.
In the prior art, convolutional neural networks are trained by gradient descent, which makes them prone to becoming trapped in local optima and degrades recognition performance.
In one possible implementation, training the safety protection model with the historical traffic data and the anomaly recognition results corresponding to the historical traffic data to obtain a trained safety protection model includes:
initializing the network parameters of the safety protection model with a chaotic sequence strategy to obtain a plurality of parameter individuals;
taking the normalized historical traffic data as the input data of the safety protection model, obtaining the actual output data of the safety protection model, and taking the anomaly recognition results corresponding to the historical traffic data as the expected output data, so as to obtain the fitness value of each parameter individual;
taking the parameter individual with the maximum fitness value as the optimal individual, and performing a global search over all parameter individuals based on the optimal individual to obtain the parameter individuals after the primary update;
further performing a local search on the parameter individuals after the primary update to obtain the parameter individuals after the secondary update;
determining the final updated value of each parameter individual based on the parameter individuals after the primary update and the parameter individuals after the secondary update;
based on the parameter individuals updated with the final updated values, determining whether the fitness value of a parameter individual is greater than a set threshold or the current update count t is greater than the maximum update count T; if so, taking the parameter individual with the maximum fitness value as the final network parameters of the safety protection model to obtain the trained safety protection model; otherwise, entering the next round of training.
Optionally, when the number of training rounds reaches the maximum update count T, the search may have found only a globally good position rather than the optimum, so gradient descent can be further applied to refine the result and determine the globally optimal position.
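The first step above, initialization with a chaotic sequence, can be sketched as follows. The text does not say which chaotic map is used; the logistic map with control parameter 4 is assumed here because it is a common choice, and the function name `chaotic_population` together with its `seed`, `lower`, and `upper` parameters are hypothetical.

```python
def chaotic_population(num_individuals: int, dim: int,
                       lower: float, upper: float,
                       seed: float = 0.37, mu: float = 4.0) -> list:
    """Generate parameter individuals from a logistic-map sequence (assumed map)."""
    # 0.25, 0.5 and 0.75 lead straight to fixed points of the map for mu = 4.
    if not 0.0 < seed < 1.0 or seed in (0.25, 0.5, 0.75):
        raise ValueError("seed must lie in (0, 1) and avoid 0.25, 0.5, 0.75")
    z = seed
    population = []
    for _ in range(num_individuals):
        individual = []
        for _ in range(dim):
            z = mu * z * (1.0 - z)                          # logistic map step, z stays in [0, 1]
            individual.append(lower + z * (upper - lower))  # scale into [lower, upper]
        population.append(individual)
    return population
```

Each individual would then be loaded into the network as one candidate set of weights; compared with uniform random sampling, a chaotic sequence is deterministic for a given seed while still spreading the candidates over the search range.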
In one possible embodiment, the fitness value of each parameter individual is:
where the formula gives the fitness value of the i-th parameter individual; i = 1, 2, …, I, and I denotes the total number of parameter individuals; p = 1, 2, …, P, and P denotes the total number of historical traffic data samples; k = 1, 2, …, K, and K denotes the total number of outputs of the safety protection model; the two remaining quantities are the k-th actual output and the k-th expected output when the p-th historical traffic data sample is input. [The formula and its symbols are not reproduced in the source text.]
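Since the fitness formula itself is not available in this text, the following is only a sketch of one plausible form, assumed for illustration: the reciprocal of one plus the mean squared error over the P samples and K outputs, so that a larger fitness value means a closer match between actual and expected outputs. The function name `fitness` and this exact form are assumptions, not the formula of the invention.

```python
def fitness(actual: list, expected: list) -> float:
    """Assumed fitness: 1 / (1 + mean squared error) over P samples and K outputs."""
    total = 0.0
    count = 0
    for actual_row, expected_row in zip(actual, expected):   # p = 1..P
        for y, d in zip(actual_row, expected_row):           # k = 1..K
            total += (y - d) ** 2
            count += 1
    if count == 0:
        raise ValueError("at least one output value is required")
    return 1.0 / (1.0 + total / count)
```

With this form a perfect match gives a fitness of 1.0, and the value decreases toward 0 as the error grows, which is consistent with selecting the individual with the maximum fitness value as the optimal individual.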
In one possible implementation, the global search is specifically:
where the quantities defined are, in order: the update frequency of the i-th parameter individual; the upper limit of the update frequency; the lower limit of the update frequency; a random number in (0, 1); the update step size of the i-th parameter individual at the t-th training round, which is reset after it crosses its boundary; the update step size of the i-th parameter individual at the (t-1)-th training round; the inertia weight; the i-th parameter individual at the (t-1)-th training round; the best individual at the (t-1)-th training round; the preset maximum number of updates; and the updated i-th parameter individual. [The formulas, their symbols, and the reset value of the step size are not reproduced in the source text.]
In one possible implementation, the parameter individuals after the secondary updating are:
where the quantities defined are, in order: the parameter individual after the secondary update; the update coefficient, which is a constant in [-1, 1]; the average loudness of all parameter individuals in the current training round; a random number in (0, 1); and the pulse emission frequency of the i-th parameter individual at the (t-1)-th training round. [The formula and its symbols are not reproduced in the source text.]
The final updated value for each parameter individual is:
where the quantities defined are, in order: the final updated value of the i-th parameter individual in the current update; the loudness of the i-th parameter individual at the (t-1)-th training round; the fitness value of a parameter individual; the maximum fitness value among the parameter individuals after the primary update; the updated loudness, for which the text gives an initial value of 0.95; the attenuation coefficient, which lies in (0, 1); the updated pulse emission frequency; the increase factor, which is greater than 0; and the initial pulse emission frequency of the i-th parameter individual. [The formulas, their symbols, and the constraint on the initial pulse emission frequency are not reproduced in the source text.]
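The quantities named above (update frequency, loudness, pulse emission frequency, attenuation coefficient, increase factor) match those of the bat algorithm. Because the formulas are not available here, the sketch below shows one step of a simplified textbook bat algorithm, not the variant of the invention: it omits the inertia weight and the step-size boundary handling, and the function `bat_step` with all of its parameter names and default values is hypothetical.

```python
import math
import random


def bat_step(x, v, best, t, loudness, pulse_rate0, fitness_fn,
             f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, rng=random):
    """One textbook bat-algorithm update of individual `x`; returns (x, v, loudness)."""
    # Global search: draw a frequency, then move relative to the best individual.
    freq = f_min + (f_max - f_min) * rng.random()
    v_new = [vi + (xi - bi) * freq for vi, xi, bi in zip(v, x, best)]
    candidate = [xi + vi for xi, vi in zip(x, v_new)]

    # Pulse emission rate grows from 0 toward pulse_rate0 as t increases.
    pulse_rate = pulse_rate0 * (1.0 - math.exp(-gamma * t))

    # Local search: random walk around the best individual, scaled by loudness.
    if rng.random() > pulse_rate:
        candidate = [bi + rng.uniform(-1.0, 1.0) * loudness for bi in best]

    # Accept only an improving candidate, with probability equal to the loudness;
    # on acceptance the loudness decays by the attenuation coefficient alpha.
    if fitness_fn(candidate) > fitness_fn(x) and rng.random() < loudness:
        return candidate, v_new, alpha * loudness
    return x, v_new, loudness
```

Calling `bat_step` for every individual and every round t, then stopping once the best fitness exceeds the threshold or t exceeds T, gives the overall loop described in the training steps; fitness is maximized here, in line with the text.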
Example 2
As shown in fig. 2, the present invention provides a CDP-based data storage method, including:
S1, establishing a connection between the CDP proxy module and the CDP backup center, and fully backing up the data in the data area to be protected, which the CDP proxy module has configured on the client, to the CDP backup center.
S2, when the CDP proxy module detects that the data in the data area to be protected has changed, storing the changed version data on the client and, at the same time, storing the changed version data in the CDP backup center, thereby realizing bidirectional backup.
And S3, when the CDP recovery center receives the data recovery request sent by the client, acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client, and transmitting the target version data to the client.
When the CDP proxy module backs up the data of each version to the CDP backup center, or the CDP recovery center responds to the data recovery request of the client, the network security protection is carried out through the security protection center.
The technical solution in this embodiment is similar to the principle and beneficial effects of the solution in embodiment 1, and will not be described here again.
The CDP-based data storage system provided by the invention realizes bidirectional data backup and can recover any version of the data at any time. When the data on the client is damaged, the data of the corresponding version can be obtained from the CDP backup center, which makes the backup effective and recovery easier. Network security protection is also applied when data is recovered from the CDP backup center, thereby achieving secure data storage.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (10)
1. The CDP-based data storage system is characterized by comprising a CDP agent module, a CDP backup center, a CDP recovery center and a safety protection center, wherein the CDP agent module is arranged on a client;
the CDP proxy module is used for configuring a data area to be protected on the client and protecting data in the data area;
the CDP backup center is used for carrying out secondary backup on a data area to be protected on the client so as to realize bidirectional data backup of data;
the CDP recovery center is used for acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client and transmitting the target version data to the client;
the safety protection center is used for carrying out network safety protection when the CDP proxy module backs up the data of each version to the CDP backup center or the CDP recovery center responds to the data recovery request of the client.
2. The CDP-based data storage system of claim 1, wherein the CDP proxy module comprises a first configuration management unit, a file interception unit, a first inverse operation acquisition unit, and a first delta data acquisition unit;
the first configuration management unit is configured to configure a data area to be protected on the client and an IP address of the CDP backup center, and send a data protection request to the CDP backup center according to the IP address of the CDP backup center, where the data protection request includes the data area to be protected on the client;
the file interception unit is used for intercepting data operation, target data before the operation and target data after the operation in a data area to be protected by the client, and synchronizing the data operation, the target data before the operation and the target data after the operation to the CDP backup center;
the first inverse operation obtaining unit is configured to obtain an inverse operation of the data operation when the data operation is not a write operation, generate a version number of the target data, and store the inverse operation and the version number in association with the target data after the operation;
the first delta data obtaining unit is used for determining that the inverse operation of the data operation is a deleting operation when the data operation is a writing operation, obtaining the first delta data according to the target data before the operation and the target data after the operation, generating a version number of the target data, and storing the deleting operation, the first delta data and the version number in association with the target data after the operation.
3. The CDP-based data storage system of claim 2, wherein the obtaining the first delta data based on the target data before the operation and the target data after the operation comprises:
partitioning target data before operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N first data blocks, and sequentially numbering the first data blocks;
partitioning the target data after the operation in a fixed-length partitioning or unequal-length partitioning mode to obtain N second data blocks, sequentially numbering the second data blocks, wherein each first data block corresponds to one second data block;
acquiring a first weak check value and a first MD5 check value corresponding to a first data block, and acquiring a second weak check value and a second MD5 check value corresponding to a second data block;
judging whether the first weak check value of the first data block is the same as the second weak check value of the second data block corresponding to the first data block, if so, entering an MD5 check flow, otherwise, taking the second data block as first delta data;
judging whether the first MD5 check value of the first data block is identical to the second MD5 check value of the second data block corresponding to the first data block, if so, the two data blocks are identical, otherwise, taking the second data block as first delta data;
traversing all the first data blocks and second data blocks to acquire the first delta data between the target data before the operation and the target data after the operation.
4. The CDP-based data storage system of claim 3, wherein the CDP backup center includes a second configuration management unit, an operation synchronization unit, a second delta data acquisition unit, a second inverse operation acquisition unit, a storage unit, and a version management unit;
the second configuration management unit is used for receiving the data protection request sent by the first configuration management unit and establishing connection between the CDP proxy module and the CDP backup center according to the data area to be protected on the client side in the data protection request;
the operation synchronization unit is used for receiving the data operation transmitted by the file interception unit, the target data before the operation and the target data after the operation;
the second delta data obtaining unit is used for determining that the inverse operation of the data operation is a deleting operation when the data operation is a writing operation, obtaining second delta data according to target data before the operation and target data after the operation, generating a version number of the target data, and storing the deleting operation, the second delta data and the version number in a storage unit in association with the target data after the operation;
the second inverse operation obtaining unit is configured to obtain an inverse operation of the data operation when the data operation is not a write operation, generate a version number of the target data, and store the inverse operation and the version number in association with the target data after the operation in the storage unit;
the storage unit is used for storing the data generated by the second differential data acquisition unit and the second inverse operation acquisition unit;
and the version management unit is used for judging whether the version number generated by the CDP proxy module is the same as the version number stored in the storage unit in the current backup, if so, ending the data backup, otherwise, synchronizing the version number.
5. The CDP-based data storage system of claim 4, wherein the CDP recovery center includes a request receiving unit, a request checking unit, and a data recovery unit;
the request receiving unit is used for receiving a data recovery request transmitted by a client, wherein the data recovery request comprises a target version number of data to be recovered and a current version number of the data to be recovered;
the request checking unit is used for acquiring, from the client, the current data corresponding to the current version number, the inverse operation between the target version number and the current version number, and the first delta data; acquiring, from the storage unit, the current data corresponding to the current version number, the inverse operation between the target version number and the current version number, and the second delta data; and judging whether the inverse operation and the first delta data acquired from the client are the same as the inverse operation and the second delta data acquired from the storage unit; if so, generating a check result indicating that the check passed and scheduling the data recovery unit to recover the data; otherwise, generating a check result indicating that the check failed and scheduling the data recovery unit to recover the data;
the data recovery unit is used for, when the check result indicates that the check passed, directly and remotely scheduling the current data, the inverse operation between the target version number and the current version number, and the first delta data in the client to perform data recovery; and, when the check result indicates that the check failed, directly and remotely scheduling the current data, the inverse operation between the target version number and the current version number, and the second delta data in the storage unit to perform data recovery.
6. The CDP-based data storage system of claim 5, wherein the security center comprises a security model training unit and a network security unit;
the safety protection model training unit is used for acquiring historical traffic data and anomaly recognition results corresponding to the historical traffic data, wherein the historical traffic data and the corresponding anomaly recognition results are pre-stored data; constructing a safety protection model by adopting a convolutional neural network; and training the safety protection model by adopting the historical traffic data and the corresponding anomaly recognition results to obtain a trained safety protection model;
the network safety protection unit is used for performing anomaly recognition on the traffic features by adopting the trained safety protection model when the CDP proxy module backs up the data of each version to the CDP backup center or the CDP recovery center responds to the data recovery request of the client, so as to obtain an anomaly recognition result corresponding to the traffic features.
7. The CDP-based data storage system of claim 6, wherein training the safety protection model using the historical traffic data and the anomaly recognition results corresponding to the historical traffic data to obtain a trained safety protection model comprises:
initializing the network parameters of the safety protection model by adopting a chaotic sequence strategy to obtain a plurality of parameter individuals;
taking the normalized historical traffic data as input data of the safety protection model, acquiring actual output data of the safety protection model, and taking the anomaly recognition results corresponding to the historical traffic data as expected output data, so as to acquire a fitness value of each parameter individual;
taking the parameter individual with the maximum fitness value as the optimal individual, and carrying out global search on all the parameter individuals on the basis of the optimal individual to obtain the parameter individual after primary updating;
further carrying out local search on the parameter individuals after primary updating to obtain parameter individuals after secondary updating;
determining a final updated value of each parameter individual based on the parameter individual after primary updating and the parameter individual after secondary updating;
based on the parameter individuals updated by the final updated values, judging whether the fitness value of a parameter individual is larger than a set threshold or the current update count t is larger than the maximum update count T; if so, taking the parameter individual with the largest fitness value as the final network parameters of the safety protection model to obtain the trained safety protection model; otherwise, entering the next round of training.
8. The CDP-based data storage system of claim 7 wherein the fitness value of each parameter individual is:
where the formula gives the fitness value of the i-th parameter individual; i = 1, 2, …, I, and I denotes the total number of parameter individuals; p = 1, 2, …, P, and P denotes the total number of historical traffic data samples; k = 1, 2, …, K, and K denotes the total number of outputs of the safety protection model; the two remaining quantities are the k-th actual output and the k-th expected output when the p-th historical traffic data sample is input [the formula and its symbols are not reproduced in the source text];
the global search is specifically:
where the quantities defined are, in order: the update frequency of the i-th parameter individual; the upper limit of the update frequency; the lower limit of the update frequency; a random number in (0, 1); the update step size of the i-th parameter individual at the t-th training round, which is reset after it crosses its boundary; the update step size of the i-th parameter individual at the (t-1)-th training round; the inertia weight; the i-th parameter individual at the (t-1)-th training round; the best individual at the (t-1)-th training round; the preset maximum number of updates; and the updated i-th parameter individual [the formulas, their symbols, and the reset value of the step size are not reproduced in the source text].
9. The CDP-based data storage system of claim 8 wherein the secondarily updated individual parameters are:
where the quantities defined are, in order: the parameter individual after the secondary update; the update coefficient, which is a constant in [-1, 1]; the average loudness of all parameter individuals in the current training round; a random number in (0, 1); and the pulse emission frequency of the i-th parameter individual at the (t-1)-th training round [the formula and its symbols are not reproduced in the source text];
the final updated value for each parameter individual is:
where the quantities defined are, in order: the final updated value of the i-th parameter individual in the current update; the loudness of the i-th parameter individual at the (t-1)-th training round; the fitness value of a parameter individual; the maximum fitness value among the parameter individuals after the primary update; the updated loudness, for which the text gives an initial value of 0.95; the attenuation coefficient, which lies in (0, 1); the updated pulse emission frequency; the increase factor, which is greater than 0; and the initial pulse emission frequency of the i-th parameter individual [the formulas, their symbols, and the constraint on the initial pulse emission frequency are not reproduced in the source text].
10. A CDP-based data storage method, comprising:
establishing connection between the CDP proxy module and the CDP backup center, and completely backing up the data in the data area to be protected configured on the client side by the CDP proxy module to the CDP backup center;
when the CDP proxy module detects that the data in the data area to be protected has changed, storing the changed version data in the client and, at the same time, storing the changed version data in the CDP backup center, so that bidirectional backup is realized;
when the CDP recovery center receives a data recovery request sent by a client, acquiring target version data corresponding to the data recovery request from the CDP backup center according to the data recovery request of the client, and transmitting the target version data to the client;
when the CDP proxy module backs up the data of each version to the CDP backup center, or the CDP recovery center responds to the data recovery request of the client, the network security protection is carried out through the security protection center.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310593509.0A CN116301668B (en) | 2023-05-25 | 2023-05-25 | CDP-based data storage system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116301668A true CN116301668A (en) | 2023-06-23 |
CN116301668B CN116301668B (en) | 2023-08-04 |
Family
ID=86822575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310593509.0A Active CN116301668B (en) | 2023-05-25 | 2023-05-25 | CDP-based data storage system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116301668B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN116301668B (en) | 2023-08-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11113156B2 (en) | Automated ransomware identification and recovery | |
US10701096B1 (en) | Systems and methods for anomaly detection on core banking systems | |
AU2012395331B2 (en) | Method and apparatus for recovering data | |
CN106610854A (en) | Model update method and device | |
CN101025741A (en) | Database back up system and method | |
CN114422224B (en) | Threat information intelligent analysis method and system for attack tracing | |
CN110188103A (en) | Data account checking method, device, equipment and storage medium | |
CN105743732B (en) | Method and system for recording transmission path and distribution condition of local area network files | |
US20230195870A1 (en) | System for face authentication and method for face authentication | |
CN107294924A (en) | Vulnerability detection method, device and system | |
US11349855B1 (en) | System and method for detecting encrypted ransom-type attacks | |
CN116301668B (en) | CDP-based data storage system and method | |
EP2372552B1 (en) | Automated relocation of in-use multi-site protected data storage | |
CN112506699A (en) | Data security backup method, equipment and system | |
KR101931683B1 (en) | Security patch system employing unidirectional data transmission apparatus and method of operating the same | |
CN102637169A (en) | Safe and practical method and system for database backup | |
CN114398635A (en) | Layered security federated learning method and device, electronic equipment and storage medium | |
CN110413691A (en) | Database backup method, restoration methods and device based on block chain | |
JP2003132019A (en) | Hindrance-monitoring method for computer system | |
KR20210027730A (en) | System and method for security of multimedia file and computer-readable recording medium | |
EP4332794A1 (en) | Anomaly detection before backup | |
CN116305071B (en) | Account password security system based on artificial intelligence | |
US20140047511A1 (en) | Network storage system and method thereof | |
CN117155960A (en) | Information management system and method | |
KR101735431B1 (en) | System and method for recovering of flight data |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |