CN112464044A - File data block change information monitoring and management system and method thereof - Google Patents


Info

Publication number
CN112464044A
Authority
CN
China
Prior art keywords
file
memory
directory
data
data block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011433430.4A
Other languages
Chinese (zh)
Other versions
CN112464044B (en)
Inventor
郑忠慧
高硕�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eisoo Information Technology Co Ltd
Original Assignee
Shanghai Eisoo Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eisoo Information Technology Co Ltd filed Critical Shanghai Eisoo Information Technology Co Ltd
Priority to CN202011433430.4A priority Critical patent/CN112464044B/en
Publication of CN112464044A publication Critical patent/CN112464044A/en
Application granted granted Critical
Publication of CN112464044B publication Critical patent/CN112464044B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9014 Indexing; Data structures therefor; Storage structures hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a file data block change information monitoring and management system and a method thereof. The system comprises a client and a drive unit. The client is connected to the drive unit through a CDP manager, and the drive unit is connected to an operating system data interface and to a memory. The CDP manager transmits data between the client and the drive unit. The drive unit captures file data block change information from the operating system, stores the captured information in the memory, and transmits it to the client through the CDP manager. The client initiates data reading or directory monitoring tasks, receives the information captured by the drive unit from the operating system, and performs data backup processing. Compared with the prior art, the invention captures file data block changes through a file filter driver, can track the specific details of each file change, and works across different database applications, thereby reducing the complexity of adapting to each of them.

Description

File data block change information monitoring and management system and method thereof
Technical Field
The invention relates to the technical field of copy data management, and in particular to a file data block change information monitoring and management system and a method thereof.
Background
In today's information-driven society, data underlies every activity, and its importance has given rise to many technologies built around it, such as traditional scheduled backup protection and copy data management. Copy data management best reflects the value of data: beyond completing traditional scheduled backup protection, it helps users increase the value they get from their data and uncover useful information hidden in it. Once a complete copy of the data has been separated out, it can be used for everyday development, testing and similar work, and queries, tests and analyses can be moved to a non-production system without affecting the business. Data can thus be put to use quickly, which strengthens a user's competitiveness in the big data era.
Current copy data management technology has two main aspects. The first is application data protection, for example protecting data by building a database for the business system and applying full and incremental data. The second is data storage: the captured business data is stored so that a complete data copy can be provided for the user to work with.
On the data storage side, the prior art already allows data to be used. Capturing application data is harder, especially for database applications. Because there are many database vendors, business data can only be acquired by adapting to the interfaces of each database, which increases the complexity of adaptation. As a result, users cannot obtain file data block change information quickly and conveniently, and data cannot be protected and used in a timely and reliable way.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a file data block change information monitoring and management system and a method thereof.
The purpose of the invention can be achieved by the following technical scheme: a file data block change information monitoring and management system comprises a client and a drive unit, wherein the client is connected to the drive unit through a Continuous Data Protection (CDP) manager, the drive unit is connected to an operating system data interface and to a memory, and the CDP manager transmits data between the client and the drive unit;
the drive unit is used for capturing file data block change information from the operating system, storing the captured information in the memory, and transmitting the captured information to the client through the CDP manager;
the client is used for initiating a data reading or directory monitoring task, receiving the information captured by the drive unit from the operating system, and performing data backup processing.
Further, the drive unit includes a memory allocation module and a directory binary tree generation unit, each connected to the memory. The directory binary tree generation unit is also connected to the operating system data interface. The memory allocation module obtains from the memory the space used to store file data block change information. The directory binary tree generation unit captures the name, position, and size change data of file data blocks from the operating system and generates a corresponding directory binary tree, which is stored in the memory as the file data block change information.
A file data block change information monitoring and management method comprises the following steps:
S1, the client initiates a directory monitoring task and transmits the directory monitoring task request to the drive unit through the CDP manager;
S2, after receiving the directory monitoring task request, the drive unit captures the change information of the corresponding file data blocks from the operating system through the operating system data interface and stores it in the memory;
S3, the client initiates a data reading task and transmits the data reading task request to the drive unit through the CDP manager;
S4, the drive unit transmits the corresponding file data block change information in the memory to the CDP manager, and the CDP manager extracts the first address data of the file data block change information and transmits it to the client;
S5, according to the received first address data, the client completes the backup operation for the corresponding file data.
Further, the directory monitoring task request includes the file path to be tracked.
Further, the step S2 specifically includes the following steps:
S21, after receiving the directory monitoring task request, the drive unit stores the file path to be tracked in the memory as a binary tree and then enables tracking for that path;
S22, when an IO operation occurs under the file path to be tracked in the operating system, the drive unit calculates the hash value of the file path and stores it in the corresponding bitmap linked list to obtain the file data block change information, which it stores in the memory as a directory binary tree.
Furthermore, the bitmap linked list uses a skip list structure so that bitmaps can be inserted and retrieved quickly. It comprises a plurality of bitmap offsets and corresponding bitmap pointers, and the bitmap pointers together form a file name pointer.
Further, the step S22 specifically includes the following steps:
S221, when an IO operation occurs under the file path to be tracked in the operating system, the drive unit takes the full path name of the file as input and calculates the corresponding hash value according to a preset hash table;
S222, according to the calculated hash value, the drive unit generates or updates the bitmap linked list;
S223, according to the bitmap linked list information, the drive unit updates the directory binary tree currently stored in the memory to generate a new directory binary tree;
S224, using memory fragmentation processing, the drive unit requests allocated memory space from the memory and stores the new directory binary tree in that space.
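Steps S221 and S222 can be illustrated with a minimal Python sketch of a hash table keyed on the full path name. It is not the patented driver: `NUM_BUCKETS`, `bucket_index`, and `ChangeTable` are hypothetical names, CRC32 stands in for the unspecified hash function, and a plain list replaces the skip list that the text uses for colliding entries.

```python
import zlib

NUM_BUCKETS = 1024  # hypothetical table size; the text does not give one


def bucket_index(full_path: str) -> int:
    """Map a full path name to a bucket (CRC32 is an illustrative choice)."""
    return zlib.crc32(full_path.encode("utf-8")) % NUM_BUCKETS


class ChangeTable:
    """Records (offset, length) changes per file, looked up by path hash."""

    def __init__(self):
        # Each bucket holds (full_path, changes) pairs so that paths whose
        # hashes collide can coexist. The text uses a skip list here.
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def record(self, full_path, offset, length):
        bucket = self.buckets[bucket_index(full_path)]
        for path, changes in bucket:
            if path == full_path:          # existing entry: update it
                changes.append((offset, length))
                return
        bucket.append((full_path, [(offset, length)]))  # new entry

    def changes_for(self, full_path):
        for path, changes in self.buckets[bucket_index(full_path)]:
            if path == full_path:
                return list(changes)
        return []
```

Calling `record("/db/data.ibd", 0, 4096)` on each captured IO and `changes_for("/db/data.ibd")` at read time mirrors the generate-or-update behaviour of S222 under these assumptions.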
Further, the IO information captured under the file path to be tracked includes the file name modification, the start address of the data change, and the length of the data change.
Further, memory fragmentation processing works as follows: a main array is constructed according to a preset memory allocation capacity, and each array entry in the main array corresponds to a group of space blocks. Each space block consists of an addressing head and a data area, and the space blocks together form an idle linked list. The main array also holds a pointer to the first idle node of the idle linked list. When the drive unit requests memory space, a space block is allocated through this pointer.
Further, the directory binary tree consists of parent nodes, left child nodes and right child nodes. The left child node points to a child directory or child file, and the right child node points to a sibling directory or sibling file at the same level.
Compared with the prior art, the invention has the following advantages:
The client initiates data reading or directory monitoring tasks, the CDP manager carries the data interaction between the client and the drive unit, and the drive unit, being connected to the operating system data interface, captures file data block change information from the operating system. The client can therefore obtain the specific details of file changes directly and quickly and complete the corresponding data backup operation, which allows the invention to work across different database applications.
When storing captured file data block change information in the memory, the drive unit combines a bitmap linked list, memory fragmentation processing, and a directory binary tree. The bitmap linked list allows nodes to be found quickly. Memory fragmentation processing manages the small idle blocks of the memory effectively and shortens the time needed to request and release memory. The directory binary tree stores modified file path information reliably and conveniently while reducing memory consumption. Together these improve the reliability of storing file data block change information and ensure that directory monitoring tasks execute accurately and quickly.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
FIG. 2 is a schematic flow diagram of the process of the present invention;
FIG. 3 is a diagram of a data structure according to the present invention;
FIG. 4 is a diagram of the directory binary tree structure in an embodiment;
FIG. 5 is a diagram illustrating memory fragmentation in an embodiment;
Reference numerals in the figures: 1, client; 2, drive unit; 3, CDP manager.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
Examples
As shown in FIG. 1, a file data block change information monitoring and management system includes a client 1 and a drive unit 2. The client 1 is connected to the drive unit 2 through a CDP manager 3, and the drive unit 2 is also connected to an operating system data interface and to a memory. The client 1 and the CDP manager 3 belong to the user layer, and the drive unit 2 belongs to the kernel layer. The drive unit 2 includes a memory allocation module and a directory binary tree generation unit, each connected to the memory; the directory binary tree generation unit is also connected to the operating system data interface. The memory allocation module obtains from the memory the space used to store file data block change information. The directory binary tree generation unit captures the name, position and size change data of file data blocks from the operating system and generates a corresponding directory binary tree, which is stored in the memory as the file data block change information.
The CDP manager 3 mainly makes data interaction between the user layer and the drive unit 2 simpler and more flexible. It is provided to the user layer as a library and exposes interfaces for reading and setting data, so that the application layer does not need to deal with the details of reading data and only has to process it;
the drive unit 2 is used for capturing file data block change information from the operating system, storing the captured information in the memory, and transmitting the captured information to the client 1 through the CDP manager 3;
the client 1 is used for initiating a data reading or directory monitoring task, receiving the information captured by the drive unit 2 from the operating system, and performing data backup processing.
The data interaction between the client 1 and the drive unit 2 is shown as steps ① to ⑥ in FIG. 1. First, the user layer initiates data reading or directory monitoring. Second, the CDP manager interacts with the drive unit and sends a read data request. Third, after capturing file data block change information, the drive unit returns it to the CDP manager, which extracts the corresponding first address from the information it receives. Fourth, the CDP manager returns the first address to the user layer, and the user layer completes the backup operation for the corresponding data once it receives the first address data. Fifth, the user layer sends the backup completion result to the CDP manager. Sixth, the CDP manager sends the backup completion response to the drive unit; after receiving a correct response, the drive unit continues reading data and transmits it to the user layer.
When the system is applied in practice, the file data block change information monitoring and management method is as shown in FIG. 2 and comprises the following steps:
S1, the client initiates a directory monitoring task and transmits the directory monitoring task request to the drive unit through the CDP manager, wherein the directory monitoring task request includes the file path to be tracked;
S2, after receiving the directory monitoring task request, the drive unit captures the change information of the corresponding file data blocks from the operating system through the operating system data interface and stores it in the memory, specifically:
after receiving the directory monitoring task request, the drive unit first stores the file path to be tracked in the memory as a binary tree and then enables tracking for that path;
when an IO operation occurs under the file path to be tracked in the operating system, the drive unit takes the full path name of the file as input and calculates the corresponding hash value according to a preset hash table;
according to the calculated hash value, the drive unit generates or updates the bitmap linked list;
then, according to the bitmap linked list information, the drive unit updates the directory binary tree currently stored in the memory to generate a new directory binary tree;
finally, using memory fragmentation processing, the drive unit requests allocated memory space from the memory and stores the new directory binary tree in that space;
the bitmap linked list uses a skip list structure so that bitmaps can be inserted and retrieved quickly, and comprises a plurality of bitmap offsets and corresponding bitmap pointers, wherein the bitmap pointers together form a file name pointer;
the IO information captured under the file path to be tracked includes the file name modification, the start address of the data change, and the length of the data change;
memory fragmentation processing works as follows: a main array is constructed according to a preset memory allocation capacity, each array entry in the main array corresponds to a group of space blocks, each space block consists of an addressing head and a data area, the space blocks together form an idle linked list, the main array also holds a pointer to the first idle node of the idle linked list, and when the drive unit requests memory space, a space block is allocated through this pointer;
the directory binary tree consists of parent nodes, left child nodes and right child nodes, wherein the left child node points to a child directory or child file, and the right child node points to a sibling directory or sibling file at the same level;
S3, the client initiates a data reading task and transmits the data reading task request to the drive unit through the CDP manager;
S4, the drive unit transmits the corresponding file data block change information in the memory to the CDP manager, and the CDP manager extracts the first address data of the file data block change information and transmits it to the client;
S5, according to the received first address data, the client completes the backup operation for the corresponding file data.
In the invention, the user layer sets the monitored directory at the system start-up stage. When monitored file data changes, the drive unit captures the change, including the file name, the start address of the data change, the length of the data change and other information. As shown in FIG. 3, the drive unit obtains the file name, converts it into a key value through a hash algorithm, and uses the key value to find the corresponding hash table entry. To handle a large number of hash collisions, file entries are inserted into a skip list. Each file entry contains the bitmap data of the file, which identifies the positions at which the file has changed.
To store the full path of each file while saving memory, a directory binary tree structure (as shown in FIG. 3) is adopted. Each node of the directory binary tree represents the name of one directory level or of a file. Each directory node has a left subtree and a right subtree: the left subtree represents the children of the directory, and the right subtree represents its siblings at the same level. To obtain the whole path corresponding to a file with O(1) time complexity, a parent directory pointer is added to each node for tracing back the path. In this way both the changed positions of a file and its full path are recorded.
The directory binary tree also records the file name when a file changes. With a large number of small files, simply recording the absolute path of every file wastes a large amount of memory, because the parent directory is stored once for each of its child files. The directory binary tree is designed to avoid this waste by reducing the number of times each parent directory name is stored.
As shown in FIG. 4, there are four files under the root directory: /x/xx/1.txt, /x/xx/2.txt, /x/xxx/1.txt and /x/xx/sxx/1.txt, with x at the root directory position. For the first file, xx is a subdirectory of x and is placed in the left subtree of x; 1.txt is a child file of xx and is placed in the left subtree of xx. For the second file, a query of the directory binary tree finds that x already exists and that the subdirectory xx also exists, but 2.txt is not found under xx, so 2.txt is inserted into the right subtree of 1.txt. For the third file, the subdirectory xxx cannot be found in the left subtree of x, so xxx is inserted into the left subtree of x, and 1.txt is likewise inserted into the left subtree of xxx. The fourth file is stored in the same way, as shown in FIG. 4.
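The FIG. 4 walk-through can be reproduced with a short Python sketch of a left-child, right-sibling tree with parent pointers. The class and function names (`Node`, `DirTree`, `full_path`) are hypothetical, and an unnamed root node standing for "/" is an assumption of this sketch. A new sibling is appended at the end of the right-pointer chain, so xxx ends up as the right node of xx inside the left subtree of x.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # parent directory, used to trace back the path
        self.left = None      # first child directory or child file
        self.right = None     # next sibling at the same level


class DirTree:
    def __init__(self):
        self.root = Node("")  # hypothetical unnamed node standing for "/"

    def insert(self, path):
        """Insert every component of an absolute path; return the leaf node."""
        node = self.root
        for part in path.strip("/").split("/"):
            node = self._find_or_add_child(node, part)
        return node

    @staticmethod
    def _find_or_add_child(parent, name):
        child = parent.left
        if child is None:                  # directory had no children yet
            parent.left = Node(name, parent)
            return parent.left
        while True:                        # walk the sibling chain
            if child.name == name:
                return child               # already stored once, reuse it
            if child.right is None:
                child.right = Node(name, parent)
                return child.right
            child = child.right


def full_path(node):
    """Rebuild the absolute path by following parent pointers to the root."""
    parts = []
    while node.parent is not None:
        parts.append(node.name)
        node = node.parent
    return "/" + "/".join(reversed(parts))
```

Inserting the four example paths stores the name x once and xx once, however many files sit below them, which is the memory saving the paragraph describes.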
In addition, the drive unit must store file data block change information in the memory as it is captured in real time. If it requested a buffer of only tens of bytes from the system each time, a large number of such requests would cause severe memory fragmentation and reduce system efficiency. The fragmentation processing scheme is designed specifically to solve this problem: a small memory request completes with O(1) time complexity and has essentially no effect on system efficiency. In this embodiment, as shown in FIG. 5, a single small space block consists of 8+128 bytes: 8 bytes are occupied by the addressing head, and 128 bytes are the small memory that is actually used. A large memory region is thus split into small blocks that are strung into a manageable linked list. The pointer in the array entry always points to the first idle node of the idle linked list behind it, so when memory is needed it can be allocated from the pointer position without a time-consuming search.
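A minimal Python sketch of such a fixed-size block pool follows, using the 8-byte addressing head and 128-byte data area from the embodiment. `BlockPool` and its methods are hypothetical names, the head is assumed to store the index of the next idle block, and a `bytearray` stands in for the kernel memory region.

```python
HEADER = 8                    # addressing head, in bytes
PAYLOAD = 128                 # usable small memory, in bytes
BLOCK = HEADER + PAYLOAD
NIL = (1 << 64) - 1           # hypothetical "no next block" marker


class BlockPool:
    def __init__(self, num_blocks):
        self.mem = bytearray(num_blocks * BLOCK)  # one large region
        # String every block into the idle list through its addressing head.
        for i in range(num_blocks):
            self._set_next(i, i + 1 if i + 1 < num_blocks else NIL)
        self.first_free = 0 if num_blocks else NIL

    def _set_next(self, i, nxt):
        self.mem[i * BLOCK:i * BLOCK + HEADER] = nxt.to_bytes(HEADER, "little")

    def _get_next(self, i):
        return int.from_bytes(self.mem[i * BLOCK:i * BLOCK + HEADER], "little")

    def alloc(self):
        """O(1): take the block the first-idle pointer refers to."""
        if self.first_free == NIL:
            return None
        i = self.first_free
        self.first_free = self._get_next(i)
        return i

    def free(self, i):
        """O(1): push the block back onto the head of the idle list."""
        self._set_next(i, self.first_free)
        self.first_free = i

    def write(self, i, data):
        assert len(data) <= PAYLOAD
        start = i * BLOCK + HEADER
        self.mem[start:start + len(data)] = data

    def read(self, i):
        start = i * BLOCK + HEADER
        return bytes(self.mem[start:start + PAYLOAD])
```

Both `alloc` and `free` touch only the head pointer and one addressing head, which is why neither depends on how many blocks the pool holds.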
In this embodiment, so that transmitted data can be used efficiently without retransmission, a 16MB cache space is also provided in the drive unit. When the user layer receives the 16MB of data, it processes the data on its own. Until the user layer confirms, the 16MB of data cached in the kernel is retained. Once the user layer confirms that all uploaded data has been processed, the 16MB of cached data can be deleted and the cache is filled with new data. If the user layer exits partway through, the drive unit detects that the application layer has exited, reclaims the data in the 16MB cache, and resends it when the user layer restarts and reads again. Abnormal process exits in the user layer during data processing are thereby handled reliably.
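The confirm-then-delete behaviour of the cache can be sketched in Python as below. `DriverCache`, `capture`, `read_batch`, and `acknowledge` are hypothetical names, records are modelled as byte strings, and each record is assumed to be smaller than the cache capacity.

```python
from collections import deque

CACHE_BYTES = 16 * 1024 * 1024  # 16 MB, as in the embodiment


class DriverCache:
    def __init__(self, capacity=CACHE_BYTES):
        self.capacity = capacity
        self.pending = deque()   # captured records not yet in the cache
        self.in_flight = []      # cache contents awaiting confirmation

    def capture(self, record):
        self.pending.append(record)

    def read_batch(self):
        """Return the cached batch; refill the cache only when it is empty."""
        if not self.in_flight:
            size = 0
            while self.pending and size + len(self.pending[0]) <= self.capacity:
                record = self.pending.popleft()
                self.in_flight.append(record)
                size += len(record)
        # An unconfirmed batch is returned again, which models the resend
        # after the user layer exits and restarts reading.
        return list(self.in_flight)

    def acknowledge(self):
        """User layer confirmed processing: the cached data can be dropped."""
        self.in_flight.clear()
```

A reader that crashes between `read_batch` and `acknowledge` simply receives the same batch on its next `read_batch` call, so no captured change is lost under these assumptions.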
In summary, in copy data management, protecting a database application normally requires adaptation specific to that application. The invention instead relies on the fact that database applications store their data in files at the bottom layer: it captures changes to database files through a file filter driver, which yields a scheme that works across different database applications and reduces the complexity of adapting to each of them. The file filter driver can track the specific details of a file change, such as a file name modification and the position and size of the modified data. After capture, the tracked data is sent to the storage of the user layer, where a copy snapshot is executed, completing the data backup and enabling the protection and use of the data.
The invention divides the file filter driver into an application layer and a kernel layer. The application layer first sets the database file paths to be tracked and the resource allocation information. After the kernel layer receives the paths to be tracked, it stores them in a binary tree structure and enables tracking for them. When an IO operation occurs under a tracked file path, the change to the file is tracked automatically; the hash value of the file path is calculated and stored in the corresponding bitmap linked list, so that bitmaps can be inserted and retrieved quickly. When the application layer needs the changes under a file path, it obtains the changes to the database files through interaction between the application layer module and the kernel module.
Therefore, no individual database application needs to be adapted separately; the file filter driver tracks the files at the bottom layer of the database, which makes the approach general.
It should be noted that the invention is based on a hash table and is suited to scenarios with a large number of small files: lookup speed remains adequate when the number of files reaches the tens of millions. The key value of the hash table is calculated with the full path name of the file as input.
The directory binary tree structure stores the modified file path information and copes with the heavy memory consumption that a large number of changed files would otherwise cause. Nodes in the directory binary tree are divided into parent nodes, left child nodes and right child nodes, and each node stores the name of the directory it corresponds to. The right pointer of a node points to a sibling node, the left pointer points to a child file, and sibling nodes point back to the parent node, so that the whole path can be obtained quickly.
The skip list structure is used to find nodes quickly, that is, to quickly find the corresponding file name under a directory. It is used in two places. The first is the collision nodes of the hash table, where lookups in the collision list need to be sped up when the number of files is very large. The second is a directory that contains a large number of files, where the skip list speeds up finding a file. The bitmap is mainly used to mark quickly which positions of a file have changed data, and it also prevents excessive memory consumption when a file changes heavily: each 4KB range inside the file is represented by 1 bit, and the bit is set when that range changes.
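The 1-bit-per-4KB marking can be sketched in Python as follows. `ChangeBitmap`, `mark`, and `changed_blocks` are hypothetical names, and a single growable `bytearray` is a simplification of the bitmap offsets and bitmap pointers kept in the skip list.

```python
BLOCK_SIZE = 4096  # one bit covers 4KB of the file


class ChangeBitmap:
    def __init__(self):
        self.bits = bytearray()  # grows as higher offsets are touched

    def mark(self, offset, length):
        """Set the bit of every 4KB block overlapped by [offset, offset+length)."""
        if length <= 0:
            return
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        needed = last // 8 + 1
        if len(self.bits) < needed:
            self.bits.extend(bytes(needed - len(self.bits)))
        for block in range(first, last + 1):
            self.bits[block // 8] |= 1 << (block % 8)

    def changed_blocks(self):
        """Return the indices of all 4KB blocks marked as changed."""
        return [
            i * 8 + bit
            for i, byte in enumerate(self.bits)
            for bit in range(8)
            if (byte >> bit) & 1
        ]
```

Under this layout a 1GB file needs 32KB of bitmap regardless of how many writes hit it, which is the bound on memory consumption the paragraph refers to.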
The memory fragmentation structure is adopted to avoid the problem that the performance of the system is reduced due to the fact that a large number of cores apply for small memories, the memory fragmentation processing mainly adopts a chain table mode to manage the small idle memories, so that a large number of applications or releases of the memories can be completed in a very short time, and the time complexity is O (1).
In addition, in practical application, this embodiment combines multithreading and uses a dual-thread scheme with separate read and write threads, so that write operations are not affected while the read thread operates on the data table, and data changes are recorded and data is read as quickly as possible. A breakpoint-resume mechanism is also used to handle the case where the process exits abnormally while the user layer is processing data. To make efficient use of the transmitted data and avoid unnecessary retransmission, a 16 MB cache space is provided in the driver. The user layer processes the data each time it receives 16 MB, and the 16 MB of data cached in the kernel is retained until the user layer confirms it. Once the user layer confirms that the uploaded data has been completely processed, the 16 MB of cached data can be deleted and the cache is refilled with new data. If the user layer exits midway, the driver, on detecting that the application layer has exited, reclaims the data in the 16 MB cache and resends it when the user layer restarts and reads again.
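The confirm-before-discard behaviour of the cache can be sketched as follows. This Python model is only an illustration of the acknowledgement logic: `DriverBuffer`, `record`, `read` and `ack` are hypothetical names, the capacity is a parameter rather than a fixed 16 MB, and each record is assumed to be no larger than the capacity.

```python
from collections import deque


class DriverBuffer:
    """Holds a delivered batch until the reader acknowledges it."""

    def __init__(self, capacity):
        self.capacity = capacity  # 16 MB in the description; a parameter here
        self.queue = deque()      # change records not yet handed to a reader
        self.in_flight = []       # batch delivered but not yet acknowledged

    def record(self, item):
        """Called on the write side whenever a change record is captured."""
        self.queue.append(item)

    def read(self):
        """Return the pending batch; the same batch is returned again until
        ack() is called, which models resending after a reader restart."""
        if not self.in_flight:
            size = 0
            while self.queue and size + len(self.queue[0]) <= self.capacity:
                item = self.queue.popleft()
                self.in_flight.append(item)
                size += len(item)
        return list(self.in_flight)

    def ack(self):
        """Reader confirms the batch was processed; it can now be dropped."""
        self.in_flight.clear()
```

If the reader exits after `read()` but before `ack()`, nothing is lost: the next `read()` after the restart returns the unacknowledged batch again.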

Claims (10)

1. A file data block change information monitoring and management system, characterized by comprising a client (1) and a drive unit (2), wherein the client (1) is connected with the drive unit (2) through a CDP manager (3), the drive unit (2) is connected to an operating system data interface and to a memory, respectively, and the CDP manager (3) is used for transmitting data information between the client (1) and the drive unit (2);
the drive unit (2) is used for capturing file data block change information from an operating system, storing the captured information into the memory, and transmitting the captured information to the client (1) through the CDP manager (3);
the client (1) is used for initiating a data reading task or a directory monitoring task, receiving the information captured by the drive unit (2) from the operating system, and performing data backup processing.
2. The system for monitoring and managing the change information of the file data block according to claim 1, wherein the drive unit (2) includes a memory allocation module and a directory binary tree generation unit, which are respectively connected to the memory, the directory binary tree generation unit is further connected to the data interface of the operating system, the memory allocation module is configured to obtain a space for storing the change information of the file data block from the memory, and the directory binary tree generation unit is configured to capture the name, the position, and the size change data of the file data block from the operating system and generate a corresponding directory binary tree to be stored in the memory as the change information of the file data block.
3. A file data block change information monitoring and managing method applying the file data block change information monitoring and managing system according to claim 1, characterized by comprising the steps of:
S1, the client initiates a directory monitoring task and transmits the initiated directory monitoring task request to the drive unit through the CDP manager;
S2, after receiving the directory monitoring task request, the drive unit captures the change information of the corresponding file data block from the operating system through the data interface of the operating system, and stores the change information of the file data block into the memory;
S3, the client initiates a data reading task and transmits the initiated data reading task request to the drive unit through the CDP manager;
S4, the drive unit transmits the corresponding file data block change information in the memory to the CDP manager, the CDP manager extracts the first address data of the file data block change information and transmits the extracted first address data to the client;
S5, according to the received first address data, the client completes the backup operation of the corresponding file data.
4. The method according to claim 3, wherein the request of the directory monitoring task includes a file path to be traced.
5. The method for monitoring and managing the file data block change information according to claim 4, wherein the step S2 specifically includes the following steps:
S21, after receiving the directory monitoring task request, the drive unit stores the file path to be traced in the memory in the form of a binary tree, and then starts the tracing mode for the file path to be traced;
S22, when an IO operation occurs in the operating system under the file path to be traced, the drive unit calculates the hash value of the file path to be traced and stores the hash value in the corresponding bitmap linked list to obtain the change information of the file data block, and stores the change information in the memory in the form of a directory binary tree.
6. The method as claimed in claim 5, wherein the bitmap linked list adopts a skip list structure to insert and obtain bitmaps quickly, the bitmap linked list includes a plurality of bitmap offsets and corresponding bitmap pointers, and the plurality of bitmap pointers together form a file name pointer.
7. The method for monitoring and managing the file data block change information according to claim 5, wherein the step S22 specifically includes the following steps:
S221, when an IO operation occurs in the operating system under the file path to be traced, the drive unit, according to a preset hash table, takes the full path name of the file to be traced as input and calculates the corresponding hash value;
S222, according to the calculated hash value, the drive unit generates or updates a bitmap linked list;
S223, according to the bitmap linked list information, the drive unit combines it with the current directory binary tree stored in the memory to update and generate a new directory binary tree;
S224, the drive unit applies to the memory for allocated memory space by means of memory fragmentation processing, and stores the new directory binary tree into the corresponding memory space.
8. The method according to claim 7, wherein the IO in the file path to be traced includes a file name modification, a start address of a data change, and a length of the data change.
9. The method according to claim 7, wherein the specific process of the memory fragmentation processing is as follows: a corresponding main array is constructed according to a preset memory allocation space capacity, wherein each array entry node in the main array corresponds to a space block, each space block consists of an addressing header and a data memory, and the space blocks jointly form a free linked list; a pointer to the first free node in the free linked list is further arranged in the main array, and when the drive unit applies to the memory for memory space, the space blocks are allocated through the pointer.
10. The method according to claim 7, wherein the directory binary tree has a specific structure comprising a parent node, a left child node and a right child node, wherein the left child node points to a child directory or a child file, and the right child node points to a sibling directory or a sibling file at the same level.
CN202011433430.4A 2020-12-09 2020-12-09 File data block change information monitoring and management system and method thereof Active CN112464044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011433430.4A CN112464044B (en) 2020-12-09 2020-12-09 File data block change information monitoring and management system and method thereof

Publications (2)

Publication Number Publication Date
CN112464044A (en) 2021-03-09
CN112464044B (en) 2023-04-07

Family

ID=74801078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011433430.4A Active CN112464044B (en) 2020-12-09 2020-12-09 File data block change information monitoring and management system and method thereof

Country Status (1)

Country Link
CN (1) CN112464044B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090313503A1 (en) * 2004-06-01 2009-12-17 Rajeev Atluri Systems and methods of event driven recovery management
US7650533B1 (en) * 2006-04-20 2010-01-19 Netapp, Inc. Method and system for performing a restoration in a continuous data protection system
CN101751474A (en) * 2010-01-19 2010-06-23 山东高效能服务器和存储研究院 Continuous data protection method based on centralized storage
CN101777016A (en) * 2010-02-08 2010-07-14 北京同有飞骥科技有限公司 Snapshot storage and data recovery method of continuous data protection system
CN101833489A (en) * 2010-05-06 2010-09-15 北京邮电大学 Method for file real-time monitoring and intelligent backup
CN104407940A (en) * 2014-11-26 2015-03-11 上海爱数软件有限公司 Method for quickly recovering CDP system
CN105389230A (en) * 2015-10-21 2016-03-09 上海爱数信息技术股份有限公司 Continuous data protection system and method combining with snapshot technology
CN105550062A (en) * 2015-12-03 2016-05-04 上海爱数信息技术股份有限公司 Continuous data protection and time point browse recovery based data backflow method
CN107340971A (en) * 2016-04-28 2017-11-10 上海优刻得信息科技有限公司 A kind of data storage is with recovering framework and method
CN107885616A (en) * 2017-09-29 2018-04-06 上海爱数信息技术股份有限公司 A kind of mass small documents back-up restoring method based on file system parsing
CN110674502A (en) * 2019-09-19 2020-01-10 华为技术有限公司 Data detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李毅飞等: "一种基于平衡二叉树的CDP数据备份及重构方法", 《数据通信》 *
李虓等: "一种连续数据保护系统的快照方法", 《软件学报》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003562A (en) * 2021-12-29 2022-02-01 苏州浪潮智能科技有限公司 Directory traversal method, device and equipment and readable storage medium
CN114003562B (en) * 2021-12-29 2022-03-22 苏州浪潮智能科技有限公司 Directory traversal method, device and equipment and readable storage medium
CN115016988A (en) * 2022-08-08 2022-09-06 四川大学 CDP backup recovery method, system and storage medium based on binary tree log

Also Published As

Publication number Publication date
CN112464044B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant