CN116610636A - Data processing method and device of file system, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116610636A
Authority
CN
China
Prior art keywords
data
space
tree
data processing
binary tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310599874.2A
Other languages
Chinese (zh)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anchao Cloud Software Co Ltd
Original Assignee
Anchao Cloud Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anchao Cloud Software Co Ltd
Priority to CN202310599874.2A
Publication of CN116610636A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G06F16/1727 Details of free space management performed by the file system
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method of a file system. The method comprises: receiving and responding to a write operation request, selecting a log space based on a binary tree index, and writing the corresponding data; judging whether a preset condition is met; and, in response to the preset condition being met, converting the binary tree index into a B+ tree index for indexing the data space, so as to convert the log space into the data space. The invention also discloses a device, an electronic device and a storage medium for implementing the data processing method. The invention can improve the read-write performance of the file system and prolong the service life of the solid state disk.

Description

Data processing method and device of file system, electronic equipment and storage medium
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a data processing method of a file system, and an apparatus, an electronic device, and a storage medium for implementing the data processing method.
Background
B+ trees are currently the most commonly used index structure and are widely used in the index structures of many storage systems because of their efficient insertion, lookup, and modification. A B+ tree comprises a root node, internal nodes and leaf nodes, with all leaf nodes at the same level. Except for the root node, each node has at most m and at least m/2 child nodes and contains at most m-1 and at least (m/2)-1 key values, where m is the order of the B+ tree. A lookup starts from the root node, where the key k being searched for is compared with the keys [k1, k2, k3, … km-1] of the node. If k < k1, the search goes to the leftmost child of the node. Otherwise (k >= k1), k is compared with k2; if k < k2, then k lies between k1 and k2, so the search goes to the child between k1 and k2. If k >= k2, the same comparison is repeated with k3, k4, … km-1. These steps are repeated at each level until a leaf node is reached. If k exists in the leaf node, True is returned; otherwise False is returned.
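The lookup procedure described above can be sketched as follows. This is a minimal illustration and not code from the patent: the `Node` class, the `search` function and the sample tree are hypothetical, and the convention that a key equal to a separator is routed to the right-hand child is an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int]                                        # sorted keys k1..k(m-1)
    children: List["Node"] = field(default_factory=list)   # empty for a leaf node

    @property
    def is_leaf(self) -> bool:
        return not self.children


def search(root: Node, k: int) -> bool:
    """Descend from the root to a leaf and report whether k is stored there."""
    node = root
    while not node.is_leaf:
        # bisect_right counts the separator keys <= k; that count is the
        # index of the child whose key range contains k.
        node = node.children[bisect_right(node.keys, k)]
    return k in node.keys


# Hypothetical two-level tree: separators 60 and 400 over three leaves.
leaves = [Node([10, 30]), Node([60, 70, 80, 100]), Node([400, 500, 700, 800])]
root = Node([60, 400], leaves)
```

With this sample tree, `search(root, 70)` follows the middle child, while `search(root, 20)` ends in the leftmost leaf and finds nothing.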
Currently, after being updated in memory, nodes of the B+ tree need to be persisted to disk to ensure the integrity of the data. If the node to be modified (BNODE) is not in memory, it must first be loaded from disk into memory before it can be modified. In a storage system without battery protection, a power loss causes data loss if the in-memory update of the B+ tree has not yet been persisted to disk. But if every data update modifies the B+ tree and returns only after the change has been persisted, system performance is severely affected. Persistence here refers to a mechanism by which program data is moved between a transient state and a persistent state: it enables data (e.g., objects in memory) to be saved to a storage device that can hold it permanently (e.g., a disk). The primary application of persistence is storing in-memory objects in a database, a disk file, an XML data file, and so on.
To avoid the performance problems caused by updating the B+ tree on every data update, a log (Journal) is used to persist the key values (Keys) that need to be changed, which avoids frequently persisting the node data of the B+ tree. In a specific implementation, key-value updates are kept in a log buffer in memory and persisted at intervals or when the log buffer (Buffer) is full; that is, data is first written to the log layer and then written from the log layer to the data layer (e.g., stored on disk). Although this approach improves performance and reduces the number of B+ tree updates, a sudden power loss or other abnormality during this process loses the data in the log buffer and therefore loses user data. To ensure that no data is lost, the log buffer can be persisted on every key-value update, but this causes performance degradation, significant write amplification of log data, reduced disk service life, and similar problems. In addition, although persisting the log reduces the number of times B+ tree nodes are persisted, the in-memory B+ tree index still has to be updated so that the written data remains accessible; this involves a series of operations such as reading persisted data from disk, inserting into the B+ tree, and splitting nodes, which affects write IO performance.
To improve the read-write performance of the system, the log data write amplification, performance loss, and reduced solid state disk service life incurred in ensuring data consistency must be reduced. In practice, large-block IO is usually made to bypass (Bypass) the log layer so as to reduce write amplification as far as possible. This approach still has problems, however: it makes file system snapshots, clones, and similar features more complex to implement, mainly because the IO data that bypassed the log layer has to be taken into account. Large-block IO here refers to a read/write IO operation that covers a large number of consecutive sectors.
In summary, the existing approach of updating the B+ tree through log persistence has the drawbacks described above, and the problems it causes in the prior art need to be solved.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a data processing method of a file system, which can improve the read-write performance of the file system and prolong the service life of a solid state disk.
The invention also aims to provide a data processing device of the file system, which can realize the data processing method, improve the read-write performance of the file system and prolong the service life of the solid state disk.
The invention also aims to provide the electronic equipment, which can realize the data processing method, improve the read-write performance of a file system and prolong the service life of the solid state disk.
The invention also aims to provide a computer readable storage medium which can realize the data processing method, improve the read-write performance of a file system and prolong the service life of a solid state disk.
To achieve the above object, an embodiment of the present invention provides a data processing method of a file system, including:
receiving and responding to a write operation request, selecting a log space based on a binary tree index, and writing the corresponding data;
judging whether a preset condition is met;
in response to the preset condition being met, converting the binary tree index to a B+ tree index for indexing the data space, so as to convert the log space to the data space.
In one or more embodiments of the invention, each node of the binary tree stores file block information for recording file modification information, the file modification information including an offset position of the modified file, a length of the modification, a segment number in which the modification data is stored, and an offset position of the modification data in the segment.
In one or more embodiments of the present invention, the preset condition is one of the following: the log space is full, or a preset time has elapsed after a write operation.
In one or more embodiments of the invention, the binary tree is an AVL tree.
In one or more embodiments of the present invention, converting the binary tree index to a B+ tree index includes:
inserting a node of the binary tree into a node of the B+ tree;
judging whether the number of key values in the current node of the B+ tree exceeds a number limit;
in response to the number limit being exceeded, splitting the current node.
In one or more embodiments of the invention, converting the binary tree index to a B+ tree index further comprises:
sorting the nodes of the binary tree before inserting them into the nodes of the B+ tree.
In one or more embodiments of the invention, the nodes of the binary tree are sorted by the offset position of the modified file.
In one or more embodiments of the present invention, the data processing method further includes:
receiving and responding to a read operation request, and searching the corresponding log space through the binary tree;
judging whether the corresponding data exists in the log space;
in response to the data not existing in the log space, searching the data space through the B+ tree to retrieve the data.
In one or more embodiments of the present invention, the data processing method further includes:
recording segments used as log space;
after converting the log space to the data space, the record is purged.
An embodiment of the present invention provides a data processing apparatus of a file system, including:
the storage module is used for receiving and responding to the writing operation request, selecting a log space based on a binary tree index and writing corresponding data;
the judging module is used for judging whether preset conditions are met or not;
and the conversion module is used for converting the binary tree index into a B+ tree index for indexing the data space in response to the satisfaction of the preset condition so as to convert the log space into the data space.
An embodiment of the present invention provides an electronic device including:
at least one processor; and
at least one memory coupled to the at least one processor and storing a computer program for execution by the at least one processor, which when executed by the at least one processor, causes the electronic device to perform the method described above.
Embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program which when executed by a machine implements the method described above.
Compared with the prior art, the invention converts the binary tree index into a B+ tree index so as to convert the log space directly into the data space. This avoids first writing data to the log space and then reading it back from the log space and writing it to the data space, which reduces the number of writes to the solid state disk and prolongs its service life. It also avoids a series of operations such as reading persisted data, inserting into the B+ tree, and splitting nodes, which improves the write performance of the file system.
Drawings
FIG. 1 is a flow chart of a method of data processing of a file system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a solid state disk according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of converting a binary tree into a B+ tree in accordance with an embodiment of the present invention;
FIG. 4 is a block diagram of a data processing apparatus of a file system according to an embodiment of the present invention.
Detailed Description
Embodiments of the invention are described in detail below in conjunction with the accompanying drawings, and it is to be understood that the scope of the invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the term "comprise" or variations thereof such as "comprises" or "comprising", etc. will be understood to include the stated element or component without excluding other elements or components.
The data processing method of the file system can solve the problems of performance loss, reduced solid state disk service life, and the like caused in the prior art by updating the B+ tree through log persistence; that is, the method can improve the read-write performance of the file system and prolong the service life of the solid state disk.
As shown in fig. 1, in the data processing method of a file system according to a preferred embodiment of the present invention, data is first written into a log space during a write operation, and the log space is then converted directly into a data space when a preset condition is satisfied. This space conversion avoids the performance loss, reduced solid state disk service life, and similar problems caused in the prior art by updating the B+ tree through log persistence. The log space conversion is accomplished by converting the binary tree that indexes the log space into a B+ tree that indexes the data space.
Specifically, as shown in fig. 1, a data processing method of a file system includes the following steps:
S10, receiving and responding to a write operation request, and selecting a log space based on a binary tree index to write the corresponding data;
Specifically, as shown in fig. 2, the storage space of the solid state disk is divided into a plurality of consecutive segments (Segments), and the storage space is managed in units of segments. For ease of management, each segment is typically given a fixed size, e.g., 1 MB. Each segment may be further divided into a plurality of blocks of the same size, e.g., 4 KB each. A data structure such as a bitmap (Bitmap) records which blocks are free and which are in use. When a write is actually performed, a free segment, or free space within a segment, is usually found in the remaining space of the solid state disk to store the data.
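The segment and block layout described above can be illustrated with a small sketch. It is an assumption-laden illustration rather than the patent's implementation: the `Segment` class and its methods are hypothetical names, a Python integer stands in for the bitmap, and a first-fit search for a contiguous run of blocks is one possible allocation policy among many.

```python
SEGMENT_SIZE = 1 << 20                             # 1 MB per segment (example size from the text)
BLOCK_SIZE = 4 << 10                               # 4 KB per block (example size from the text)
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE    # 256 blocks


class Segment:
    """One fixed-size segment whose blocks are tracked by a bitmap."""

    def __init__(self, segment_id: int) -> None:
        self.segment_id = segment_id
        self.bitmap = 0          # bit i set means block i is in use

    def allocate(self, nblocks: int) -> int:
        """Mark the first free run of nblocks as used; return its start index or -1."""
        mask = (1 << nblocks) - 1
        for start in range(BLOCKS_PER_SEGMENT - nblocks + 1):
            if self.bitmap & (mask << start) == 0:
                self.bitmap |= mask << start
                return start
        return -1

    def free(self, start: int, nblocks: int) -> None:
        """Clear the bits of nblocks blocks beginning at start."""
        self.bitmap &= ~(((1 << nblocks) - 1) << start)

    def is_idle(self) -> bool:
        """A segment with no used block can be handed out as a whole."""
        return self.bitmap == 0
```

A write of 16 KB would, under these assumptions, call `allocate(4)` on a segment and store the data at byte offset `start * BLOCK_SIZE` within it.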
Further, in the solid state disk, a certain number of segments (segments) are selected as log spaces to store data, and at the same time, a certain number of segments are selected as data spaces. The number of log spaces and data spaces here is dynamically changing, i.e. the log space can be converted into a data space. In this embodiment, the log space and the data space are configured in the same solid state disk, and in other embodiments, the log space and the data space may be configured separately, for example, the log space may be configured in one solid state disk, the data space may be configured in another solid state disk, and so on.
Further, the log space is indexed by a binary tree, and the data stored in the log space can be retrieved through the binary tree index. The binary tree here is preferably a balanced binary tree (AVL tree). Each node of the binary tree may record multiple pieces of file block information, e.g., 1024 pieces. Each piece of file block information records modification information for a file: the offset position (Offset) of the modification in the file, the length of the modification, the number of the segment (SegmentID) in which the modified data is stored, and the offset position (Offset) of the modified data in that segment. The data space is indexed by a B+ tree, and the data stored in the data space can be retrieved through the B+ tree index. The B+ tree comprises a root node, internal nodes and leaf nodes, with all leaf nodes at the same level; except for the root node, each node has at most m and at least m/2 child nodes and contains at most m-1 and at least (m/2)-1 key values, where m is the order of the B+ tree. Leaf nodes of the B+ tree store data records such as file block information, each of which records the same four items of file modification information listed above.
Further, after receiving the write operation request, the file system further responds to the write operation request and writes the data corresponding to the write operation request into the idle log space. After writing the data into the corresponding log space, the binary tree may be updated such that the binary tree may record modification information for the file to index the corresponding data using the binary tree.
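To make the log-space index concrete, the sketch below models one piece of file block information and an AVL tree keyed by file offset. It is illustrative only: the names `FileBlockInfo`, `AvlNode`, `insert` and `in_order` are hypothetical, each node holds a single entry (the text allows many per node), and replacing an entry when the same offset is written again is an assumption; overlapping ranges are not handled.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class FileBlockInfo:
    file_offset: int      # offset position of the modification in the file
    length: int           # length of the modification
    segment_id: int       # segment in which the modified data is stored
    segment_offset: int   # offset of the modified data inside that segment


class AvlNode:
    def __init__(self, info: FileBlockInfo) -> None:
        self.info = info
        self.left: Optional["AvlNode"] = None
        self.right: Optional["AvlNode"] = None
        self.height = 1


def _height(node: Optional[AvlNode]) -> int:
    return node.height if node else 0


def _update(node: AvlNode) -> None:
    node.height = 1 + max(_height(node.left), _height(node.right))


def _rotate_right(y: AvlNode) -> AvlNode:
    x = y.left
    y.left, x.right = x.right, y
    _update(y)
    _update(x)
    return x


def _rotate_left(x: AvlNode) -> AvlNode:
    y = x.right
    x.right, y.left = y.left, x
    _update(x)
    _update(y)
    return y


def insert(node: Optional[AvlNode], info: FileBlockInfo) -> AvlNode:
    """Insert info keyed by file_offset and return the rebalanced subtree root."""
    if node is None:
        return AvlNode(info)
    if info.file_offset < node.info.file_offset:
        node.left = insert(node.left, info)
    elif info.file_offset > node.info.file_offset:
        node.right = insert(node.right, info)
    else:
        node.info = info          # assumption: a later write replaces the entry
        return node
    _update(node)
    balance = _height(node.left) - _height(node.right)
    if balance > 1:               # left-heavy
        if _height(node.left.left) < _height(node.left.right):
            node.left = _rotate_left(node.left)
        return _rotate_right(node)
    if balance < -1:              # right-heavy
        if _height(node.right.right) < _height(node.right.left):
            node.right = _rotate_right(node.right)
        return _rotate_left(node)
    return node


def in_order(node: Optional[AvlNode]) -> Iterator[FileBlockInfo]:
    """Yield entries in ascending file_offset order."""
    if node:
        yield from in_order(node.left)
        yield node.info
        yield from in_order(node.right)
```

After each write lands in a log segment, the file system would call something like `root = insert(root, FileBlockInfo(...))`; the in-order traversal later supplies the entries already sorted by file offset.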
S20, judging whether a preset condition is met;
S30, converting the binary tree index into a B+ tree index in response to the preset condition being met, so as to convert the log space into a data space based on the B+ tree index.
Specifically, after the data corresponding to the write operation request has been stored in the log space, whether a preset condition is met is judged so that the subsequent operations can be executed. The preset condition here is one of the following: the log space is full, or a preset time has elapsed after the write operation. In other embodiments, the preset condition may of course be set according to actual requirements. In this embodiment, after the data corresponding to the write operation request has been stored in the log space, it is determined whether the log space is full or whether a preset time has elapsed after the write operation.
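The check in step S20 might look like the sketch below. Every name here (`should_convert` and its parameters) is hypothetical, and measuring fullness by counting log segments is an assumption; the patent only states the two conditions themselves.

```python
import time
from typing import Optional


def should_convert(used_log_segments: int,
                   log_capacity_segments: int,
                   last_write_time: float,
                   timeout_seconds: float,
                   now: Optional[float] = None) -> bool:
    """Return True when the log space is full or the preset time has elapsed."""
    if now is None:
        now = time.monotonic()    # monotonic clock, unaffected by wall-clock changes
    log_full = used_log_segments >= log_capacity_segments
    timed_out = (now - last_write_time) >= timeout_seconds
    return log_full or timed_out
```

Passing `now` explicitly keeps the function easy to test; in normal use it would be omitted.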
After it is determined that the preset condition is met, the binary tree index is converted into a B+ tree index so that the log space is converted directly into the data space. This avoids first writing data to the log space and then reading it from the log space and writing it to the data space, which reduces the number of writes to the solid state disk and prolongs its service life. It also avoids a series of operations such as reading persisted data, inserting into the B+ tree, and splitting nodes, which improves the write performance of the file system.
In this embodiment, to enable recovery of the log, the segments that are used as log space are recorded. These records need to be cleared after the log space has been converted to the data space.
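The bookkeeping described above can be pictured with a small in-memory sketch. The class and method names are hypothetical, and how (or whether) the record is itself persisted is not specified in the text, so this only shows the record-then-clear life cycle.

```python
from typing import List, Set


class LogSegmentRecord:
    """Tracks which segment IDs currently serve as log space."""

    def __init__(self) -> None:
        self._log_segments: Set[int] = set()

    def mark_log(self, segment_id: int) -> None:
        """Record that a segment has been taken into use as log space."""
        self._log_segments.add(segment_id)

    def segments_to_recover(self) -> List[int]:
        """Segments that would need to be examined when recovering the log."""
        return sorted(self._log_segments)

    def convert_all_to_data(self) -> List[int]:
        """Clear the record once the log space has become data space."""
        converted = sorted(self._log_segments)
        self._log_segments.clear()
        return converted
```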
As shown in fig. 3, the conversion from the binary tree to the B+ tree is described in detail below, taking as an example the case in which each node of the binary tree stores one piece of file block information (e.g., an offset position of the modified file) and each node of the B+ tree stores one piece of file block information (e.g., an offset position of the modified file).
Specifically, when converting the binary tree into the B+ tree, the nodes of the binary tree are first sorted. In this embodiment, the nodes may be sorted by the offset position of the modified file that each node records. After sorting, the nodes of the binary tree are inserted into the B+ tree, and when the number of key values in a leaf node of the B+ tree exceeds the limit, that node is split.
Fig. 3 shows the process of converting a binary tree containing three nodes into a B+ tree. The binary tree records modifications of 3 file blocks, with file offset positions of 20, 50 and 300 respectively. The B+ tree records modifications of 10 file blocks, with file offset positions of 10, 30, 60, 70, 80, 100, 400, 500, 700 and 800 respectively.
When the insert operations are performed, node 20 is inserted first: the leftmost leaf node of the B+ tree is located and 20 is inserted into it. Nodes 50 and 300 are then inserted in sequence. After each insertion, it is judged whether the number of key values in the current node exceeds the limit, i.e., whether it is greater than m-1, where m is the order of the B+ tree. If the number of key values in the current node is less than or equal to m-1, no split is performed; if it is greater than m-1, the node is split.
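The insert-and-split procedure can be sketched with the offsets from the Fig. 3 example. This is a simplified illustration under stated assumptions, not the patent's implementation: the classes `BPlusNode` and `BPlusTree` and the function `convert` are hypothetical, keys are bare file offsets assumed to be unique, the order m = 4 is chosen arbitrarily because the text does not give one, and leaf chaining and minimum-occupancy handling are omitted.

```python
from bisect import bisect_right, insort
from typing import Iterable, List, Optional, Tuple


class BPlusNode:
    def __init__(self, leaf: bool = True) -> None:
        self.leaf = leaf
        self.keys: List[int] = []
        self.children: List["BPlusNode"] = []    # used by internal nodes only


class BPlusTree:
    def __init__(self, order: int) -> None:
        self.order = order                       # m: at most m-1 keys per node
        self.root = BPlusNode()

    def insert(self, key: int) -> None:
        split = self._insert(self.root, key)
        if split:                                # root was split: add a level
            sep, right = split
            new_root = BPlusNode(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node: BPlusNode, key: int) -> Optional[Tuple[int, BPlusNode]]:
        if node.leaf:
            insort(node.keys, key)
        else:
            i = bisect_right(node.keys, key)
            split = self._insert(node.children[i], key)
            if split:
                sep, right = split
                node.keys.insert(i, sep)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.order - 1:     # within the limit: no split
            return None
        return self._split(node)

    def _split(self, node: BPlusNode) -> Tuple[int, BPlusNode]:
        mid = len(node.keys) // 2
        right = BPlusNode(leaf=node.leaf)
        if node.leaf:                            # separator is copied up
            right.keys = node.keys[mid:]
            node.keys = node.keys[:mid]
            return right.keys[0], right
        sep = node.keys[mid]                     # separator moves up
        right.keys = node.keys[mid + 1:]
        right.children = node.children[mid + 1:]
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return sep, right

    def leaf_keys(self) -> List[int]:
        """Collect all keys from the leaves, left to right."""
        out: List[int] = []
        stack = [self.root]
        while stack:
            n = stack.pop()
            if n.leaf:
                out.extend(n.keys)
            else:
                stack.extend(reversed(n.children))
        return out


def convert(log_offsets: Iterable[int], tree: BPlusTree) -> None:
    """Sort the binary-tree entries by file offset, then insert each one."""
    for offset in sorted(log_offsets):
        tree.insert(offset)


tree = BPlusTree(order=4)
for existing in [10, 30, 60, 70, 80, 100, 400, 500, 700, 800]:
    tree.insert(existing)
convert([300, 20, 50], tree)
```

In this run, offset 20 lands in the leftmost leaf, matching the description above; only the index changes, and the data already written to the log segments stays where it is.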
Once the binary tree has been converted into the B+ tree, the data in the corresponding log space can be indexed directly by the B+ tree; that is, the log space indexed by the binary tree becomes data space of the B+ tree, which realizes the conversion from log space to data space. After the binary tree has been converted into the B+ tree, the nodes of the binary tree can be deleted.
Further, the data processing method of the file system of the present invention further includes the following steps:
A read operation request is received and responded to, and the corresponding log space is searched through the binary tree. It is then judged whether the corresponding data exists in the log space. When the corresponding data does not exist in the log space, or only part of it exists there, the data space is further searched through the B+ tree to read the corresponding data. When the corresponding data exists in the log space, it is read directly.
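The read path can be sketched as a two-level lookup. For brevity, plain dictionaries stand in for the binary tree (log index) and the B+ tree (data index); that substitution, the `locate` function, and the per-block granularity are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Location = Tuple[int, int]    # (segment_id, offset inside the segment)


def locate(block_offsets: Iterable[int],
           log_index: Dict[int, Location],
           data_index: Dict[int, Location]) -> Dict[int, Optional[Location]]:
    """Resolve each requested file offset, preferring the log space."""
    result: Dict[int, Optional[Location]] = {}
    for offset in block_offsets:
        location = log_index.get(offset)          # newest data lives in the log space
        if location is None:
            location = data_index.get(offset)     # fall back to the data space
        result[offset] = location                 # None: block was never written
    return result
```

A request that is only partly covered by the log space is served from both indexes, which corresponds to the partial-hit case described above.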
Further, to ensure data consistency and reduce the complexity of file system snapshots, the index data of a snapshot point may be recorded in the data log layer by means of an internal event (CheckPoint). Since no IO bypasses (Bypass) the log layer, recording snapshot data by means of an internal event (CheckPoint) ensures the consistency of the snapshot data.
As shown in fig. 4, the data processing device for a file system according to an embodiment of the present invention can implement the above-mentioned data processing method, improve the read-write performance of the file system, and increase the service life of the solid state disk.
The data processing device comprises a storage module, a judging module and a converting module. The storage module is used for receiving and responding to the writing operation request, selecting a log space based on a binary tree index and writing corresponding data; the judging module is used for judging whether preset conditions are met; the conversion module is used for converting the binary tree index into a B+ tree index in response to the preset condition being met, so that the log space is converted into a data space based on the B+ tree index.
In the implementation, the file system receives a write operation request through the storage module and responds to the write operation request, and further writes data corresponding to the write operation request into the idle log space. After writing the data into the corresponding log space, the binary tree may be updated such that the binary tree may record modification information for the file to index the corresponding data using the binary tree. The log space is realized by selecting a certain number of segments (segments) from the solid state disk, and the log space is based on a binary tree index through which the data stored in the log space can be acquired. The binary tree herein is preferably a balanced binary tree (AVL tree). Each node of the binary tree may record multiple file block information, such as 1024 file block information. Here, each file block information records modification information for a file. The file modification information here includes an Offset position (Offset) of the modified file, a length of the modification, a segment number (SegmentID) in which the modification data is stored, and an Offset position (Offset) of the modification data in the segment.
After the data corresponding to the write operation request is stored in the log space, whether the preset condition is met or not is further judged through the judging module, and then the subsequent operation is executed. The preset condition here includes one of that the log space is full and a preset time elapses after the writing operation, however, in other embodiments, the preset condition may be set according to actual requirements. In this embodiment, after storing the data corresponding to the write operation request in the log space, it is further determined whether the log space is full, or whether a preset time has elapsed after the write operation.
After the judging module determines that the preset condition is met, the conversion module converts the binary tree index into the B+ tree index, so that the log space is directly converted into the data space. The data space is realized by selecting a certain number of segments from the solid state disk. The data space is indexed by the B+ tree, through which the data stored in the data space can be acquired. The B+ tree comprises a root node, internal nodes and leaf nodes, and all leaf nodes are at the same level. Except for the root node, each node has at most m child nodes and at least m/2 child nodes, and each node contains at most m-1 key values and at least (m/2)-1 key values, where m is the order of the B+ tree. Leaf nodes of the B+ tree store data records, such as file block information, each of which records modification information for a file. The file modification information includes the offset position (Offset) of the modified file, the length of the modification, the segment number (SegmentID) in which the modified data is stored, and the offset position (Offset) of the modified data in the segment.
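The node limits stated above can be written out for a concrete order. The helper name `node_limits` is hypothetical, and rounding m/2 up for odd m is an assumption based on the usual B-tree convention; the description itself only writes m/2.

```python
import math


def node_limits(m):
    """Child and key-value limits for a non-root node of a B+ tree of order m."""
    if m < 3:
        raise ValueError("order m must be at least 3")
    half = math.ceil(m / 2)  # assumption: m/2 is rounded up when m is odd
    return {
        "max_children": m,
        "min_children": half,
        "max_keys": m - 1,
        "min_keys": half - 1,
    }
```

For example, with m = 4 a non-root node has between 2 and 4 child nodes and holds between 1 and 3 key values.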
Converting the binary tree index into the B+ tree index allows the log space to be converted directly into the data space. This avoids the situation in which data is first written into the log space and then read from the log space and written again into the data space, which reduces the number of writes to the solid state disk and prolongs its service life. It also avoids a series of operations such as reading the persisted data, inserting it into the B+ tree and splitting nodes, which improves the write performance of the file system.
How the conversion module converts the binary tree into the B+ tree is described in detail below with reference to fig. 3, taking as an example the case where each node of the binary tree stores one piece of file block information (e.g., an offset position of a modified file) and each node of the B+ tree stores one piece of file block information (e.g., an offset position of a modified file).
When the binary tree is converted into the B+ tree, the nodes of the binary tree are first sorted. In this embodiment, the nodes may be sorted according to the offset position of the modified file recorded by each node. After the nodes of the binary tree are sorted, they are inserted into the B+ tree, and when the number of key values in a leaf node of the B+ tree exceeds the limit, that node is split.
The figure shows the process of converting a binary tree containing three nodes into a B+ tree. The binary tree records modifications of 3 file blocks, with corresponding file offset positions of 20, 50 and 300. The B+ tree records modifications of 10 file blocks, with corresponding file offset positions of 10, 30, 60, 70, 80, 100, 400, 500, 700 and 800.
When the insert operations are performed, node 20 is inserted first: the leftmost node of the B+ tree is located and node 20 is inserted into it. Nodes 50 and 300 are then inserted in sequence. After each insertion, it is judged whether the number of key values in the current node exceeds the limit, that is, whether it is greater than m-1, where m is the order of the B+ tree. If the number of key values in the current node is less than or equal to m-1, no splitting is performed. If it is greater than m-1, the node is split.
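The sort, insert and split steps can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: the class and function names are hypothetical, only key values (file offset positions) are stored, the order m = 4 used in the usage note is an assumption because the description does not give the order used in fig. 3, and the resulting node layout is therefore not claimed to match the figure.

```python
import bisect


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # used by internal nodes only


class BPlusTree:
    def __init__(self, order):
        self.order = order  # m: a node may hold at most m-1 key values
        self.root = Node(leaf=True)

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:  # the root itself was split: grow the tree
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key):
        if node.leaf:
            pos = bisect.bisect_left(node.keys, key)
            if pos < len(node.keys) and node.keys[pos] == key:
                return None  # key already indexed; ignored in this sketch
            node.keys.insert(pos, key)
        else:
            i = bisect.bisect_right(node.keys, key)
            split = self._insert(node.children[i], key)
            if split is None:
                return None
            sep, right = split
            node.keys.insert(i, sep)
            node.children.insert(i + 1, right)
        if len(node.keys) <= self.order - 1:
            return None  # limit not exceeded: no splitting
        return self._split(node)

    def _split(self, node):
        mid = len(node.keys) // 2
        right = Node(leaf=node.leaf)
        if node.leaf:
            # Leaf split: the separator is copied up and stays in the leaf.
            right.keys = node.keys[mid:]
            node.keys = node.keys[:mid]
            return right.keys[0], right
        # Internal split: the middle key value moves up to the parent.
        sep = node.keys[mid]
        right.keys = node.keys[mid + 1:]
        right.children = node.children[mid + 1:]
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return sep, right

    def leaf_keys(self):
        out = []

        def walk(n):
            if n.leaf:
                out.extend(n.keys)
            else:
                for child in n.children:
                    walk(child)

        walk(self.root)
        return out


def convert_binary_tree(binary_tree_keys, tree):
    """Sort the binary tree's key values, then insert them one by one."""
    for key in sorted(binary_tree_keys):
        tree.insert(key)
    return tree
```

As a usage example, a `BPlusTree(order=4)` can first be filled with the offsets 10, 30, 60, 70, 80, 100, 400, 500, 700 and 800, after which `convert_binary_tree([20, 50, 300], tree)` inserts 20, 50 and 300 in that order and `tree.leaf_keys()` returns all thirteen offsets in ascending order.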
After the binary tree is converted into the B+ tree, the data in the corresponding log space can be indexed directly by the B+ tree. That is, the log space previously indexed by the binary tree is used as data space of the B+ tree, which realizes the conversion from log space to data space. After the binary tree is converted into the B+ tree, the nodes in the binary tree can be deleted.
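The effect of this step, in which only index entries and segment bookkeeping change while the stored data stays in place, can be sketched as follows. Plain dictionaries and sets stand in for the binary tree, the B+ tree and the segment records; all names are hypothetical.

```python
def convert_log_to_data(log_index, data_index, log_segments, data_segments):
    """Re-index log space as data space without moving any data.

    log_index / data_index: dict mapping file offset -> (segment_id, segment_offset),
        standing in for the binary tree and the B+ tree respectively.
    log_segments / data_segments: sets of segment numbers.
    """
    # Entries are transferred in ascending file-offset order.
    for file_offset in sorted(log_index):
        data_index[file_offset] = log_index[file_offset]
    # The segments that held log data are now counted as data space.
    data_segments.update(log_segments)
    log_segments.clear()
    # The nodes of the binary tree are deleted after the conversion.
    log_index.clear()
```

Because each entry keeps the same segment number and offset, the data written to the solid state disk is not rewritten during the conversion.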
An embodiment of the present invention discloses an electronic device, which may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile electronic devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handsets, messaging devices, wearable electronic devices, consumer electronic devices, and the like. The electronic device can implement the data processing method of the file system described above, improving the read-write performance of the file system and prolonging the service life of the solid state disk. In particular, the electronic device comprises at least one memory, at least one processor, and a computer program, the at least one memory being coupled to the at least one processor, wherein the computer program is stored in the memory and is executable by the processor; for example, the computer program may be a data processing program. In practice, the processor implements the steps of the above method when executing the computer program, such as the step of converting a binary tree index into a B+ tree index in response to a preset condition being met, the step of converting a log space into a data space based on the B+ tree index, and so on.
The computer program herein may be divided into one or more units, which are stored in the memory and executed by the processor to accomplish the present invention. The one or more units may be a series of computer program instruction segments capable of performing specified functions, and the instruction segments are used to describe the execution of the computer program in the electronic device.
It should be noted that the electronic device is not limited to the memory, the processor and the computer program described above, and may also include other components, such as an input device (e.g., a keyboard) for inputting instructions, a display screen for displaying negotiation results, a communication interface, etc., which communicate with each other via a bus.
The invention also discloses a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the data processing method of the file system described above is implemented, which improves the read-write performance of the file system and prolongs the service life of the solid state disk. The computer program comprises computer program code, which may be in source code form, in executable file form or in some intermediate form. The computer-readable medium may comprise any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), etc.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention are presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application to thereby enable one skilled in the art to make and utilize the invention in various exemplary embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (12)

1. A data processing method of a file system, the data processing method comprising:
receiving and responding to a write operation request, selecting a log space based on a binary tree index and writing the corresponding data;
judging whether a preset condition is met;
in response to the preset condition being met, converting the binary tree index into a B+ tree index for indexing the data space, so as to convert the log space into the data space.
2. The data processing method of claim 1, wherein each node of the binary tree stores file block information for recording file modification information including an offset position of a modified file, a modified length, a segment number in which modified data is stored, and an offset position of modified data in a segment.
3. The data processing method of claim 1, wherein the preset condition is one of: the log space being full, and a preset time having elapsed after a write operation.
4. The data processing method of claim 1, wherein the binary tree is an AVL tree.
5. The data processing method of claim 2, wherein converting the binary tree index to a B+ tree index comprises:
inserting a node of the binary tree into a node of the B+ tree;
judging whether the number of key values in the current node of the B+ tree exceeds a number limit;
in response to the number limit being exceeded, splitting the current node.
6. The data processing method of claim 5, wherein converting the binary tree index to a B+ tree index further comprises:
ordering the nodes of the binary tree before inserting the node of the binary tree into the node of the B+ tree.
7. The data processing method of claim 6, wherein each node of the binary tree is ordered by an offset position of the modified file.
8. The data processing method according to claim 1, wherein the data processing method further comprises:
receiving and responding to a read operation request, and searching the corresponding log space through the binary tree;
judging whether the corresponding data exists in the log space;
in response to the log space not having the data, looking up the data space through the B+ tree to retrieve the data.
9. The data processing method according to claim 1, wherein the data processing method further comprises:
recording the segments used as log space;
purging the record after converting the log space to the data space.
10. A data processing apparatus of a file system, the data processing apparatus comprising:
a storage module, used for receiving and responding to a write operation request, selecting a log space based on a binary tree index and writing the corresponding data;
a judging module, used for judging whether a preset condition is met;
and a conversion module, used for converting the binary tree index into a B+ tree index for indexing the data space in response to the preset condition being met, so as to convert the log space into the data space.
11. An electronic device, the electronic device comprising:
at least one processor; and
at least one memory coupled to the at least one processor and storing a computer program for execution by the at least one processor, the computer program, when executed by the at least one processor, causing the electronic device to perform the method of any one of claims 1 to 9.
12. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a machine, implements the method of any of claims 1 to 9.
CN202310599874.2A 2023-05-25 2023-05-25 Data processing method and device of file system, electronic equipment and storage medium Pending CN116610636A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310599874.2A CN116610636A (en) 2023-05-25 2023-05-25 Data processing method and device of file system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310599874.2A CN116610636A (en) 2023-05-25 2023-05-25 Data processing method and device of file system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116610636A true CN116610636A (en) 2023-08-18

Family

ID=87685074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310599874.2A Pending CN116610636A (en) 2023-05-25 2023-05-25 Data processing method and device of file system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116610636A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118210760A (en) * 2024-05-20 2024-06-18 四川大学 Backup IO log indexing method, system and storage medium based on B tree
CN118210760B (en) * 2024-05-20 2024-07-19 四川大学 Backup IO log indexing method, system and storage medium based on B tree

Similar Documents

Publication Publication Date Title
CN108319654B (en) Computing system, cold and hot data separation method and device, and computer readable storage medium
US8965850B2 (en) Method of and system for merging, storing and retrieving incremental backup data
US8108446B1 (en) Methods and systems for managing deduplicated data using unilateral referencing
US20170123676A1 (en) Reference Block Aggregating into a Reference Set for Deduplication in Memory Management
US10303363B2 (en) System and method for data storage using log-structured merge trees
US11580162B2 (en) Key value append
US7636736B1 (en) Method and apparatus for creating and using a policy-based access/change log
US11221999B2 (en) Database key compression
US20170123678A1 (en) Garbage Collection for Reference Sets in Flash Storage Systems
US20170123689A1 (en) Pipelined Reference Set Construction and Use in Memory Management
CN112182010B (en) Dirty page refreshing method and device, storage medium and electronic equipment
US20170123677A1 (en) Integration of Reference Sets with Segment Flash Management
CN116610636A (en) Data processing method and device of file system, electronic equipment and storage medium
EP3343395B1 (en) Data storage method and apparatus for mobile terminal
CN114020193B (en) Page crossing hook determination method and device, electronic equipment and storage medium
CN114942863A (en) Cascade snapshot processing method, device and equipment and storage medium
CN105808451B (en) Data caching method and related device
CN111831691A (en) Data reading and writing method and device, electronic equipment and storage medium
US20170371563A1 (en) Method for retrieving data from a tape drive
CN106156038B (en) Date storage method and device
US9111015B1 (en) System and method for generating a point-in-time copy of a subset of a collectively-managed set of data items
US9646014B1 (en) Systems and methods for selective defragmentation
CN115328696A (en) Data backup method in database
CN111857604A (en) Method, apparatus, device and medium for quickly reconstructing packet management mapping reverse lookup table
CN111625500A (en) File snapshot method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination