CN116708420B - Method, device, equipment and medium for data transmission

Info

Publication number
CN116708420B
Authority
CN
China
Prior art keywords
database
difference information
information
file
metadata
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310934180.XA
Other languages
Chinese (zh)
Other versions
CN116708420A (en)
Inventor
李尔康
刘昌鑫
程林
边国伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Netapp Technology Ltd
Original Assignee
Lenovo Netapp Technology Ltd
Application filed by Lenovo Netapp Technology Ltd
Priority to CN202310934180.XA
Publication of CN116708420A
Application granted
Publication of CN116708420B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Abstract

The present disclosure provides a method, apparatus, device and medium for data transmission. The method may include: acquiring metadata difference information, wherein the metadata difference information is generated based on a data update; storing the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and transmitting the database file to a second device. Because the acquired metadata difference information is stored as a database file whose data volume is smaller than that of the metadata difference information itself, and the database file is then sent to one or more devices in the slave cluster, the method can send all of the metadata difference information with a lower data volume. Its advantages include high transmission efficiency, savings in network resources, no adverse effect on data synchronization efficiency, and convenient retrieval and processing at the slave cluster.

Description

Method, device, equipment and medium for data transmission
Technical Field
The present disclosure relates to the field of data transmission, and more particularly, to a method, apparatus, device, and medium for data transmission.
Background
With the advance of digitization, data has gradually become the operational core of every industry, and users increasingly demand stability from the distributed storage systems that store massive amounts of data.
To ensure business continuity, data backup, and disaster recovery, a distributed storage system needs to support a remote replication feature. Remote replication can be divided into synchronous remote replication and asynchronous remote replication. Synchronous remote replication places relatively high requirements on the distance between the physical devices of the master and slave clusters and on network bandwidth, so asynchronous remote replication is commonly adopted in the market; that is, service data of the master cluster storage system is periodically synchronized to the slave cluster storage system, so that if the master cluster suffers unrecoverable damage due to uncontrollable factors (such as natural disasters), the slave cluster can still support the service.
To support the remote replication feature, mainstream technology on the market uses snapshots: it creates a snapshot at the start time and another at the end time of one period on the master cluster, compares the two snapshots to obtain the metadata differences generated within the period, and synchronizes the metadata differences to the slave cluster for storage.
However, the conventional approach has a complicated flow for transmitting the metadata differences: it sends them to the slave cluster piece by piece or in batches, which is inefficient, wastes network resources, and lowers data synchronization efficiency. Because the metadata differences arrive piece by piece or in batches, they are also inconvenient to retrieve and process when the slave cluster applies them.
Therefore, a method for data transmission is urgently needed to solve the above-described problems.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method for data transmission. The method provided by the disclosure can store the acquired metadata difference information as the database file which can be stored in the database, wherein the data volume of the database file is smaller than that of the metadata difference information, and then the database file is sent to one or more devices in the slave cluster, so that the method provided by the disclosure can send all metadata difference information with lower data volume.
Embodiments of the present disclosure provide a method for data transmission, the method being performed by a first device and comprising: acquiring metadata difference information, wherein the metadata difference information is generated based on a data update; storing the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and transmitting the database file to a second device.
According to an embodiment of the present disclosure, the obtaining metadata difference information includes: obtaining snapshot information of a first file system storing the data from at least one metadata service; and acquiring the metadata difference information based on the snapshot information.
According to an embodiment of the present disclosure, the method further comprises: creating a second file system for storing metadata difference information; the first database is created based on the created second file system.
According to an embodiment of the present disclosure, the transmitting the database file to the second device includes: acquiring directory structure information and database files of the first database in the second file system; transmitting the directory structure information to a second device; and transmitting the database file to the second device after transmitting the directory structure information.
According to an embodiment of the present disclosure, the method further comprises: creating a second database based on the created second file system, wherein the second database is used for storing interruption information of the metadata difference information; and when an interruption occurs in the process of acquiring the metadata difference information, storing the interruption information of the metadata difference information in the second database.
According to an embodiment of the present disclosure, the transmitting the database file to the second device includes: transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
The disclosed embodiments provide a method for data transmission, the method performed by a second device, and the method comprising: receiving directory structure information of a first database in a pre-created file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update; creating a third database based on the directory structure information; receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; the metadata difference information is stored in the third database based on the database file.
An embodiment of the present disclosure provides an apparatus for data transmission, including: an acquisition unit configured to acquire metadata difference information, wherein the metadata difference information is generated based on a data update; a first storage unit configured to store the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and a transmission unit configured to transmit the database file to a second device.
According to an embodiment of the disclosure, the acquiring unit is configured to: obtaining snapshot information of a first file system storing the data from at least one metadata service; and acquiring the metadata difference information based on the snapshot information.
According to an embodiment of the present disclosure, the apparatus further includes: a first creation unit configured to: creating a second file system for storing metadata difference information; the first database is created based on the created second file system.
According to an embodiment of the disclosure, the transmission unit is configured to: acquiring directory structure information and database files of the first database in the second file system; transmitting the directory structure information to a second device; and transmitting the database file to the second device after transmitting the directory structure information.
According to an embodiment of the present disclosure, the apparatus further includes: a second creation unit configured to create a second database based on the created second file system, wherein the second database is used for storing interruption information of the metadata difference information; and a second storage unit configured to store, when an interruption occurs in the process of acquiring the metadata difference information, the interruption information of the metadata difference information in the second database.
According to an embodiment of the present disclosure, the transmission unit is configured to: transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
The embodiment of the disclosure provides a device for data transmission, which comprises: a first receiving unit configured to receive directory structure information of a first database in a file system created in advance, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update; a creation unit configured to create a third database based on the directory structure information; a second receiving unit configured to receive a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; a storage unit configured to store the metadata difference information in the third database based on the database file.
An embodiment of the present disclosure provides a device for data transmission, comprising: means for acquiring metadata difference information, wherein the metadata difference information is generated based on a data update; means for storing the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and means for transmitting the database file to a second device.
The embodiment of the disclosure provides a device for data transmission, comprising: means for receiving directory structure information of a first database in a pre-created file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on a data update; means for creating a third database based on the directory structure information; means for receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; and means for storing the metadata difference information in the third database based on the database file.
The embodiment of the disclosure provides a device for data transmission, comprising: a processor, and a memory storing computer executable instructions that, when executed by the processor, cause the processor to perform the method as described above.
The disclosed embodiments provide a computer-readable recording medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, cause the processor to perform the method as described above.
The present disclosure provides a method, apparatus, device, and medium for data transmission. The method stores the acquired metadata difference information as a database file that a database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information, and then sends the database file to one or more devices in the slave cluster, so that all of the metadata difference information can be sent with a lower data volume. In addition, the method supports breakpoint resume, so that compared with the prior art it is more efficient and convenient and has broader application prospects.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required to be used in the description of the embodiments will be briefly described below. It should be apparent that the drawings in the following description are only some exemplary embodiments of the present disclosure, and that other drawings may be obtained from these drawings by those of ordinary skill in the art without undue effort.
FIG. 1 illustrates a directory entry diagram of a file system according to an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a method 200 for data transmission according to an embodiment of the present disclosure;
FIG. 3 illustrates a flow chart of a method 300 for data transmission according to an embodiment of the present disclosure;
FIG. 4 illustrates a schematic diagram of the acquisition and storage of metadata difference information performed by a first device in accordance with an embodiment of the present disclosure;
FIG. 5 illustrates a schematic diagram of transmitting metadata difference information by a first device to a second device in accordance with an embodiment of the present disclosure;
FIG. 6 shows a block diagram of an apparatus 600 for data transmission according to an embodiment of the present disclosure;
FIG. 7 shows a block diagram of an apparatus 700 for data transmission according to an embodiment of the present disclosure;
FIG. 8 is a block diagram illustrating an apparatus 800 for data transmission according to an embodiment of the present disclosure;
FIG. 9 shows a schematic diagram of a recording medium 900 according to the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by one of ordinary skill in the art, based on the described embodiments and without inventive effort, fall within the scope of protection of the present disclosure.
Unless defined otherwise, technical or scientific terms used in this disclosure have the ordinary meaning understood by one of ordinary skill in the art to which this disclosure belongs. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but are used only to distinguish one element from another. Likewise, the terms "a," "an," "the," and similar terms do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising," "comprises," or the like means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected," "coupled," and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," and the like are used merely to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
In the prior art, metadata difference information is sent from a master cluster to a slave cluster either piece by piece or in batches. This transmission mode suffers from low transmission efficiency, wasted network resources, reduced data synchronization efficiency, and inconvenient retrieval at the slave cluster.
In order to solve the above-described problems, the present disclosure provides a method for data transmission. The method provided by the disclosure can store the acquired metadata difference information as the database file which can be stored in the database, wherein the data volume of the database file is smaller than that of the metadata difference information, and then the database file is sent to one or more devices in the slave cluster, so that the method provided by the disclosure can send all metadata difference information with lower data volume.
The method for data transmission provided in the present disclosure will be described in detail with reference to the accompanying drawings.
Before describing the method provided in this disclosure, information related to the directory entries of the file systems running in the master cluster and the slave cluster is introduced with a simple example, to make the method easier to follow.
FIG. 1 illustrates a directory entry diagram of a file system according to an embodiment of the present disclosure.
According to embodiments of the present disclosure, a file system generally includes data and metadata.
As an example, the metadata may be data (data about data) for describing the data. Metadata may be information that primarily describes data attributes to support functions such as indicating storage locations, historical data, resource lookups, file records, and the like.
As an example, metadata of a file system may include directory entries and index nodes (inodes). Directory entries describe how files and folders are organized in the file system, while an inode holds the attribute information of a file or folder. In a file system, each file or folder corresponds to one inode, whose inode number serves as the unique number of that file or folder. The file system accesses the information of an inode through its inode number.
For example, the directory entries in the metadata record information such as file names and superior directories, and all the directory entries are connected in a parent directory/child directory manner, so that a tree structure reflecting file system organization information can be formed. For example, in some examples, as shown in fig. 1, a tree structure reflecting organization information of the file system may be obtained by using directory entries of the file system, and then information of corresponding index nodes may be obtained by using index node numbers of files or folders dir1, dir2, dir3, file1, file2, so that complete information of the file system may be read. For example, in this example, dir1, dir2, dir3 represent folders, and file1, file2 represent files.
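The relationship between directory entries and inodes described above can be sketched in a few lines of Python. This is only an illustrative model: the `Dentry` and `Inode` classes, the root inode number, and the nesting of dir1, dir2, dir3, file1 and file2 are assumptions made for the example and are not taken from FIG. 1 or from any particular file system.

```python
from dataclasses import dataclass

ROOT_INO = 1  # hypothetical inode number of the root directory


@dataclass(frozen=True)
class Dentry:
    """Directory entry: a name, its parent directory, and its inode number."""
    name: str
    parent_ino: int
    ino: int


@dataclass(frozen=True)
class Inode:
    """Attribute information of one file or folder."""
    ino: int
    is_dir: bool
    size: int = 0


def build_paths(dentries):
    """Return {ino: absolute path} by following parent links up to the root."""
    by_ino = {d.ino: d for d in dentries}

    def path_of(ino):
        if ino == ROOT_INO:
            return ""
        entry = by_ino[ino]
        return path_of(entry.parent_ino) + "/" + entry.name

    return {ino: path_of(ino) for ino in by_ino}


# Illustrative layout only; the exact nesting in FIG. 1 is not reproduced.
dentries = [
    Dentry("dir1", ROOT_INO, 2),
    Dentry("dir2", 2, 3),
    Dentry("dir3", 2, 4),
    Dentry("file1", 3, 5),
    Dentry("file2", 4, 6),
]
inodes = {i.ino: i for i in [
    Inode(2, True), Inode(3, True), Inode(4, True),
    Inode(5, False, 120), Inode(6, False, 64),
]}

paths = build_paths(dentries)
for ino, path in sorted(paths.items(), key=lambda item: item[1]):
    kind = "dir " if inodes[ino].is_dir else "file"
    print(kind, path, "ino=%d" % ino)
```

The directory entries alone yield the tree of paths; the inode number attached to each entry is then used to look up the attributes, which mirrors the two-step read described in the text.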
According to embodiments of the present disclosure, metadata of a file system provided by the present disclosure may include a plurality of directory entry entries and a plurality of inode entries. For example, each directory entry entry is a directory entry as described above, and each inode entry is an index node (inode) as described above. Directory entry entries and inode entries are typically recorded in key-value (KV) format.
The plurality of directory entry entries may include at least one native directory entry entry and at least one snapshot directory entry entry; that is, the directory entry entries fall into two categories, native and snapshot. A native directory entry entry records the current organization information of the file system. A snapshot directory entry entry records organization information that existed when the snapshot was generated and that may no longer exist in the file system.
The plurality of inode entries may include at least one native inode entry and at least one snapshot inode entry; that is, the inode entries fall into two categories, native and snapshot. A native inode entry records the current file attribute information of the file system. A snapshot inode entry records file attribute information that existed when the snapshot was generated and that may no longer exist in the file system.
The method for data transmission provided by the present disclosure is focused on snapshot information of a file system, and the method for data transmission provided by the present disclosure will be described in detail based thereon.
FIG. 2 illustrates a flow chart of a method 200 for data transmission according to an embodiment of the present disclosure. As an example, the method 200 may be performed by a first device, e.g., a device running a master service in the master cluster. The file system comprising data and metadata described above in connection with FIG. 1 runs on this first device.
Referring to fig. 2, metadata difference information may be acquired at step S210, wherein the metadata difference information may be generated based on data update.
As an example, when data in the file system is added, modified, or deleted, corresponding metadata difference information is generated. For example, when data in the file system is deleted, the metadata corresponding to that data is also deleted, and the metadata difference information generated at this time includes information about the deleted metadata. The metadata difference information may take the form of KV key-value pairs, for example, key-value pairs carrying a deletion identifier, a modification identifier, or an addition identifier.
As an example, the metadata difference information may be acquired in advance according to actual conditions, or may be acquired in real time.
According to an embodiment of the present disclosure, the acquiring metadata difference information may include: obtaining snapshot information of a first file system storing the data from at least one metadata service; and acquiring the metadata difference information based on the snapshot information.
As an example, multiple metadata services may be distributed in the master cluster. Each of the metadata services may store snapshot information of a portion of the first file system (i.e., the file system described above in connection with FIG. 1), which may include the snapshot directory entry entries and snapshot inode entries described in connection with FIG. 1. For example, snapshots may be created at the start time and at the end time of one period of the master cluster and the slave cluster (the period being defined in advance according to the actual situation), such as snapshot 1 corresponding to the start time and snapshot 2 corresponding to the end time, and the metadata difference information may then be acquired by comparing snapshot 1 with snapshot 2. For example, if data in the first file system is deleted during the period, the data still exists in snapshot 1 but no longer exists in snapshot 2, and the acquired metadata difference information may include information related to the metadata of the deleted data.
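The snapshot comparison can be illustrated with a minimal sketch that treats each snapshot as a plain key-value map and emits one record per added, modified, or deleted key. The key layout (`inode/5`), the value fields, and the `ADD`/`MOD`/`DEL` identifiers are hypothetical choices for this example; the disclosure does not specify them.

```python
def diff_snapshots(snap1, snap2):
    """Compare two snapshot KV maps and return a list of (op, key, value).

    snap1 is the snapshot at the start of the period, snap2 the one at the end.
    """
    records = []
    for key in sorted(snap1.keys() | snap2.keys()):
        if key not in snap1:
            records.append(("ADD", key, snap2[key]))   # created during the period
        elif key not in snap2:
            records.append(("DEL", key, snap1[key]))   # deleted during the period
        elif snap1[key] != snap2[key]:
            records.append(("MOD", key, snap2[key]))   # attributes changed
    return records


# Hypothetical snapshot contents, keyed by an assumed "inode/<number>" layout.
snapshot1 = {"inode/5": {"size": 120}, "inode/6": {"size": 64}}
snapshot2 = {"inode/5": {"size": 200}, "inode/7": {"size": 10}}

diff = diff_snapshots(snapshot1, snapshot2)
for record in diff:
    print(record)
```

Here `inode/6` exists only in snapshot 1, so it appears as a deletion record, which corresponds to the deleted-data case described in the text.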
It should be noted that, as is well known to those skilled in the art, a snapshot refers to a completely available copy of a specified data set, which includes an image of the corresponding data at a certain point in time (the point in time when the copy begins). The snapshot has the main function of being capable of carrying out data backup and recovery. When the storage device has application faults or file damage, the quick data recovery can be performed, and the data is recovered to a state of a certain available time point, so that the requirements of enterprises on service continuity and data reliability can be met.
With continued reference to FIG. 2, at step S220, the metadata difference information may be stored as a database file that the first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information.
As an example, the first database may be any suitable database, such as a high-performance persistent key-value (KV) database, for example a RocksDB-based database. When the metadata difference information acquired in step S210 is stored in the database, the resulting database file is generally much smaller than the metadata difference information itself, because the database compresses the metadata difference information when storing it. Transmitting the database file is therefore faster than transmitting the metadata difference information directly, as in the conventional approach, and saves network resources.
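The size reduction comes from compression applied when the records are persisted. The sketch below does not use RocksDB; as a stand-in, it serializes hypothetical difference records as JSON lines and compresses them with Python's standard `zlib` module, purely to show that a compressed on-disk form of repetitive KV records is smaller than the raw records and can be restored losslessly. The record fields are assumptions made for the example.

```python
import json
import zlib


def pack_records(records):
    """Serialize records as JSON lines and return (raw bytes, compressed bytes)."""
    raw = "\n".join(json.dumps(r, sort_keys=True) for r in records).encode("utf-8")
    return raw, zlib.compress(raw, 6)


def unpack_records(blob):
    """Inverse of pack_records for the compressed form."""
    text = zlib.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


# Hypothetical difference records; real metadata would carry more attributes.
records = [
    {"op": "MOD", "key": "inode/%06d" % i, "value": {"size": 4096, "mode": "0644"}}
    for i in range(1000)
]

raw, packed = pack_records(records)
print("raw bytes:", len(raw), "compressed bytes:", len(packed))
```

How much is saved in practice depends on the database's compression settings and on how repetitive the metadata is; the figures printed here apply only to this synthetic input.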
In step S230, the database file may be transmitted to the second device.
As an example, the second device may be any device in the slave cluster, e.g. a device in the slave cluster running a master service.
As an example, the database file may be transmitted to the second device by any wired or wireless means according to actual needs.
According to an embodiment of the present disclosure, the method 200 may further include creating a second file system for storing metadata difference information; the first database is created based on the created second file system.
By way of example, the second file system may be any suitable lightweight file system that can interact with the database described above, such as a BlueFS-based file system. The second file system may be running on top of the first file system. The first database may be created on top of the second file system and then metadata difference information may be stored in the first database.
In the above case, the transmitting the database file to the second device may include: acquiring directory structure information and database files of the first database in the second file system; transmitting the directory structure information to a second device; and transmitting the database file to the second device after transmitting the directory structure information.
As an example, the first database may be stored in the second file system as files of various forms, including directory structure information of the first database (such as its directories, configuration files, and installation directory files) and the database files that hold the data actually stored in the database (e.g., the metadata difference information described above). After the directory structure information is transmitted to the second device, the second device may build, based on the directory structure information, a database that is the same as the first database. The first device then transmits the database file, and the second device may store the metadata difference information contained in the database file in the database it has built, so that the second device obtains the metadata difference information.
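The two-phase order described above (directory structure first, database files second) can be sketched with two local directories standing in for the first and second devices. The helper names, the file names `000001.sst` and `wal/000002.log`, and the use of a local file copy in place of a network transfer are all assumptions made for illustration.

```python
import os
import shutil
import tempfile


def list_structure(db_root):
    """Return (relative directories, relative file paths) found under db_root."""
    dirs, files = [], []
    for current, subdirs, names in os.walk(db_root):
        rel = os.path.relpath(current, db_root)
        for d in subdirs:
            dirs.append(os.path.normpath(os.path.join(rel, d)))
        for n in names:
            files.append(os.path.normpath(os.path.join(rel, n)))
    return sorted(dirs), sorted(files)


def receive_structure(dst_root, dirs):
    """Phase 1 on the receiver: recreate the directory layout."""
    for d in dirs:
        os.makedirs(os.path.join(dst_root, d), exist_ok=True)


def receive_file(src_root, dst_root, rel_path):
    """Phase 2 on the receiver: place one database file into the prepared layout."""
    shutil.copyfile(os.path.join(src_root, rel_path),
                    os.path.join(dst_root, rel_path))


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "first_device_db")
    dst = os.path.join(tmp, "second_device_db")
    os.makedirs(os.path.join(src, "wal"))
    os.makedirs(dst)
    with open(os.path.join(src, "000001.sst"), "wb") as f:
        f.write(b"diff-records")
    with open(os.path.join(src, "wal", "000002.log"), "wb") as f:
        f.write(b"log")

    dirs, files = list_structure(src)   # gathered on the first device
    receive_structure(dst, dirs)        # directory structure is sent first
    for rel in files:                   # database files follow
        receive_file(src, dst, rel)
    copied = list_structure(dst)
```

Sending the layout first means every file arrives into a directory that already exists, so the receiver never has to infer structure from file paths.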
According to an embodiment of the present disclosure, the method 200 may further include: creating a second database based on the created second file system, wherein the second database is used for storing interruption information of the metadata difference information; and when an interruption occurs in the process of acquiring the metadata difference information, storing the interruption information of the metadata difference information in the second database.
As an example, the second database may be any suitable database, such as a high-performance persistent key-value (KV) database. An interruption may have various causes: for example, a temporary hardware or software fault on the first device, or a higher-priority service being invoked so that handling of the metadata difference information must be paused. When an interruption occurs in the process of acquiring the metadata difference information, the interruption information is stored in the second database. The interruption information may include whatever is needed to resume later, such as whether the metadata difference information had been fully processed when the interruption occurred, an identifier of the interruption point (such as an offset relative to the start of the metadata difference information), and the metadata service on which the interruption occurred. The method provided by the disclosure therefore supports breakpoint resume, which broadens its application prospects.
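A minimal sketch of recording and reusing interruption information follows. An in-memory SQLite table stands in for the second database, and the key format, the `offset` and `finished` fields, the service name `mds-0`, and the simulated failure are all hypothetical; the disclosure only states that this kind of information may be stored.

```python
import json
import sqlite3


def open_checkpoint_db(path=":memory:"):
    """Open a small KV table standing in for the second database."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    return db


def save_interruption(db, service_id, offset, finished):
    """Persist where acquisition stopped for one metadata service."""
    info = json.dumps({"offset": offset, "finished": finished})
    db.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
               ("interrupt/" + service_id, info))
    db.commit()


def load_interruption(db, service_id):
    row = db.execute("SELECT v FROM kv WHERE k = ?",
                     ("interrupt/" + service_id,)).fetchone()
    return json.loads(row[0]) if row else None


def collect_diff(records, db, service_id, fail_at=None):
    """Collect records from the saved offset; simulate an interruption at fail_at."""
    state = load_interruption(db, service_id) or {"offset": 0, "finished": False}
    collected = []
    for i in range(state["offset"], len(records)):
        if fail_at is not None and i == fail_at:
            save_interruption(db, service_id, i, False)  # remember the breakpoint
            return collected
        collected.append(records[i])
    save_interruption(db, service_id, len(records), True)
    return collected


db = open_checkpoint_db()
records = ["r0", "r1", "r2", "r3", "r4"]
first = collect_diff(records, db, "mds-0", fail_at=3)  # interrupted before r3
second = collect_diff(records, db, "mds-0")            # resumes at the saved offset
print(first, second, load_interruption(db, "mds-0"))
```

Keying the record by metadata service lets each service resume independently, which matches the text's mention of recording on which metadata service the interruption occurred.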
According to an embodiment of the present disclosure, the transmitting the database file to the second device may include: transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
As an example, for practical reasons, the period of asynchronous remote replication between the master and slave clusters may be days or even months. In that case, the database file stored in step S220 is often relatively large, and transmitting it all at once can significantly disrupt the services running on the devices of the master cluster or the slave cluster. To address this, the method provided by the present disclosure further provides a breakpoint resume mechanism: a large database file may be divided in advance into several segments (e.g., of a fixed size of 4 MB), and then only one segment (i.e., a portion of the database file) is transmitted to the second device at a time, with information related to the transmission of that segment stored in the second database. This information may include whatever is needed to continue with the remaining portion of the database file, such as the database file name, the portion that has already been transmitted, and an identifier of whether the transmission has been completed. Under the breakpoint resume mechanism, only a portion of the database file is transmitted at a time, until the entire database file has been transmitted. This makes the method provided by the present disclosure more efficient and convenient than existing methods.
The method for data transmission performed by the first device has been described above in connection with FIGS. 1-2. The method provided by the present disclosure can transmit all of the metadata difference information with a lower data volume and supports breakpoint resume. Compared with the prior art, the method offers high transmission efficiency, savings in network resources, no adverse effect on data synchronization efficiency, convenient retrieval and processing at the slave cluster, and broad application prospects.
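The segment-by-segment transfer with recorded progress can be sketched as follows. In-memory byte streams stand in for the database file and the network connection, a plain dictionary stands in for the second database, and the function name, the `sent`/`done` fields, and the `max_segments` parameter (used here to simulate an interruption) are assumptions made for the example.

```python
import io

SEGMENT_SIZE = 4 * 1024 * 1024  # fixed segment size; 4 MB as in the example above


def send_segments(source, sink, progress, name,
                  segment_size=SEGMENT_SIZE, max_segments=None):
    """Send segments of `source` to `sink`, resuming from progress[name].

    `progress` records, per file name, how many bytes were sent and whether
    the transfer finished. `max_segments` limits one call, which lets the
    example simulate a transfer that stops partway.
    """
    state = progress.setdefault(name, {"sent": 0, "done": False})
    source.seek(state["sent"])          # skip what was already transmitted
    sent_now = 0
    while max_segments is None or sent_now < max_segments:
        segment = source.read(segment_size)
        if not segment:                 # nothing left: mark the file complete
            state["done"] = True
            break
        sink.write(segment)
        state["sent"] += len(segment)   # record progress after each segment
        sent_now += 1
    return state


payload = bytes(range(256)) * 40        # 10240 bytes of sample data
source = io.BytesIO(payload)
sink = io.BytesIO()
progress = {}

# First attempt stops after two small segments (4096 bytes each for the demo).
send_segments(source, sink, progress, "000001.sst",
              segment_size=4096, max_segments=2)
after_first = dict(progress["000001.sst"])

# A later call resumes from the recorded offset and finishes the file.
send_segments(source, sink, progress, "000001.sst", segment_size=4096)
print(after_first, progress["000001.sst"])
```

Because progress is updated after every segment, a restart repeats at most the segment that was in flight rather than the whole file.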
The present disclosure provides methods for data transmission performed by a second device in addition to the methods performed by a first device described above. This will be described in detail with reference to fig. 3.
Fig. 3 illustrates a flow chart of a method 300 for data transmission according to an embodiment of the present disclosure. As an example, the method 300 may be performed by a second device, e.g., by a device running a master service in a slave cluster. On this second device a file system comprising data and metadata as described above in connection with fig. 1 is run.
Referring to fig. 3, directory structure information of a first database in a pre-created file system may be received at step S310, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update.
As an example, the first database may be the first database described in connection with fig. 2. The pre-created file system may be the second file system described in connection with fig. 2. The directory structure information may be the directory structure information described in connection with fig. 2. The first device may be the first device described in connection with fig. 2. The metadata difference information may be the metadata difference information described in connection with fig. 2. Their details are not described here in detail.
In step S320, a third database may be created based on the directory structure information.
As an example, the third database may be any suitable database, such as a high performance Key Value (KV) persistence based database, such as a Rocksdb based database.
The third database created based on the received directory structure information is consistent with the first database running on the first device. This allows the master cluster and the slave cluster to have the same database environment.
In step S330, a database file may be received from the first device, wherein the database file is obtained by storing the metadata difference information in a first database. In step S340, the metadata difference information may be stored in the third database based on the database file.
As an example, the third database created in the second device may parse out the metadata difference information stored in the database file, which may then be stored in the third database of the second device. Finally, the second device can apply the metadata difference information in the second device, so that the metadata of the first device and the metadata of the second device are consistent. At this time, the data corresponding to the metadata can be kept consistent between the first device and the second device, thereby enabling asynchronous remote copying.
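How the second device might apply the parsed differences can be illustrated with a short sketch. It assumes each difference is a tuple `(op, key, value)` whose `op` is one of the hypothetical identifiers `"add"`, `"modify"`, or `"delete"`, and it models the metadata as a plain dict; the disclosure does not specify this encoding.

```python
def apply_diff(metadata: dict, diff: list) -> dict:
    """Return a copy of `metadata` with every difference applied in order."""
    result = dict(metadata)
    for op, key, value in diff:
        if op == "delete":
            result.pop(key, None)  # deleting a missing key is a no-op
        elif op in ("add", "modify"):
            result[key] = value
        else:
            raise ValueError(f"unknown difference identifier: {op!r}")
    return result


slave_metadata = {"/a": {"size": 1}, "/b": {"size": 2}}
diff = [
    ("modify", "/a", {"size": 10}),
    ("delete", "/b", None),
    ("add", "/c", {"size": 3}),
]
synced = apply_diff(slave_metadata, diff)
```

After the call, `synced` plays the role of the second device's metadata once it matches the first device's.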
The method for data transmission provided by the present disclosure is described in detail above in connection with fig. 1-3. As can be seen from the foregoing description, the present disclosure provides a method for transmitting metadata differences between remote-copy snapshots, based on metadata differences (such as snapshot differences) and databases, so as to meet the needs of a distributed file storage system for saving and conveniently transmitting metadata differences during remote copy.
Compared with traditional transmission methods, the method provided by the present disclosure stores the acquired metadata difference information uniformly in one database (such as a Rocksdb-based database). It then relies on the way the database is laid out as files on a lightweight file system (such as the optional BlueFS-based file system) to read the database from that file system as files, and transmits the files to a file system of the slave cluster (such as a BlueFS-based file system created on the slave cluster). The slave cluster can open the files as a database and read the metadata differences from it as KV pairs, so as to apply them to the distributed file system of the slave cluster.
The method provided by the present disclosure transmits the database files directly because the metadata differences are compressed when they are stored in the database, so the database files are smaller, and because the number of database files is far smaller than the number of metadata differences. In addition, the slave cluster can read all metadata differences of the current synchronization at one time, which is convenient for metadata statistics, application consistency, and the like.
To facilitate a better understanding, the above-described methods provided by the present disclosure are further described below by way of example.
Fig. 4 illustrates a schematic diagram of the acquisition and storage of metadata difference information performed by a first device according to an embodiment of the present disclosure.
Referring to fig. 4, a service process dedicated to remotely replicating metadata difference information to a second device, such as a Replication service process (REP), may be run on a first device. Two metadata services (Metadata Service, MDS), namely metadata services MDS_1 401 and MDS_2 402, may be deployed for a file system running on the first device (e.g., the first file system described above in connection with fig. 2), one as a primary service (e.g., MDS_1 401 in fig. 4) running a snapshot and the other (e.g., MDS_2 402 in fig. 4) as a secondary service. The two metadata services may be located together on one physical machine or may be located on two separate physical machines that communicate with each other.
When metadata difference information needs to be obtained for transmission to the second device, the first device needs to obtain the metadata difference information and persist into the first database as described above in connection with fig. 2.
The first device may first create an empty database on the REP service. Specifically, it may first create a virtual physical storage space (e.g., a RADOS Block Device (RBD), a block device built on the reliable autonomic distributed object store); it should be noted that creating a virtual physical storage space is optional. A second file system (e.g., Bluefs) as described above in connection with fig. 2 is then created on the virtual physical storage space, after which the second file system is mounted (or run) and an empty first database (e.g., a Rocksdb-based database) as described above in connection with fig. 2 is created and opened via, for example, a Bluefs interface.
Because the metadata is distributed across the MDS nodes of the master cluster, a request for obtaining the metadata difference information can be sent to each MDS node; the metadata difference KV pairs are then obtained according to the relevant algorithm for generating metadata differences, and finally the KV pairs are saved in the first database. When the REP communicates with the MDS services to obtain the metadata difference information, the REP can communicate with the MDSs in parallel, in view of the limit on the data volume that the network bandwidth can carry and the time consumed by message communication. In addition, to handle service failures during communication, a breakpoint resume mechanism needs to be implemented. To this end, a second database as described above in connection with fig. 2 is also created on the REP for storing breakpoint information. The REP obtains a number of KV pairs (i.e., metadata difference information) from each MDS (MDS_1 401 and MDS_2 402, as described below). When an MDS returns the KV pairs it has obtained, the returned message body contains an identification of whether the MDS has finished; if it has not finished, the currently processed key is also returned. After receiving the message, the REP persists the key-value pairs, the identification of whether the MDS has finished, and the processed key information into the database. If the MDS has not finished, the REP continues by sending the MDS a message for acquiring the metadata difference information that includes the last processed key; after receiving the message, the MDS continues traversing its metadata database from the key following that key. If a failure occurs while the REP and the MDS are communicating, the interruption information can be read from the second database after the REP is restarted, and the process continues to completion.
Specifically, referring to fig. 4, the REP first optionally creates the RBD and optionally creates the BlueFS file system. Then the first database (snapdiff_db) and the second database (control_db) are created. Next, the REP sends a request to MDS_1 401 to obtain metadata difference information. MDS_1 401 obtains key-value pairs based on the request using any relevant algorithm that generates metadata differences, where the algorithm may obtain the metadata difference information based on snapshot information in MDS_1 401, such as a comparison between snapshot 1 and snapshot 2 stored in MDS_1 401. MDS_1 401 then returns to the REP the metadata difference information, an identification of whether MDS_1 401 has finished, and, when it has not finished (as in the interruption case described above), the processed key. After the REP receives this information, the key-value pairs are stored in the snapdiff_db database, and the identification and the processed key are stored in the control_db database. If not all metadata difference information has been obtained from MDS_1 401, the REP continues to send MDS_1 401 requests for metadata difference information together with the processed key information until all metadata difference information has been obtained from MDS_1 401.
The REP then sends a request to MDS_2 402 to obtain metadata difference information. MDS_2 402 obtains key-value pairs based on the request using any relevant algorithm that generates metadata differences, where the algorithm may obtain the metadata difference information based on snapshot information in MDS_2 402, such as a comparison between snapshot 1 and snapshot 2 stored in MDS_2 402. MDS_2 402 then returns to the REP the metadata difference information, an identification of whether MDS_2 402 has finished, and, when it has not finished (as in the interruption case described above), the processed key. After the REP receives this information, the key-value pairs are stored in the snapdiff_db database, and the identification and the processed key are stored in the control_db database. If not all metadata difference information has been obtained from MDS_2 402, the REP continues to send MDS_2 402 requests for metadata difference information together with the processed key information until all metadata difference information has been obtained from MDS_2 402.
Finally, once all metadata difference information has been successfully obtained from all MDSs (i.e., MDS_1 401 and MDS_2 402), all metadata difference information is stored in the first database.
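The request loop between the REP and one MDS can be sketched as follows. `FakeMDS` is a hypothetical in-memory stand-in for a metadata service, the two dicts stand in for the first and second databases, and the reply shape `(pairs, done, last_key)` is an assumption based on the description above rather than an actual message format.

```python
class FakeMDS:
    """Hypothetical metadata service that returns differences in small pages."""

    def __init__(self, diffs: dict, batch: int = 2):
        self._items = sorted(diffs.items())
        self._batch = batch

    def get_diff(self, after_key=None):
        # Continue the traversal from the key following `after_key`.
        start = 0
        if after_key is not None:
            start = next((i for i, (k, _) in enumerate(self._items)
                          if k > after_key), len(self._items))
        page = self._items[start:start + self._batch]
        done = start + len(page) >= len(self._items)
        last_key = page[-1][0] if page else after_key
        return page, done, last_key


def pull_diff(mds_id, mds, snapdiff_db: dict, control_db: dict) -> None:
    """Fetch all differences from one MDS, resuming from recorded progress."""
    state = control_db.get(mds_id, {"done": False, "last_key": None})
    while not state["done"]:
        pairs, done, last_key = mds.get_diff(state["last_key"])
        snapdiff_db.update(pairs)  # persist the KV pairs
        state = {"done": done, "last_key": last_key}
        control_db[mds_id] = state  # persist the flag and the processed key


# Simulate a restart: keys "a" and "b" were already stored before the failure.
mds = FakeMDS({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5})
snapdiff_db = {"a": 1, "b": 2}
control_db = {"mds_1": {"done": False, "last_key": "b"}}
pull_diff("mds_1", mds, snapdiff_db, control_db)
```

In this sketch the KV pairs are written before the progress record, so a crash between the two writes causes a page to be fetched again rather than skipped.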
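The parallel variant mentioned earlier, in which the REP contacts every MDS concurrently and waits for all of them, might look like the sketch below. It uses Python's standard `concurrent.futures` thread pool; the per-MDS fetch functions are hypothetical placeholders for the real request loop.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetchers: dict) -> dict:
    """Run one fetch function per MDS in parallel and merge the results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        for name, future in futures.items():
            # result() blocks until that MDS has delivered everything
            # and re-raises any exception raised inside the fetch function.
            merged.update(future.result())
    return merged


# Hypothetical fetchers, each returning the differences held by one MDS.
fetchers = {
    "mds_1": lambda: {"inode/1": "modify", "inode/2": "delete"},
    "mds_2": lambda: {"inode/7": "add"},
}
first_database = fetch_all(fetchers)
```

Merging only after every future has resolved mirrors the "wait for all MDSs" step, so a partially fetched set never reaches the first database in this sketch.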
Fig. 5 shows a schematic diagram of transmission of metadata difference information by a first device to a second device according to an embodiment of the present disclosure.
Referring to fig. 5, transmitting the metadata difference information mainly consists of the REP on the first device transmitting the database file carrying the metadata difference information to the REP on the second device, so that the second device can open a database consistent with the first database and perform read/write operations on it in the same manner. To this end, the Rocksdb database has already been created and opened under the Bluefs file system on the REP of the first device, and the optional RBD and Bluefs can be created in the same way on the second device.
The REP on the first device reads the files and directories related to the first database on the current RBD through a read interface of Bluefs (e.g., the readdir interface), records the number and size of the files, and then, according to these records, reads the file data (i.e., the database file described above in connection with fig. 2) through another read interface of Bluefs (e.g., the read interface). The REP on the first device first transmits the directory structure information to the REP on the second device, and the REP on the second device creates the directories through the Bluefs directory-creation interface (e.g., the mkdir interface). The REP on the first device then transmits the database file data it has read to the REP on the second device, and the REP on the second device appends the data carried in each message to the corresponding file through the Bluefs open interface (e.g., the open interface) and append interface (e.g., an add interface). After all files have been read and sent successfully, the REP on the first device sends a message (e.g., a finish message) to the REP on the second device to notify it of successful completion, which marks the completion of the database file transfer. Because the database is created and opened in the same way on the first device and the second device, and the directory structure and the main files are consistent, the REP on the second device, like the REP on the first device, can open the transmitted Rocksdb database under Bluefs.
The specific flow is shown in fig. 5.
First, the snapdiff_db and control_db described above in connection with fig. 4 may already exist on the REP on the first device (i.e., the master REP). The master REP obtains the file and directory information on the RBD through the readdir interface, i.e., obtains the database file and directory structure information associated with the first database, as described above in connection with fig. 2. Before the master REP sends directory structure information to the REP on the second device (i.e., the slave REP), the slave REP may create an optional RBD and a Bluefs file system and mount (or run) the Bluefs file system.
Then, the master REP transmits directory structure information to the slave REP. The slave REP creates the same directory on the second device through the mkdir interface based on the received directory structure information to obtain a database (i.e. the third database described above in connection with fig. 3) that is identical to the first database running on the first device. After the creation is completed, the slave REP transmits a response of successful creation to the master REP.
Next, the master REP opens the database file through the open interface and reads the database file data through the read interface (i.e., the open file and read file data steps shown in fig. 5). The database file data that has been read is then transferred to the slave REP. The slave REP opens the database file through the open interface to acquire a handle (i.e., the start position of the file), and then starts writing the metadata difference information in the database file into the third database through the write interface (i.e., the open file acquisition handle and write file data steps shown in fig. 5). The slave REP then sends the master REP a message that the writing was successful. If the identifier of the end of the database file has not yet been read, not all of the information in the database file has been sent to the slave REP; the master REP then continues to read the information in the database file through the read interface and send it to the slave REP (i.e., if the end of the file is not read, reading and sending of file data continues, as shown in fig. 5) until the master REP reads the end-of-file data information identifying the end of the database file and sends it to the slave REP. After receiving the end-of-file data information, the slave REP writes the identification data marking completion of the database file transmission into the third database through the write interface and closes the previously opened handle (i.e., the write file data and closing the file handle steps shown in fig. 5). The slave REP then sends the master REP a response that the writing was successful, which indicates that the database file has been transferred from the master REP to the slave REP and that the slave REP has written it into the third database.
In the case of a large database file, the database file may be divided into segments (e.g., at a fixed size of 4M). Each transmission as described above then carries only one segment of the database file. In this case, the master REP may insert relevant file information into the control_db for saving; the relevant file information may include the database file name, the portion that has been transferred, an identifier of whether the transfer has been completed, and so on, so as to facilitate the transfer of the subsequent segments of the database file.
After all files, including all their segments, have been transferred (i.e., waiting for all files to be transferred, as shown in fig. 5), the master REP sends the slave REP a message that the transfer of the database files of snapdiff_db is completed (i.e., snapdiff_db is transferred, as shown in fig. 5). Optionally, if the slave REP can open the received database file (i.e., open snapdiff_db, as shown in fig. 5), it sends a success message to the master REP, confirming that all metadata difference information has been transmitted.
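The exchange between the master REP and the slave REP can be modeled as a small message protocol. In the sketch below the message kinds (`mkdir`, `data`, `eof`, `finish`), the in-memory `SlaveREP` class, and the example file names are all hypothetical; the real system would go through the file system's open, read, and write interfaces and a network transport.

```python
def master_messages(dirs, files, chunk_size):
    """Yield the messages the master REP would send, in order."""
    for d in dirs:
        yield ("mkdir", d, b"")        # directory structure goes first
    for path, data in files.items():
        for off in range(0, len(data), chunk_size):
            yield ("data", path, data[off:off + chunk_size])
        yield ("eof", path, b"")       # end-of-file marker for this file
    yield ("finish", "", b"")          # every file has been sent


class SlaveREP:
    """In-memory stand-in for the receiving side."""

    def __init__(self):
        self.dirs, self.files = set(), {}
        self.closed, self.finished = set(), False

    def handle(self, kind, path, payload):
        if kind == "mkdir":
            self.dirs.add(path)
        elif kind == "data":
            # Append the received bytes to the file, creating it if needed.
            self.files.setdefault(path, bytearray()).extend(payload)
        elif kind == "eof":
            self.files.setdefault(path, bytearray())
            self.closed.add(path)      # models closing the file handle
        elif kind == "finish":
            self.finished = True
        else:
            raise ValueError(f"unknown message kind: {kind!r}")
        return "ok"                    # success response to the master


files = {"db/000001.sst": b"abcdefghij", "db/CURRENT": b"MANIFEST-000001\n"}
slave = SlaveREP()
replies = [slave.handle(*msg)
           for msg in master_messages(["db"], files, chunk_size=4)]
```

Sending the directory structure before any file data means the receiver never has to create a directory implicitly while writing a file.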
The present disclosure provides an apparatus for data transmission in addition to the above-described method for data transmission, and the above description of the method for data transmission is equally applicable to an apparatus for data transmission which will be described below unless explicitly stated otherwise.
Fig. 6 shows a block diagram of an apparatus 600 for data transmission according to an embodiment of the disclosure. Fig. 7 shows a block diagram of an apparatus 700 for data transmission according to an embodiment of the disclosure.
Referring to fig. 6, the apparatus 600 may include an acquisition unit 610, a first storage unit 620, and a transmission unit 630.
According to an embodiment of the present disclosure, the acquisition unit 610 may be configured to acquire metadata difference information generated based on data update.
As an example, based on the above-described new addition, modification, or deletion of data in the file system, corresponding metadata difference information is generated. For example, when data in the file system is deleted, metadata corresponding to the data is also deleted, and metadata difference information generated at this time includes the above-described deleted metadata information. The metadata difference information may be embodied in the form of KV key pairs, for example, KV key pairs with deletion identifiers, modification identifiers, or newly added identifiers.
As an example, the metadata difference information may be acquired in advance according to actual conditions, or may be acquired in real time.
According to an embodiment of the present disclosure, the acquiring unit 610 may be configured to: obtaining snapshot information of a first file system storing the data from at least one metadata service; and acquiring the metadata difference information based on the snapshot information.
As an example, multiple metadata services may be distributed in a primary cluster. Each of the plurality of metadata services may have stored therein snapshot information of a portion of the first file system (i.e., the file system described above in connection with fig. 1), which may include the snapshot directory entry entries and snapshot inode entries described in connection with fig. 1. For example, a snapshot, such as snapshot 1 corresponding to the start time and snapshot 2 corresponding to the end time, may be created at the start time and the end time, respectively, within one period of the master cluster and the slave cluster (the period being a period defined in advance according to the actual situation), and then metadata difference information may be acquired based on comparison of snapshot 1 and snapshot 2. For example, in the case where data in the first file system is deleted, the data still exists in snapshot 1 and the data does not already exist in snapshot 2, the metadata difference information obtained may include information related to the metadata of the deleted data.
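The comparison of snapshot 1 and snapshot 2 can be sketched with two dicts that map a metadata key to its value. The tuple form `(identifier, key, value)` and the identifier names `add`, `modify`, and `delete` are assumptions made for illustration; the disclosure only states that KV pairs carry such identifiers.

```python
def snapshot_diff(snap1: dict, snap2: dict) -> list:
    """Compare two snapshots and return (identifier, key, value) tuples."""
    diff = []
    for key in sorted(snap1.keys() | snap2.keys()):
        if key not in snap2:
            diff.append(("delete", key, None))        # present only in snapshot 1
        elif key not in snap1:
            diff.append(("add", key, snap2[key]))     # present only in snapshot 2
        elif snap1[key] != snap2[key]:
            diff.append(("modify", key, snap2[key]))  # changed between snapshots
    return diff


snapshot_1 = {"a": 1, "b": 2, "c": 3}   # state at the start time
snapshot_2 = {"a": 1, "b": 20, "d": 4}  # state at the end time
difference = snapshot_diff(snapshot_1, snapshot_2)
```

Unchanged entries such as `"a"` produce no output, which is why the difference information can be much smaller than either snapshot.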
According to an embodiment of the present disclosure, the first storage unit 620 may be configured to store metadata difference information as a database file that the first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information.
As an example, the first database may be any suitable database, such as a high-performance persistent key-value (KV) database, for example a Rocksdb-based database. When the metadata difference information acquired in step S210 is stored in the database, the resulting database file is generally much smaller than the metadata difference information itself. This is because the metadata difference information is compressed when it is stored in the database, so that subsequently transmitting the database file is faster than the conventional approach of transmitting the metadata difference information directly, and saves more network resources.
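The size effect of compression can be illustrated with Python's standard `zlib` module. This is only a stand-in: the compression scheme actually used by the database is not specified here, and the synthetic KV pairs below are invented for the demonstration.

```python
import json
import zlib

# Synthetic metadata differences: many similar keys with near-identical values.
pairs = {
    f"inode/{i:08d}": {"op": "modify", "size": 4096, "mode": "0644"}
    for i in range(1000)
}

raw = json.dumps(pairs, sort_keys=True).encode("utf-8")  # uncompressed form
packed = zlib.compress(raw)                              # compressed form

# The receiver recovers exactly the same pairs after decompression.
restored = json.loads(zlib.decompress(packed).decode("utf-8"))
```

Metadata differences tend to share key prefixes and value layouts, which is the kind of redundancy general-purpose compressors exploit; the ratio on real data would depend on the workload.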
According to an embodiment of the present disclosure, the transmission unit 630 may be configured to transmit the database file to the second device.
As an example, the second device may be any device in the slave cluster, e.g. a device in the slave cluster running a master service.
As an example, the database file may be transmitted to the second device by any wired or wireless means according to actual needs.
According to an embodiment of the present disclosure, the apparatus 600 further includes: a first creation unit configured to: creating a second file system for storing metadata difference information; the first database is created based on the created second file system.
By way of example, the second file system may be any suitable lightweight file system that can interact with the database described above, such as a BlueFS-based file system. The second file system may be running on top of the first file system. The first database may be created on top of the second file system and then metadata difference information may be stored in the first database.
According to an embodiment of the present disclosure, the transmission unit 630 may be configured to: acquiring directory structure information and database files of the first database in the second file system; transmitting the directory structure information to a second device; and transmitting the database file to the second device after transmitting the directory structure information.
As an example, the first database may be stored in the second file system as files of various forms, including the directory structure information of the first database (its directories, configuration files, installation directory files, and the like) and the database file holding the data actually stored in the database (e.g., the metadata difference information described above). After the directory structure information is transmitted to the second device, the second device may build a database on the second device that is identical to the first database based on the directory structure information. The first device then transmits the database file to the second device, and the second device may, based on the database file, store the metadata difference information included in the database file in the database it has built, which is identical to the first database, so that the second device obtains the metadata difference information.
According to an embodiment of the present disclosure, the apparatus 600 further includes: a second creation unit configured to create the second database based on the created second file system, wherein the second database is used for storing interrupt information of metadata difference information; and a second storage unit configured to store, when an interruption occurs in the process of acquiring the metadata difference information, interruption information of the metadata difference information in the second database.
As an example, the second database may be any suitable database, such as a high-performance persistent key-value (KV) database. The interruption may have various causes, for example a temporary malfunction of the hardware or software of the first device, or the need to temporarily stop transmitting the metadata difference information because a higher-priority service has been invoked. When an interruption occurs in the process of acquiring the metadata difference information, the interruption information is stored in the second database. The interruption information may include information that makes it convenient to resume transmission of the metadata difference information, such as whether the metadata difference information had been completely transmitted when the interruption occurred, the metadata difference identification of the interruption point (such as an offset relative to the initial transmission of the metadata difference information), the metadata service on which the interruption occurred, and the like. The method provided by the present disclosure therefore supports breakpoint resume, which broadens its application prospects.
According to an embodiment of the present disclosure, the transmission unit 630 may be configured to: transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
As an example, in practice asynchronous remote replication between master and slave clusters may take days or months. In that case, the database file stored in S220 is often relatively large, and transmitting it all at once can easily have a large impact on the services running on the devices of the master cluster or the slave cluster. To this end, the method provided by the present disclosure further provides a breakpoint resume mechanism: a large database file may be divided in advance into several segments (e.g., of a fixed size of 4M), and then only one segment (i.e., a portion of the database file) is transmitted to the second device at a time, with information related to the transmission of that segment stored in the second database. This information may include the database file name, the portion that has been transmitted, an identification of whether the transmission has been completed, and so on, which makes it convenient to continue transmitting the remaining portion of the database file. Under the breakpoint resume mechanism, only a portion of the database file is transmitted at a time until the whole database file has been transmitted. This makes the method provided by the present disclosure more efficient and convenient than existing methods.
Referring to fig. 7, the apparatus 700 may include a first receiving unit 710, a creating unit 720, a second receiving unit 730, and a storing unit 740.
According to an embodiment of the present disclosure, the first receiving unit 710 may be configured to receive directory structure information of a first database in a pre-created file system, wherein the first database is a database on the first device for storing metadata difference information, wherein the metadata difference information is generated based on a data update.
As an example, the first database may be the first database described in connection with fig. 2. The pre-created file system may be the second file system described in connection with fig. 2. The directory structure information may be the directory structure information described in connection with fig. 2. The first device may be the first device described in connection with fig. 2. The metadata difference information may be the metadata difference information described in connection with fig. 2. Their details are not described here in detail.
According to an embodiment of the present disclosure, the creating unit 720 may be configured to create a third database based on the directory structure information.
As an example, the third database may be any suitable database, such as a high performance Key Value (KV) persistence based database, such as a Rocksdb based database.
The third database created based on the received directory structure information is consistent with the first database running on the first device. This allows the master cluster and the slave cluster to have the same database environment.
According to an embodiment of the present disclosure, the second receiving unit 730 may be configured to receive a database file from the first device, wherein the database file is obtained by storing the metadata difference information in the first database. The storage unit 740 may be configured to store the metadata difference information in the third database based on the database file.
As an example, the third database created in the second device may parse out the metadata difference information stored in the database file, which may then be stored in the third database of the second device. Finally, the second device can apply the metadata difference information in the second device, so that the metadata of the first device and the metadata of the second device are consistent. At this time, the data corresponding to the metadata can be kept consistent between the first device and the second device, thereby enabling asynchronous remote copying.
The method and apparatus for data transmission provided by the present disclosure have been described in detail above in connection with fig. 1-7. The method and apparatus provided by the present disclosure can store the acquired metadata difference information as a database file that the database is capable of storing, the data volume of which is smaller than that of the metadata difference information, and then send the database file to one or more devices in the slave cluster, so that all metadata difference information can be sent with a lower data volume. In addition, the method and apparatus provided by the present disclosure also support breakpoint resume, and therefore have advantages such as better efficiency and convenience and wider application prospects compared with the prior art.
In addition to the above-described method and apparatus for data transmission, the present disclosure also provides a device for data transmission, which will be described below with reference to fig. 8. The above description of the method and apparatus for data transmission applies equally to the device described below, unless explicitly stated otherwise.
Fig. 8 is a block diagram illustrating a device 800 for data transmission according to an embodiment of the present disclosure.
Referring to fig. 8, the device 800 may include a processor 801 and a memory 802. The processor 801 and the memory 802 may be connected to each other via a bus 803.
The processor 801 may perform various actions and processes according to programs stored in the memory 802. In particular, the processor 801 may be an integrated circuit chip with signal processing capabilities. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor, and may be of the x86 architecture or the ARM architecture.
The memory 802 stores computer instructions that, when executed by the processor 801, implement the methods for data transmission described above as being performed by the first device or the second device. The memory 802 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM). It should be noted that the memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
The present disclosure also provides another apparatus for data transmission, comprising: means for acquiring metadata difference information, wherein the metadata difference information is generated based on a data update; means for storing the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and means for transmitting the database file to a second device.
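By way of a non-limiting illustration, the storing step above can be sketched as serializing the difference records and compressing them before transmission. This passage does not state how the reduction in data volume is obtained, so the use of JSON serialization and `zlib` compression here is purely an assumption of the sketch, as are the record fields.

```python
import json
import zlib


def pack_diffs(diffs):
    """Serialize difference records and return (raw bytes, compressed bytes)."""
    raw = json.dumps(diffs, sort_keys=True).encode("utf-8")
    return raw, zlib.compress(raw, 9)


def unpack_diffs(blob):
    """Inverse of the compression step, as the receiving side would run it."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical difference records; paths and fields are illustrative only.
diffs = [
    {"op": "upsert", "path": f"/data/dir{i % 10}/file{i}.bin", "size": 4096 + i}
    for i in range(1000)
]

raw, packed = pack_diffs(diffs)
restored = unpack_diffs(packed)
```

Metadata difference records tend to be highly repetitive (shared path prefixes, repeated field names), which is why a compressed representation can be considerably smaller than the raw records while still carrying all of the information.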
The present disclosure also provides another apparatus for data transmission, comprising: means for receiving directory structure information of a first database in a pre-created file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on a data update; means for creating a third database based on the directory structure information; means for receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; and means for storing the metadata difference information in the third database based on the database file.
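By way of a non-limiting illustration, the ordering described above (directory structure information first, database file second) can be sketched as follows. The directory names, the file name `000001.db`, and the use of a temporary local directory in place of the pre-created file system on the second device are assumptions of this sketch.

```python
import os
import tempfile


def create_database_layout(root, directory_structure):
    """Create every directory listed in `directory_structure` under `root`."""
    for rel_dir in directory_structure:
        os.makedirs(os.path.join(root, rel_dir), exist_ok=True)


def store_database_file(root, rel_path, content):
    """Write a received database file to its place in the prepared layout."""
    target = os.path.join(root, rel_path)
    with open(target, "wb") as f:
        f.write(content)
    return target


# Hypothetical layout received from the first device.
directory_structure = ["diffdb", "diffdb/data"]

with tempfile.TemporaryDirectory() as root:
    # Storing a file before the layout exists fails, which is why the
    # directory structure information is handled first.
    try:
        store_database_file(root, "diffdb/data/000001.db", b"records")
        failed_without_layout = False
    except FileNotFoundError:
        failed_without_layout = True

    create_database_layout(root, directory_structure)
    path = store_database_file(root, "diffdb/data/000001.db", b"records")
    layout_ready = os.path.isdir(os.path.join(root, "diffdb", "data"))
    with open(path, "rb") as f:
        stored = f.read()
```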
According to still another embodiment of the present disclosure, there is also provided a computer-readable recording medium. Fig. 9 shows a schematic diagram of a recording medium 900 according to the present disclosure.
As shown in fig. 9, the recording medium 900 has stored thereon computer-executable instructions 910. When executed by a processor, the computer-executable instructions 910 may perform a method according to embodiments of the present disclosure described with reference to the above figures. The computer-readable recording medium in the embodiments of the present disclosure may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM). It should be noted that the memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
It is noted that the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In general, the various example embodiments of the disclosure may be implemented in hardware or special purpose circuits, software, firmware, logic, or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device. While aspects of the embodiments of the present disclosure are illustrated or described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that the blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
The exemplary embodiments of the invention described in detail above are illustrative only and are not limiting. It will be appreciated by those skilled in the art that various modifications and combinations of the embodiments or features thereof can be made without departing from the principles and spirit of the invention, and such modifications are intended to be within the scope of the invention.
In the above detailed description, it can be seen that different features are combined in an example. This manner of disclosure should not be interpreted as an intention that the example clauses have more features than are expressly recited in each clause. Conversely, various aspects of the disclosure may comprise fewer than all of the features of a single disclosed example clause. Accordingly, the following clauses are to be considered as included in the specification, each of which may itself be considered a separate example. Although each subordinate clause may refer to a particular combination with one of the other clauses in these clauses, aspects of the subordinate clause are not limited to a particular combination. It should be understood that other example clauses may also include combinations of subordinate clause aspects with the subject matter of any other subordinate clause or independent clause, or combinations of any feature with other subordinate clause and independent clause. Various aspects disclosed herein expressly include such combinations unless expressly stated or it can be readily inferred that a particular combination is not intended (e.g., contradictory aspects such as defining an element as being both an insulator and a conductor). Furthermore, it is also intended that aspects of a clause may be included in any other independent clause even if the clause is not directly subordinate to the independent clause.
The following numbered clauses describe an implementation example:
Clause 1. A method for data transmission, the method being performed by a first device and the method comprising: acquiring metadata difference information, wherein the metadata difference information is generated based on data update; storing the metadata difference information as a database file that can be stored in a first database, wherein the data volume of the database file is smaller than that of the metadata difference information; and transmitting the database file to a second device.
Clause 2. The method of clause 1, wherein the obtaining metadata difference information comprises: obtaining snapshot information of a first file system storing the data from at least one metadata service; and acquiring the metadata difference information based on the snapshot information.
Clause 3. The method of any of clauses 1 to 2, further comprising: creating a second file system for storing metadata difference information; and creating the first database based on the created second file system.
Clause 4. The method of any of clauses 1 to 3, wherein the transmitting the database file to the second device comprises: acquiring directory structure information and database files of the first database in the second file system; transmitting the directory structure information to a second device; and transmitting the database file to the second device after transmitting the directory structure information.
Clause 5 the method of any of clauses 1 to 4, further comprising: creating a second database based on the created second file system, wherein the second database is used for storing interrupt information of metadata difference information; and when interruption occurs in the process of acquiring the metadata difference information, storing the interruption information of the metadata difference information in the second database.
Clause 6. The method of any one of clauses 1 to 5, wherein the transmitting the database file to the second device comprises: transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
Clause 7. A method for data transmission, the method being performed by a second device and the method comprising: receiving directory structure information of a first database in a pre-created file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update; creating a third database based on the directory structure information; receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; the metadata difference information is stored in the third database based on the database file.
Clause 8. An apparatus for data transmission, comprising: an acquisition unit configured to acquire metadata difference information, wherein the metadata difference information is generated based on data update; a first storage unit configured to store the metadata difference information as a database file that a first database can store, wherein the data volume of the database file is smaller than that of the metadata difference information; and a transmission unit configured to transmit the database file to a second device.
Clause 9. The apparatus of clause 8, wherein the acquisition unit is configured to: obtain snapshot information of a first file system storing the data from at least one metadata service; and acquire the metadata difference information based on the snapshot information.
Clause 10. The apparatus of any one of clauses 8 to 9, further comprising: a first creation unit configured to: create a second file system for storing metadata difference information; and create the first database based on the created second file system.
Clause 11. The apparatus of any one of clauses 8 to 10, wherein the transmission unit is configured to: acquire directory structure information and database files of the first database in the second file system; transmit the directory structure information to a second device; and transmit the database file to the second device after transmitting the directory structure information.
Clause 12. The apparatus of any one of clauses 8 to 11, further comprising: a second creation unit configured to create a second database based on the created second file system, wherein the second database is used for storing interrupt information of metadata difference information; and a second storage unit configured to store, when an interruption occurs in the process of acquiring the metadata difference information, interruption information of the metadata difference information in the second database.
Clause 13. The apparatus of any one of clauses 8 to 12, wherein the transmission unit is configured to: transmit a portion of the database file to the second device each time and store information relating to the transmission of the portion in the second database.
Clause 14. An apparatus for data transmission, comprising: a first receiving unit configured to receive directory structure information of a first database in a file system created in advance, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update; a creation unit configured to create a third database based on the directory structure information; a second receiving unit configured to receive a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; a storage unit configured to store the metadata difference information in the third database based on the database file.
Clause 15. An apparatus for data transmission, comprising: means for acquiring metadata difference information, wherein the metadata difference information is generated based on a data update; means for storing the metadata difference information as a database file that a first database is capable of storing, wherein the data volume of the database file is smaller than that of the metadata difference information; and means for transmitting the database file to a second device.
Clause 16. An apparatus for data transmission, comprising: means for receiving directory structure information of a first database in a pre-created file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on a data update; means for creating a third database based on the directory structure information; means for receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database; and means for storing the metadata difference information in the third database based on the database file.
Clause 17. An apparatus for data transmission, comprising: a processor, and a memory storing computer-executable instructions that, when executed by the processor, cause the processor to perform the method of any of the preceding clauses.
Clause 18. A computer-readable recording medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, cause the processor to perform the method of any of the preceding clauses.
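By way of a non-limiting illustration of the clause that derives the metadata difference information from snapshot information, the sketch below compares two snapshots of file-system metadata and emits difference records. Representing a snapshot as a dictionary from path to attributes, and the record fields `op`, `path`, and `attrs`, are assumptions of this sketch; a real metadata service would expose snapshot information in its own format.

```python
def diff_snapshots(old, new):
    """Return difference records that turn the metadata in `old` into `new`."""
    diffs = []
    # Entries present only in the older snapshot were deleted.
    for path in sorted(old.keys() - new.keys()):
        diffs.append({"op": "delete", "path": path})
    # Entries that are new or whose attributes changed are upserted.
    for path in sorted(new):
        if old.get(path) != new[path]:
            diffs.append({"op": "upsert", "path": path, "attrs": new[path]})
    return diffs


# Two hypothetical snapshots of the first file system's metadata.
snap_1 = {"/a.txt": {"size": 10}, "/b.txt": {"size": 7}}
snap_2 = {"/a.txt": {"size": 20}, "/c.txt": {"size": 5}}

diffs = diff_snapshots(snap_1, snap_2)
```

Only entries that changed between the two snapshots produce records, so an unchanged file contributes nothing to the information that has to be stored and transmitted.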

Claims (14)

1. A method for data transmission, the method performed by a first device, and the method comprising:
acquiring metadata difference information, wherein the metadata difference information is generated based on data update;
storing the metadata difference information as a database file which can be stored in a first database, wherein the data volume of the database file is smaller than the data volume of the metadata difference information;
transmitting the database file to a second device,
the method further comprising:
creating a second file system for storing metadata difference information;
creating the first database based on the created second file system;
wherein said transmitting said database file to a second device comprises:
acquiring directory structure information and database files of the first database in the second file system;
transmitting the directory structure information to a second device, wherein the directory structure information is used for the second device to create a third database for storing the metadata difference information;
and transmitting the database file to the second device after transmitting the directory structure information.
2. The method of claim 1, wherein the obtaining metadata difference information comprises:
obtaining snapshot information of a first file system storing the data from at least one metadata service;
and acquiring the metadata difference information based on the snapshot information.
3. The method of claim 1, further comprising:
creating a second database based on the created second file system, wherein the second database is used for storing interrupt information of metadata difference information;
and when interruption occurs in the process of acquiring the metadata difference information, storing the interruption information of the metadata difference information in the second database.
4. The method of claim 3, wherein the transmitting the database file to a second device comprises:
transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
5. A method for data transmission, the method performed by a second device, and the method comprising:
receiving directory structure information of a first database in a pre-created second file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update;
creating a third database based on the directory structure information;
receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database, and wherein the data volume of the database file is smaller than the data volume of the metadata difference information;
storing the metadata difference information in the third database based on the database file,
wherein the first database is created based on a second file system created by the first device for storing metadata difference information.
6. An apparatus for data transmission, comprising:
an acquisition unit configured to acquire metadata difference information, wherein the metadata difference information is generated based on data update;
a first storage unit configured to store the metadata difference information as a database file that a first database can store, wherein the data volume of the database file is smaller than the data volume of the metadata difference information;
a transmission unit configured to transmit the database file to a second device,
the apparatus further comprising:
a first creation unit configured to:
creating a second file system for storing metadata difference information;
creating the first database based on the created second file system;
the transmission unit is configured to:
acquiring directory structure information and database files of the first database in the second file system;
transmitting the directory structure information to a second device, wherein the directory structure information is used for the second device to create a third database for storing the metadata difference information;
and transmitting the database file to the second device after transmitting the directory structure information.
7. The apparatus of claim 6, wherein the acquisition unit is configured to:
obtaining snapshot information of a first file system storing the data from at least one metadata service;
and acquiring the metadata difference information based on the snapshot information.
8. The apparatus of claim 6, further comprising:
a second creation unit configured to create a second database based on the created second file system, wherein the second database is used for storing interrupt information of metadata difference information;
and a second storage unit configured to store, when an interruption occurs in the process of acquiring the metadata difference information, interruption information of the metadata difference information in the second database.
9. The apparatus of claim 8, wherein the transmission unit is configured to:
transmitting a portion of the database file to the second device each time and storing information relating to the transmission of the portion in the second database.
10. An apparatus for data transmission, comprising:
a first receiving unit configured to receive directory structure information of a first database in a second file system created in advance, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on data update;
a creation unit configured to create a third database based on the directory structure information;
a second receiving unit configured to receive a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database, wherein the data volume of the database file is smaller than the data volume of the metadata difference information;
a storage unit configured to store the metadata difference information in the third database based on the database file,
wherein the first database is created based on a second file system created by the first device for storing metadata difference information.
11. An apparatus for data transmission, comprising:
means for acquiring metadata difference information, wherein the metadata difference information is generated based on a data update;
means for storing metadata difference information as a database file that a first database can store, wherein the database file is smaller in data amount than the metadata difference information;
means for transmitting the database file to a second device,
the apparatus further comprising:
means for creating a second file system for storing metadata difference information;
means for creating the first database based on the created second file system;
wherein the means for transmitting the database file to the second device comprises:
means for obtaining directory structure information and database files of the first database in the second file system;
means for transmitting the directory structure information to a second device, wherein the directory structure information is used by the second device to create a third database for storing the metadata difference information;
and means for transmitting the database file to the second device after transmitting the directory structure information.
12. An apparatus for data transmission, comprising:
means for receiving directory structure information of a first database in a pre-created second file system, wherein the first database is a database on a first device for storing metadata difference information, wherein the metadata difference information is generated based on a data update;
means for creating a third database based on the directory structure information;
means for receiving a database file from the first device, wherein the database file is obtained by storing the metadata difference information in a first database, wherein the data volume of the database file is smaller than the data volume of the metadata difference information;
means for storing the metadata difference information in the third database based on the database file,
wherein the first database is created based on a second file system created by the first device for storing metadata difference information.
13. An apparatus for data transmission, comprising:
a processor, and
a memory storing computer-executable instructions that, when executed by a processor, cause the processor to perform the method of any of claims 1-5.
14. A computer readable recording medium storing computer executable instructions, wherein the computer executable instructions when executed by a processor cause the processor to perform the method of any one of claims 1-5.
CN202310934180.XA 2023-07-28 2023-07-28 Method, device, equipment and medium for data transmission Active CN116708420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310934180.XA CN116708420B (en) 2023-07-28 2023-07-28 Method, device, equipment and medium for data transmission

Publications (2)

Publication Number Publication Date
CN116708420A CN116708420A (en) 2023-09-05
CN116708420B CN116708420B (en) 2023-11-03

Family

ID=87834276

Country Status (1)

Country Link
CN (1) CN116708420B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106095871A (en) * 2016-06-06 2016-11-09 无锡天脉聚源传媒科技有限公司 A kind of method and device setting up data base directory structure
CN108153612A (en) * 2016-12-02 2018-06-12 航天星图科技(北京)有限公司 A kind of backup method of database file
CN111857534A (en) * 2019-04-24 2020-10-30 北京嘀嘀无限科技发展有限公司 Data transmission method, data storage server and data storage system
CN112835874A (en) * 2021-03-25 2021-05-25 中国工商银行股份有限公司 Method, device and system for building main and standby databases
CN113486372A (en) * 2021-07-05 2021-10-08 优车库网络科技发展(深圳)有限公司 Data backup method, data backup device and server
CN115185891A (en) * 2022-09-14 2022-10-14 联想凌拓科技有限公司 Data management method and device of file system, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080086483A1 (en) * 2006-10-10 2008-04-10 Postech Academy-Industry Foundation File service system in personal area network
US20150269032A1 (en) * 2014-03-18 2015-09-24 Netapp, Inc. Backing up data to cloud data storage while maintaining storage efficiency

Also Published As

Publication number Publication date
CN116708420A (en) 2023-09-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40094571; Country of ref document: HK)