CN118227572A - Metadata change data reporting method and device, storage medium and electronic device - Google Patents

Publication number
CN118227572A
Authority
CN
China
Prior art keywords
metadata
change data
file
reconciliation
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410424502.0A
Other languages
Chinese (zh)
Inventor
王传义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Data Technology Co Ltd
Original Assignee
Jinan Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Data Technology Co Ltd
Priority to CN202410424502.0A
Publication of CN118227572A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide a method and device for reporting metadata change data, a storage medium, and an electronic device. The method includes: generating metadata change data when a change in metadata is detected, and storing the metadata change data in a metadata log; when it is detected that a persistence operation is performed on the metadata log, asynchronously generating a synchronization log according to the metadata change data, wherein the persistence operation is used to persist the metadata change data; and reporting a plurality of metadata change data in a to-be-processed queue of the synchronization log to a target server according to a first preset period, so that the metadata change data are reported through the target server to a force storage platform, wherein the force storage platform is used to manage the metadata change data.

Description

Metadata change data reporting method and device, storage medium and electronic device
Technical Field
The embodiment of the application relates to the field of computers, in particular to a method and a device for reporting metadata change data, a storage medium and an electronic device.
Background
At present, solutions exist for the storage requirements of massive unstructured data. Such a solution can use different storage modes to place data on storage devices of differing performance according to indicators such as the data's importance, access frequency, retention time, capacity, and performance; automatic migration of data objects between storage devices can be achieved through hierarchical storage management.
However, the computing power of different data centers differs, and unified scheduling is required. Particularly in the "east data, west storage" mode (data is computed in hotspot regions and backup storage is placed in the west), storage integration is becoming a trend. Nevertheless, no technical scheme capable of uniformly scheduling the data of different data centers has been proposed in the prior art.
No effective solution has yet been proposed for the problem in the related art that existing technical schemes lack a way to uniformly schedule the data of different data centers.
Disclosure of Invention
The embodiment of the application provides a method and a device for reporting metadata change data, a storage medium and an electronic device, which at least solve the problem that the prior technical scheme lacks a scheme capable of uniformly scheduling data of different data centers in the related art.
According to an embodiment of the present application, there is provided a method for reporting metadata change data, including: generating metadata change data when a change in metadata is detected, and storing the metadata change data in a metadata log; when it is detected that a persistence operation is performed on the metadata log, asynchronously generating a synchronization log according to the metadata change data, wherein the persistence operation is used to persist the metadata change data; and reporting a plurality of metadata change data in a to-be-processed queue of the synchronization log to a target server according to a first preset period, so that the metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data.
In an exemplary embodiment, asynchronously generating the synchronization log according to the metadata change data when it is detected that the persistence operation is performed on the metadata log includes: determining whether the persistence operation was executed successfully and whether the synchronization log was generated successfully; when the persistence operation is determined to have failed, determining that generation of the synchronization log has also failed and that recording of the metadata change data has failed; when generation of the synchronization log is determined to have failed, determining that the persistence operation has also failed and that recording of the metadata change data has failed; and when the persistence operation is executed successfully and the synchronization log is generated successfully, determining that the metadata change data has been recorded successfully.
In an exemplary embodiment, reporting the plurality of metadata change data in the to-be-processed queue of the synchronization log to the target server according to the first preset period, so that the metadata change data are reported to the force storage platform through the target server, includes: determining a target partition among a plurality of partitions of the target server according to the data source of the metadata change data; reporting the metadata change data to the target partition in the queue order indicated by the to-be-processed queue; and reporting, by the target server according to a second preset period, the metadata change data in the plurality of partitions to a plurality of metadata databases of the force storage platform respectively, wherein the plurality of partitions correspond one-to-one to the plurality of metadata databases.
In an exemplary embodiment, after reporting the plurality of metadata change data in the to-be-processed queue of the synchronization log to the target server according to the first preset period, so that the metadata change data are reported to the force storage platform through the target server, the method further includes: receiving a reconciliation request from the force storage platform, wherein the reconciliation request is used to check whether the metadata change data received by the force storage platform is correct; and in response to the reconciliation request, sending a reconciliation file corresponding to the reconciliation request to the force storage platform, so that reconciliation is performed using the reconciliation file.
In an exemplary embodiment, sending, in response to the reconciliation request, the reconciliation file corresponding to the reconciliation request to the force storage platform so that reconciliation is performed using the reconciliation file includes: querying, according to path information carried in the reconciliation request, the metadata under the target directory indicated by the path information to obtain the reconciliation file, wherein the reconciliation file includes one of the following: the metadata under the target directory, or a data list corresponding to the metadata under the target directory; sending the reconciliation file to the force storage platform to instruct the force storage platform to screen out, through a retrieval system, a target file corresponding to the reconciliation file and to reconcile the target file against the reconciliation file, wherein the target file is used to store the metadata change data held by the force storage platform; when a reconciliation end message sent by the force storage platform is received, obtaining a reconciliation difference file from the force storage platform, wherein the reconciliation difference file is used to indicate target metadata in which the reconciliation file differs from the target file; determining a difference category of the target metadata according to a keyword in the reconciliation difference file, wherein the difference categories include: the data values of the metadata differ, and the metadata does not exist; determining a repair mode for the target metadata according to the difference category, and uploading a difference repair message to the force storage platform according to the repair mode so as to repair the target metadata; and verifying the repaired target metadata when it is determined that uploading of the difference repair message is complete.
In an exemplary embodiment, screening out, through the retrieval system, the target file corresponding to the reconciliation file includes one of the following: recursively traversing the force storage platform in a multithreaded manner to screen out the target file; or screening out the target file through a metadata index carried in the reconciliation file.
In an exemplary embodiment, before asynchronously generating the synchronization log according to the metadata change data, the method further includes: performing the persistence operation on a preset number of metadata change data in the metadata log according to a third preset period, wherein the preset number of metadata change data are the metadata change data generated earliest in the metadata log.
According to another embodiment of the present application, there is provided a device for reporting metadata change data, including: a storage module, used to generate metadata change data when a change in metadata is detected and to store the metadata change data in a metadata log; a generation module, used to asynchronously generate a synchronization log according to the metadata change data when it is detected that a persistence operation is performed on the metadata log, wherein the persistence operation is used to persist the metadata change data; and a reporting module, used to report a plurality of metadata change data in a to-be-processed queue of the synchronization log to a target server according to a first preset period, so that the metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data.
According to a further embodiment of the application, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform, when run, the steps of any one of the method embodiments above.
According to yet another embodiment of the present application, there is also provided an electronic device comprising a memory and a processor, said memory having stored therein a computer program, said processor being arranged to run said computer program to perform the steps of any one of the method embodiments described above.
According to yet another embodiment of the present application, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method described in the various embodiments of the application.
According to the application, metadata change data is generated and stored in a metadata log when a change in metadata is detected; when it is detected that a persistence operation is performed on the metadata log, a synchronization log is generated asynchronously according to the metadata change data on which the persistence operation is performed; finally, a plurality of metadata change data in the to-be-processed queue of the synchronization log are reported to a target server according to a first preset period, so that the metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data. With this scheme, metadata change data is detected, generated, and reported to the force storage platform in real time, so that the force storage of different data centers is uniformly managed and scheduled through the force storage platform. This solves the problem that existing technical schemes lack a way to uniformly schedule the data of different data centers.
Drawings
FIG. 1 is a hardware block diagram of a computer terminal for a method for reporting metadata change data according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of reporting metadata change data according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an alternative data transfer relationship in accordance with an embodiment of the present application;
FIG. 4 is a flow chart of an alternative method for reporting metadata change data according to an embodiment of the present application;
FIG. 5 is a flow diagram of an alternative metadata reconciliation method in accordance with an embodiment of the application;
FIG. 6 is a block diagram of a reporting apparatus of metadata change data according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first", "second", etc. in the present specification, the claims, and the above figures are used to distinguish between similar objects, and are not necessarily used to describe a particular order or sequence.
The method embodiments provided in the embodiments of the present application may be executed in a computer terminal or similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a hardware block diagram of a computer terminal for a method for reporting metadata change data according to an embodiment of the present application. As shown in FIG. 1, the computer terminal may include one or more (only one is shown in FIG. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data; the computer terminal may further include a transmission device 106 for communication functions and an input/output device 108. It will be appreciated by those skilled in the art that the configuration shown in FIG. 1 is merely illustrative and does not limit the configuration of the computer terminal described above. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, software programs and modules of application software, such as the computer program corresponding to the method for reporting metadata change data in an embodiment of the present application. The processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-mentioned method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. A specific example of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short) that can be connected to other network devices via a base station so that it can communicate with the Internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.
The terms used in the present application are explained below:
Distributed file system: a distributed file system (Distributed File System) is a file system in which the physical storage resources managed by the file system are not necessarily directly attached to the local node, but are connected to nodes through a computer network;
NFS: a network file system (Network File System), which allows a system to share directories and files with others over a network;
POSIX: Portable Operating System Interface, where the X indicates its heritage from Unix APIs; POSIX is the generic name for a series of interrelated API standards defined by the IEEE for software running on various UNIX operating systems, formally known as IEEE 1003, with the international standard name ISO/IEC 9945;
Storage cluster: a cluster formed by a plurality of physical storage nodes;
Intelligent storage: comprises distributed file storage (traditional) and intelligent file storage (lower priced); intelligent storage relies on a storage intelligent operation platform to tier warm and cold data and to migrate capacity, and intelligent file storage is built on lower-tier distributed file storage underneath;
DLM: Data Life Cycle Management, data lifecycle management;
MDS: the node in distributed storage that manages metadata;
kafka: an open-source distributed event streaming platform (Event Streaming Platform);
Force storage: the storage capability of a distributed system, including information such as storage pools, file systems, shared paths, and file directories.
This embodiment provides a method for reporting metadata change data. FIG. 2 is a flowchart of a method for reporting metadata change data according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:
Step S202, generating metadata change data when a change in metadata is detected, and storing the metadata change data in a metadata log;
Step S204, when it is detected that a persistence operation is performed on the metadata log, asynchronously generating a synchronization log according to the metadata change data, wherein the persistence operation is used to persist the metadata change data;
Step S206, reporting a plurality of metadata change data in the to-be-processed queue of the synchronization log to a target server according to a first preset period, so that the metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data.
Through the above steps, metadata change data is generated and stored in a metadata log when a change in metadata is detected; when it is detected that a persistence operation is performed on the metadata log, a synchronization log is generated asynchronously according to the metadata change data on which the persistence operation is performed; finally, a plurality of metadata change data in the to-be-processed queue of the synchronization log are reported to a target server according to a first preset period, so that the metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data. With this scheme, metadata change data is detected, generated, and reported to the force storage platform in real time, so that the force storage of different data centers is uniformly managed and scheduled through the force storage platform. This solves the problem that existing technical schemes lack a way to uniformly schedule the data of different data centers.
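The flow of steps S202 to S206 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and field names (`ChangeRecord`, `MetadataNode`, `seq`) are hypothetical, the synchronization log entries are derived inline rather than asynchronously, and the "first preset period" is represented simply by whoever calls `report`.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ChangeRecord:
    """One piece of metadata change data (field names are hypothetical)."""
    path: str
    op: str   # e.g. "create", "modify", "delete"
    seq: int  # monotonically increasing sequence number


@dataclass
class MetadataNode:
    """Toy stand-in for the node that runs steps S202 to S206."""
    metadata_log: list = field(default_factory=list)  # not yet persisted
    persisted: list = field(default_factory=list)     # persisted entries
    pending: deque = field(default_factory=deque)     # sync-log queue
    _seq: int = 0

    def on_metadata_change(self, path: str, op: str) -> ChangeRecord:
        # S202: generate change data and store it in the metadata log.
        self._seq += 1
        rec = ChangeRecord(path, op, self._seq)
        self.metadata_log.append(rec)
        return rec

    def persist(self) -> None:
        # S204: when the log is persisted, derive sync-log entries from the
        # same records (inline here; the source does this asynchronously).
        batch, self.metadata_log = self.metadata_log, []
        self.persisted.extend(batch)
        self.pending.extend(batch)

    def report(self, send) -> int:
        # S206: meant to be called once per reporting period; drains the
        # queue in FIFO order and hands each record to `send`.
        count = 0
        while self.pending:
            send(self.pending.popleft())
            count += 1
        return count
```

Nothing is reported until `persist` has run, which mirrors the point that the synchronization log is only produced when the metadata log is persisted.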
The steps above may be executed by a computer terminal, but are not limited thereto.
The execution sequence of step S202 and step S204 may be interchanged, i.e. step S204 may be executed first and then step S202 may be executed.
In an exemplary embodiment, asynchronously generating the synchronization log according to the metadata change data when it is detected that the persistence operation is performed on the metadata log includes: determining whether the persistence operation was executed successfully and whether the synchronization log was generated successfully; when the persistence operation is determined to have failed, determining that generation of the synchronization log has also failed and that recording of the metadata change data has failed; when generation of the synchronization log is determined to have failed, determining that the persistence operation has also failed and that recording of the metadata change data has failed; and when the persistence operation is executed successfully and the synchronization log is generated successfully, determining that the metadata change data has been recorded successfully.
Metadata changes are recorded in the metadata log of the MDS, and the synchronization log used for metadata reporting is generated when the metadata log is persisted. Depending on the operation type, metadata persistence generates the synchronization log in one of three processes:
1. A synchronization log is generated when the metadata log dir is complete;
2. Deletion is a separate process, and a synchronization log is generated during deletion;
3. Hierarchy changes are identified in the renaming process, so a rename that changes the hierarchy generates a synchronization log.
It should be noted that a metadata change is not reported to the force storage platform immediately; instead, it is recorded in the synchronization log at the same time as the metadata is persisted. Metadata persistence and recording of the synchronization log run concurrently and either both succeed or both fail, which keeps the metadata consistent. It is therefore necessary to determine separately whether the persistence operation and the generation of the synchronization log succeeded: if either one fails, the other is directly determined to have failed as well and the process is terminated; if both succeed, the process is determined to be complete.
Through the embodiment, the consistency of metadata reporting is ensured.
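The "both succeed or both fail" rule can be sketched as below. This is only an illustration under assumptions: the source runs persistence and synchronization-log recording concurrently, whereas this sketch serializes them for simplicity, and all four callables (`persist`, `write_sync_log`, `undo_persist`, and the shape of `change`) are hypothetical hooks supplied by the caller.

```python
def record_change(change, persist, write_sync_log, undo_persist):
    """Return True only if persistence and the sync-log write both succeed.

    `undo_persist` is an assumed rollback hook; the source does not say how
    the side that succeeded is invalidated when the other side fails.
    """
    try:
        persist(change)
    except Exception:
        # Persistence failed: the sync log is treated as failed too,
        # so it is never written.
        return False
    try:
        write_sync_log(change)
    except Exception:
        # Sync log failed: the persistence is treated as failed too.
        undo_persist(change)
        return False
    return True
```

A caller would treat a `False` result as "the metadata change data was not recorded" and stop the current process, as the paragraph above describes.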
Optionally, reporting the plurality of metadata change data in the to-be-processed queue of the synchronization log to the target server according to the first preset period, so that the metadata change data are reported to the force storage platform through the target server, includes: determining a target partition among a plurality of partitions of the target server according to the data source of the metadata change data; reporting the metadata change data to the target partition in the queue order indicated by the to-be-processed queue; and reporting, by the target server according to a second preset period, the metadata change data in the plurality of partitions to a plurality of metadata databases of the force storage platform respectively, wherein the plurality of partitions correspond one-to-one to the plurality of metadata databases.
Metadata is synchronized to the force storage platform using the kafka protocol as the underlying interaction protocol, and the kafka server contains multiple partitions. To make the metadata change data easier to manage, metadata change data from the same data source is reported to the same partition (namely the target partition) of the kafka server. Because the ordering of the metadata change data must be preserved, the data is reported to the target partition in the order of the to-be-processed queue. The kafka server then reports the metadata change data held in the partitions, which comes from multiple data centers (namely multiple data sources), in order to the multiple metadata databases of the force storage platform according to the second preset period.
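The partition routing described above can be illustrated without a real broker. The sketch below is an in-memory simulation under assumptions: the function names are hypothetical, CRC32 is an arbitrary choice of stable hash, and partitions are plain lists. With a real Kafka client, a similar effect is usually obtained by using the data source as the record key.

```python
import zlib
from collections import defaultdict


def pick_partition(data_source: str, num_partitions: int) -> int:
    """Map a data source to one partition, deterministically.

    CRC32 is an assumed choice; any stable hash would do.
    """
    return zlib.crc32(data_source.encode("utf-8")) % num_partitions


def route(pending, num_partitions):
    """Append records to their partition in to-be-processed-queue order.

    `pending` is an iterable of (data_source, payload) pairs, oldest first.
    """
    partitions = defaultdict(list)
    for data_source, payload in pending:
        index = pick_partition(data_source, num_partitions)
        partitions[index].append((data_source, payload))
    return partitions
```

Because every record from one data source lands in the same list and lists are appended in queue order, the relative order of changes from that source is preserved.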
A specific example is shown in FIG. 3, which shows the data transmission relationship between the force storage scheduling platform (namely the force storage platform) and the resource pools (namely the MDS). Different resource pools are set up in different regions, such as the Su pool and the Zheng pool in FIG. 3; correspondingly, the force storage scheduling platform sets up an independent metadata database for each region, such as the Su pool metadata database and the Zheng pool metadata database in FIG. 3. The resource pools correspond one-to-one to the databases, so the force storage scheduling platform can effectively manage the force storage of data centers in different regions according to the metadata reported by each region.
As shown in FIG. 4, after the client (client in FIG. 4) performs a change operation on metadata, the change is recorded in the metadata log of the MDS (namely MDLog in FIG. 4). Change operations include, but are not limited to: creation, modification, writing, moving, deletion, etc. The synchronization log used for metadata reporting is generated while the metadata log is persisted; the synchronization log then reports the items in its to-be-processed queue (namely the metadata change data described above) to the kafka server, and the kafka server reports these items to the force storage platform (not shown in the figure).
It should be noted that the basic unit of the synchronization log is the item. Each item records the metadata of one file; the metadata of a file at different points in time is recorded by multiple items, and the newest metadata item comes after older items, which preserves the order of operations.
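Because newer items always follow older ones, a consumer can reconstruct the current metadata of each file by replaying items in order. The sketch below assumes a hypothetical item shape of `(file_path, metadata_dict)`; the source does not specify the item layout.

```python
def latest_metadata(items):
    """Replay sync-log items in order; later items override earlier ones.

    `items` is a list of (file_path, metadata_dict) pairs, oldest first.
    """
    state = {}
    for path, metadata in items:
        state[path] = metadata
    return state
```

If items were delivered out of order, an older item could overwrite a newer one, which is why the queue order matters.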
Based on the above steps, after reporting the plurality of metadata change data in the to-be-processed queue of the synchronization log to the target server according to the first preset period, so that the metadata change data are reported to the force storage platform through the target server, the method further includes: receiving a reconciliation request from the force storage platform, wherein the reconciliation request is used to check whether the metadata change data received by the force storage platform is correct; and in response to the reconciliation request, sending a reconciliation file corresponding to the reconciliation request to the force storage platform, so that reconciliation is performed using the reconciliation file.
While being reported to the force storage platform, metadata change data passes through multiple data transmission nodes, and transmission errors such as lost or corrupted data may occur along the way. To ensure that the metadata change data stored by the force storage platform is correct, reconciliation is needed after the metadata change data has been reported. The MDS first receives a reconciliation request from the force storage platform, then sends the corresponding reconciliation file to the force storage platform and cooperates with it to complete the reconciliation. This ensures that the data stored on the force storage platform is correct and, in turn, that the platform's scheduling decisions on force storage are correct.
Optionally, sending, in response to the reconciliation request, the reconciliation file corresponding to the reconciliation request to the force storage platform so that reconciliation is performed using the reconciliation file includes: querying, according to path information carried in the reconciliation request, the metadata under the target directory indicated by the path information to obtain the reconciliation file, wherein the reconciliation file includes one of the following: the metadata under the target directory, or a data list corresponding to the metadata under the target directory; sending the reconciliation file to the force storage platform to instruct the force storage platform to screen out, through a retrieval system, a target file corresponding to the reconciliation file and to reconcile the target file against the reconciliation file, wherein the target file is used to store the metadata change data held by the force storage platform; when a reconciliation end message sent by the force storage platform is received, obtaining a reconciliation difference file from the force storage platform, wherein the reconciliation difference file is used to indicate target metadata in which the reconciliation file differs from the target file; determining a difference category of the target metadata according to a keyword in the reconciliation difference file, wherein the difference categories include: the data values of the metadata differ, and the metadata does not exist; determining a repair mode for the target metadata according to the difference category, and uploading a difference repair message to the force storage platform according to the repair mode so as to repair the target metadata; and verifying the repaired target metadata when it is determined that uploading of the difference repair message is complete.
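The first step, building the reconciliation file from the path carried in the request, might look like the following sketch. It assumes, purely for illustration, that the metadata is available as an in-memory mapping from absolute file path to a metadata dictionary; the function name and the `list_only` flag are hypothetical.

```python
def build_reconciliation_file(metadata, target_dir, list_only=False):
    """Collect the metadata stored under `target_dir`.

    Returns either the metadata itself (a dict) or just the sorted list of
    paths, matching the two reconciliation-file variants described above.
    """
    # Normalise so that "/d" matches "/d/a" but not "/dx/a".
    prefix = target_dir.rstrip("/") + "/"
    selected = {
        path: meta
        for path, meta in sorted(metadata.items())
        if path.startswith(prefix)
    }
    return sorted(selected) if list_only else selected
```

The prefix check includes the trailing slash so that sibling directories whose names merely start with the same characters are not swept into the result.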
First, the metadata under the target directory indicated by the path information carried by the reconciliation request is queried to obtain the reconciliation file; the reconciliation file may contain either all metadata stored under the target directory or a data list of that metadata. The reconciliation file is sent to the force storage platform, which is instructed to screen out, through a metadata retrieval system (i.e., the retrieval system), the files that meet the requirements of the reconciliation message (i.e., the target files) and then to complete reconciliation according to the reconciliation file and the target files. After the storage vendor finishes uploading the reconciliation file, it uploads a reconciliation completion flag file to indicate that the force storage platform can acquire the uploaded file. The flag file name has the following format: storage cluster identifier-file system identifier-metadata type-timestamp (taken from RECONCILE_TIMESTAMP in the reconciliation message)-dz.dat.success.
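The flag file naming convention described above can be illustrated with a minimal sketch. The function name `flag_file_name`, the example identifiers, and the rule that individual parts must not contain a hyphen are assumptions made for illustration only; the source specifies only the order of the parts and the `-dz.dat.success` suffix.

```python
def flag_file_name(cluster_id: str, fs_id: str, metadata_type: str,
                   reconcile_timestamp: str) -> str:
    """Build a reconciliation completion flag file name.

    Layout taken from the description:
    <cluster id>-<file system id>-<metadata type>-<timestamp>-dz.dat.success
    """
    parts = [cluster_id, fs_id, metadata_type, reconcile_timestamp]
    for part in parts:
        # Assumption: parts are non-empty and hyphen-free so that the
        # name can be split back into its fields unambiguously.
        if not part or "-" in part:
            raise ValueError(f"invalid flag file name part: {part!r}")
    return "-".join(parts) + "-dz.dat.success"


if __name__ == "__main__":
    # The timestamp would come from RECONCILE_TIMESTAMP in the request.
    print(flag_file_name("cluster01", "fs01", "file", "20240409120000"))
```

Rejecting hyphens inside a part is a design choice of this sketch; a real deployment may use a different escaping rule.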
After that, the force storage platform performs subsequent audit processing according to the flag file, finally generates a difference reconciliation file (i.e., the reconciliation difference file) and a reconciliation end flag file, and then notifies the metadata side to confirm. The storage vendor collects the reconciliation difference file upon the presence of the reconciliation end flag file (or upon receiving the reconciliation completion notification message) to prepare for subsequent metadata repair. The format of the reconciliation difference file is basically consistent with that of the reconciliation file, and differs in that the first field in the first-row field list is (DIFF_TYPE), a Y or N mark (i.e., the keyword) is added before the original record in each content row, and the fields are separated by '@'. Y indicates that the metadata exists on both sides (the storage vendor and the force storage platform) but the key values differ (the recorded value is the storage vendor side's metadata); N indicates that the metadata does not exist on one side, either the force storage platform or the storage vendor side. The storage vendor resends messages to the force storage platform according to the difference category indicated in the reconciliation difference file, and after uploading, performs a secondary check on the metadata corresponding to the difference repair message to ensure that the errors do not recur.
Through this embodiment, after the metadata change data is reported, it is checked by a strict audit-based reconciliation, and any difference data found is repaired in time, which ensures the timeliness and validity of the metadata change data stored by the force storage platform.
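How the storage vendor might read the reconciliation difference file and pick a repair mode can be sketched as follows. This is an illustration under stated assumptions: the `@` delimiter, the `path` column name, and the repair mode labels `resend_value` and `resend_existence` are hypothetical; the source only fixes the DIFF_TYPE first field and the Y/N marks.

```python
DELIMITER = "@"  # assumption: field separator of the difference file

# Hypothetical repair modes keyed by the Y/N keyword.
REPAIR_MODE = {
    "Y": "resend_value",      # both sides have the record, values differ
    "N": "resend_existence",  # record is missing on one side
}


def parse_difference_file(text: str, delimiter: str = DELIMITER) -> list:
    """Parse a difference file whose first row is the field list."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split(delimiter)
    if header[0] != "DIFF_TYPE":
        raise ValueError("first field of the field list must be DIFF_TYPE")
    records = []
    for ln in lines[1:]:
        fields = ln.split(delimiter)
        if len(fields) != len(header):
            raise ValueError(f"malformed content row: {ln!r}")
        record = dict(zip(header, fields))
        if record["DIFF_TYPE"] not in REPAIR_MODE:
            raise ValueError(f"unknown DIFF_TYPE: {record['DIFF_TYPE']!r}")
        records.append(record)
    return records


def plan_repairs(records: list) -> list:
    """Map each differing record to (path, repair mode)."""
    return [(rec["path"], REPAIR_MODE[rec["DIFF_TYPE"]]) for rec in records]


if __name__ == "__main__":
    sample = "DIFF_TYPE@path@size\nY@/data/a.txt@10\nN@/data/b.txt@20\n"
    for path, mode in plan_repairs(parse_difference_file(sample)):
        print(path, mode)
```

After the repair messages are uploaded, the same records would be checked a second time, as the description requires.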
Alternatively, the metadata reconciliation method may be implemented by a flow as shown in fig. 5, and specifically includes the following steps:
Step 1, receiving the reconciliation message: the intelligent force storage scheduling platform (namely the force storage platform) checks whether the metadata uploaded by different heterogeneous nodes is correct and initiates a request for the metadata of a specified directory; the storage platform (namely the distributed file system in fig. 5) receives the message on the specified topic;
Step 2, uploading the reconciliation file: after the storage vendor generates the reconciliation file, it sends the file to the general server according to the agreed protocol (SFTP) so that transmitted data is not lost, and the files meeting the requirements of the reconciliation message are screened out through a metadata retrieval system; the metadata retrieval system can be implemented in 2 ways: obtaining the data by recursive traversal using multithreaded stat, or having the MDS itself provide a metadata index;
Step 3, uploading the reconciliation completion flag file: after the storage vendor finishes uploading the reconciliation file, it finally uploads a flag file to indicate that the downstream side can acquire the file; the flag file name has the following format: storage cluster identifier-file system identifier-metadata type-timestamp (taken from RECONCILE_TIMESTAMP in the reconciliation message)-dz.dat.success;
Step 4, receiving the reconciliation end message: the intelligent force storage scheduling platform performs subsequent audit processing according to the flag file and finally generates a difference reconciliation file and a reconciliation end flag file; it then notifies the metadata side to confirm by means of a topic message;
Step 5, collecting the difference file: the storage vendor collects the reconciliation difference file upon the presence of the reconciliation end flag file (or upon receipt of the reconciliation completion notification message) to prepare for subsequent metadata repair; the format of the difference file is basically consistent with that of the reconciliation file, and differs in that the first field in the first-row field list is (DIFF_TYPE), a Y or N mark is added before the original record in each content row, and the fields are separated by '@'; Y indicates that the metadata exists on both sides but the key values differ (the record is the storage side's metadata), and N indicates that the metadata does not exist on either the force storage platform or the storage side;
Step 6, uploading the difference repair message: the storage vendor collects the reconciliation difference file upon the presence of the reconciliation end flag file (or upon receipt of the reconciliation completion notification message), and resends the messages according to the difference categories in the reconciliation difference file. A secondary check is required for the difference data.
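The comparison that yields the Y and N marks in steps 4 and 5 can be sketched as below. This is only an illustration of the classification rule stated in the text; representing each side as a dictionary keyed by path, and the function name `reconcile`, are assumptions.

```python
def reconcile(storage_side: dict, platform_side: dict) -> list:
    """Compare two path -> metadata mappings.

    Returns (mark, path, record) tuples:
      "Y": the path exists on both sides but the values differ; the
           storage side's record is reported, as in the description.
      "N": the path is missing on one of the two sides.
    """
    differences = []
    for path in sorted(storage_side.keys() | platform_side.keys()):
        in_storage = path in storage_side
        in_platform = path in platform_side
        if in_storage and in_platform:
            if storage_side[path] != platform_side[path]:
                differences.append(("Y", path, storage_side[path]))
        else:
            record = storage_side[path] if in_storage else platform_side[path]
            differences.append(("N", path, record))
    return differences


if __name__ == "__main__":
    storage = {"/a": {"size": 10}, "/b": {"size": 20}, "/c": {"size": 5}}
    platform = {"/a": {"size": 10}, "/b": {"size": 21}, "/d": {"size": 7}}
    for diff in reconcile(storage, platform):
        print(diff)
```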
Optionally, the screening out, by the retrieval system, of the target file corresponding to the reconciliation file includes one of the following: recursively traversing the force storage platform in a multithreaded manner to screen out the target file; or screening out the target file through the metadata index carried by the reconciliation file.
The metadata retrieval system screens out the target file in one of the following 2 ways: retrieval by recursive traversal using multithreaded stat, or retrieval through a metadata index provided by the MDS itself.
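The first retrieval mode, recursive traversal with multithreaded stat, might look like the following sketch. It is an assumption-laden illustration rather than the retrieval system itself: the directory tree is walked with `os.walk` in one thread and only the `stat` calls are spread over a thread pool; the function name and the selected fields are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def collect_metadata(root: str, max_workers: int = 8) -> dict:
    """Recursively list entries under root and stat them in parallel."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))

    def stat_one(path):
        try:
            st = os.lstat(path)  # lstat: do not follow symbolic links
        except OSError:
            return path, None    # entry vanished or is unreadable
        return path, {"size": st.st_size, "mtime": st.st_mtime,
                      "mode": st.st_mode}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(stat_one, paths)
        return {path: meta for path, meta in results if meta is not None}


if __name__ == "__main__":
    for path, meta in sorted(collect_metadata(".").items())[:5]:
        print(path, meta["size"])
```

The second mode, an index provided by the MDS, avoids the traversal entirely and is not sketched here because the source gives no details of the index.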
Optionally, before the asynchronously generating the synchronization log according to the metadata change data, the method further includes: executing the persistence operation on a preset number of metadata change data in the metadata log according to a third preset period, wherein the preset number of metadata change data are the metadata change data with the earliest generation times in the metadata log.
When the MDS persists metadata to the MDLOG, a certain number of entries are accumulated before being persisted to disk in order to reduce the number of disk operations, and persistence in turn triggers metadata reporting. To avoid delayed reporting of important data changes, the embodiment of the application enables parameter control so that a fixed number of MDLOG log entries (i.e., the preset number) are persisted every 5 seconds (i.e., the third preset period; the value is only an example), instead of starting reporting only when the accumulated amount reaches the maximum value as in the prior art.
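The periodic trigger described above can be sketched with a small in-memory model. The class name `MetadataLog`, the `tick` method driven by an externally supplied clock value, and the in-memory `persisted` list standing in for the disk are assumptions for illustration; the 5-second period is the example value from the text.

```python
from collections import deque


class MetadataLog:
    """Persists up to batch_size of the oldest entries once per period."""

    def __init__(self, period_s: float = 5.0, batch_size: int = 100):
        self.period_s = period_s      # the "third preset period"
        self.batch_size = batch_size  # the "preset number"
        self.pending = deque()        # entries in generation order
        self.persisted = []           # stand-in for the on-disk log
        self.last_flush = 0.0

    def append(self, entry) -> None:
        self.pending.append(entry)

    def tick(self, now: float) -> list:
        """Call regularly; flushes only when a full period has elapsed."""
        if now - self.last_flush < self.period_s:
            return []
        self.last_flush = now
        count = min(self.batch_size, len(self.pending))
        batch = [self.pending.popleft() for _ in range(count)]
        self.persisted.extend(batch)  # persistence would trigger reporting
        return batch


if __name__ == "__main__":
    log = MetadataLog(period_s=5.0, batch_size=2)
    for change in ("create /a", "rename /a /b", "unlink /b"):
        log.append(change)
    print(log.tick(5.0))
    print(log.tick(10.0))
```

Passing the clock value in explicitly keeps the sketch deterministic; a real implementation would more likely use a timer thread.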
Through this embodiment, after the metadata of the file system changes, the changed content is sent to the force storage platform via kafka within a short time, which ensures the timeliness of data reporting.
From the above description of the embodiments, it will be clear to those skilled in the art that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the method according to the embodiments of the present application.
In this embodiment, a reporting device for metadata change data is further provided. The device is used to implement the foregoing embodiments and preferred implementations, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 6 is a block diagram of a reporting apparatus for metadata change data according to an embodiment of the present application, as shown in fig. 6, the apparatus includes:
a storage module 62 for generating metadata change data when a change in metadata is detected, and storing the metadata change data in a metadata log;
a generating module 64, configured to asynchronously generate a synchronization log according to the metadata change data when it is detected that the metadata log performs a persistence operation, where the persistence operation is used to persist the metadata change data;
and a reporting module 66, configured to report a plurality of metadata change data in a queue to be processed of the synchronization log to a target server according to a first preset period, so as to report the plurality of metadata change data to a force storage platform through the target server, where the force storage platform is used to manage the plurality of metadata change data.
With this device, when a change in the metadata is detected, metadata change data is generated and stored in a metadata log; when it is detected that the metadata log performs a persistence operation, a synchronization log is generated asynchronously according to the metadata change data being persisted; finally, a plurality of metadata change data in the queue to be processed of the synchronization log are reported to a target server according to a first preset period, so as to be reported to the force storage platform through the target server, where the force storage platform is used to manage the metadata change data. With this scheme, metadata change data is detected, generated, and reported to the force storage platform in real time, so that the storage capacity of different data centers is uniformly managed and scheduled through the force storage platform. This solves the problem that existing technical solutions lack a way to uniformly schedule the data of different data centers.
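The cooperation of the three modules can be sketched as a small in-memory pipeline. Everything here is an assumption made for illustration: the class and method names are hypothetical, the `send` callback stands in for the target server, and the synchronization log is filled synchronously inside `on_persist` for simplicity, whereas the description generates it asynchronously.

```python
from collections import deque


class Reporter:
    """In-memory stand-in for the storage, generating and reporting modules."""

    def __init__(self, send):
        self.metadata_log = []     # changes not yet persisted
        self.sync_queue = deque()  # to-be-processed queue of the sync log
        self.send = send           # stand-in for the target server

    def on_metadata_change(self, change) -> None:
        # Storage module: record the change in the metadata log.
        self.metadata_log.append(change)

    def on_persist(self) -> list:
        # Generating module: when the metadata log is persisted, the
        # persisted changes are appended to the synchronization log.
        persisted, self.metadata_log = self.metadata_log, []
        self.sync_queue.extend(persisted)
        return persisted

    def on_report_period(self) -> list:
        # Reporting module: once per "first preset period", drain the
        # queue in order and hand the batch to the target server.
        batch = list(self.sync_queue)
        self.sync_queue.clear()
        if batch:
            self.send(batch)
        return batch


if __name__ == "__main__":
    reporter = Reporter(send=print)
    reporter.on_metadata_change("create /a")
    reporter.on_metadata_change("rename /a /b")
    reporter.on_persist()
    reporter.on_report_period()
```

Note that changes that have not been persisted are never reported, which matches the ordering of the three steps in the description.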
Optionally, the generating module 64 is further configured to determine whether the persistence operation is successfully performed and whether the synchronization log is successfully generated; when it is determined that the persistence operation failed, determine that the synchronization log generation failed and that recording of the metadata change data failed; when it is determined that the synchronization log generation failed, determine that the persistence operation failed and that recording of the metadata change data failed; and when the persistence operation is successfully performed and the synchronization log is successfully generated, determine that recording of the metadata change data succeeded.
Metadata changes are recorded in the metadata log of the MDS, and the synchronization log for metadata reporting is generated when the metadata log is persisted. Depending on the operation type, there are 3 processes in which metadata persistence generates a synchronization log:
1. a synchronization log is generated when the metadata log dir is complete;
2. deletion is a separate process, and a synchronization log is generated during deletion;
3. hierarchy changes are identified during renaming, so a rename that changes the hierarchy generates a synchronization log.
After the metadata changes, the change is not reported to the force storage platform immediately; instead, it is recorded in the synchronization log at the same time as the metadata is persisted. Metadata persistence and synchronization log recording are performed concurrently in an all-or-nothing manner (both succeed or both fail), which maintains the consistency of the metadata. It is therefore necessary to determine separately whether the persistence operation and the generation of the synchronization log succeeded; if either one fails, the other is directly determined to have failed as well and the execution process is terminated; if both succeed, the process is determined to be complete.
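The both-succeed-or-both-fail rule can be sketched as follows. This is a simplified illustration under stated assumptions: the two writes run sequentially and are rolled back together on any failure, whereas the description performs them concurrently; the class name `ChangeRecorder` and the in-memory lists are hypothetical stand-ins for the metadata log and the synchronization log.

```python
class ChangeRecorder:
    """Records a change in both logs, or in neither."""

    def __init__(self):
        self.metadata_log = []  # stand-in for the persisted metadata log
        self.sync_log = []      # stand-in for the synchronization log

    def record(self, change, persist=None, sync=None) -> bool:
        persist = persist or self.metadata_log.append
        sync = sync or self.sync_log.append
        # Remember the current lengths so both logs can be rolled back.
        meta_len, sync_len = len(self.metadata_log), len(self.sync_log)
        try:
            persist(change)
            sync(change)
        except Exception:
            # Either step failing means the whole record has failed.
            del self.metadata_log[meta_len:]
            del self.sync_log[sync_len:]
            return False
        return True


if __name__ == "__main__":
    def failing_write(_change):
        raise IOError("simulated write failure")

    recorder = ChangeRecorder()
    print(recorder.record("mkdir /a"))                    # True
    print(recorder.record("rm /b", sync=failing_write))   # False
    print(recorder.metadata_log, recorder.sync_log)
```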
Through the embodiment, the consistency of metadata reporting is ensured.
Optionally, the reporting module 66 is further configured to determine a target partition among the multiple partitions of the target server according to the data source of the metadata change data; report the metadata change data to the target partition according to the queue order indicated by the queue to be processed; and report, through the target server and according to a second preset period, the metadata change data in the multiple partitions respectively to multiple metadata databases of the force storage platform, where the multiple partitions are in one-to-one correspondence with the multiple metadata databases.
The metadata is synchronized to the force storage platform using the kafka protocol as the underlying interaction protocol, and the kafka server comprises multiple partitions. To make the metadata change data easier to manage, metadata change data from the same data source is reported to the same partition (namely the target partition) of the kafka server. Because the order of the metadata change data must be preserved, it is reported to the target partition in the order of the queue to be processed. The kafka server then reports the metadata change data of the multiple data centers (namely the multiple data sources), stored in the multiple partitions, to the multiple metadata databases of the force storage platform in turn according to the second preset period.
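The routing rule, same data source to the same partition with queue order preserved, can be sketched without any Kafka client. The functions below are hypothetical, the partitions are plain lists, and the CRC32-based mapping is an assumption chosen only because it is deterministic; with a real Kafka producer, keying each message by its data source generally achieves the same effect through the producer's key-based partitioning.

```python
import zlib
from collections import deque


def partition_for(data_source: str, num_partitions: int) -> int:
    """Deterministically map a data source to one partition index."""
    return zlib.crc32(data_source.encode("utf-8")) % num_partitions


def report_in_order(pending: deque, partitions: list) -> None:
    """Drain the to-be-processed queue front to back.

    Because entries are taken in queue order and each source always maps
    to the same partition, per-source ordering is preserved.
    """
    while pending:
        source, change = pending.popleft()
        index = partition_for(source, len(partitions))
        partitions[index].append((source, change))


if __name__ == "__main__":
    queue = deque([("su-pool", "create /a"),
                   ("zheng-pool", "create /x"),
                   ("su-pool", "rename /a /b")])
    parts = [[] for _ in range(4)]
    report_in_order(queue, parts)
    for index, part in enumerate(parts):
        print(index, part)
```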
A specific example is shown in fig. 3, which shows the data transmission relationship between the force storage scheduling platform (i.e., the force storage platform) and the resource pools (i.e., the MDS). Different resource pools are set up in different regions, such as the Su pool and the Zheng pool in fig. 3; correspondingly, the force storage scheduling platform sets up an independent metadata database for each region, such as the Su pool metadata database and the Zheng pool metadata database in fig. 3. The resource pools correspond to the databases one to one, so that the force storage scheduling platform can effectively manage the storage capacity of the data centers in different regions according to the metadata reported by each region.
Optionally, the reporting module 66 is further configured to receive a reconciliation request from the force storage platform, where the reconciliation request is used to check whether the metadata change data received by the force storage platform is correct; and, in response to the reconciliation request, send a reconciliation file corresponding to the reconciliation request to the force storage platform so as to perform reconciliation through the reconciliation file.
Metadata change data passes through multiple data transmission nodes while being reported to the force storage platform, and transmission problems such as data loss or data transmission errors may occur along the way. To ensure that the metadata change data stored by the force storage platform is correct, reconciliation is required after the metadata change data is reported. The MDS (storage platform) first receives the reconciliation request from the force storage platform, then sends the corresponding reconciliation file to the force storage platform, and cooperates with the force storage platform to complete reconciliation, thereby ensuring the correctness of the data stored by the force storage platform and of its scheduling decisions on storage capacity.
Optionally, the reporting module 66 is further configured to query the metadata under the target directory indicated by the path information carried by the reconciliation request to obtain the reconciliation file, where the reconciliation file includes one of the following: the metadata under the target directory, or a data list corresponding to the metadata under the target directory; send the reconciliation file to the force storage platform, and instruct the force storage platform to screen out, through a retrieval system, a target file corresponding to the reconciliation file and to perform reconciliation according to the reconciliation file and the target file, where the target file is used to store the metadata change data stored by the force storage platform; upon receiving a reconciliation end message sent by the force storage platform, acquire a reconciliation difference file from the force storage platform, where the reconciliation difference file is used to indicate target metadata that differs between the reconciliation file and the target file; determine a difference category corresponding to the target metadata according to a keyword of the reconciliation difference file, where the difference category includes: the data values of the metadata differ, or the metadata does not exist; determine a repair mode for the target metadata according to the difference category, and upload a difference repair message to the force storage platform according to the repair mode so as to repair the target metadata; and verify the repaired target metadata once the uploading of the difference repair message is determined to be complete.
First, the metadata under the target directory indicated by the path information carried by the reconciliation request is queried to obtain the reconciliation file; the reconciliation file may contain either all metadata stored under the target directory or a data list of that metadata. The reconciliation file is sent to the force storage platform, which is instructed to screen out, through a metadata retrieval system (i.e., the retrieval system), the files that meet the requirements of the reconciliation message (i.e., the target files) and then to complete reconciliation according to the reconciliation file and the target files. After the storage vendor finishes uploading the reconciliation file, it uploads a reconciliation completion flag file to indicate that the force storage platform can acquire the uploaded file. The flag file name has the following format: storage cluster identifier-file system identifier-metadata type-timestamp (taken from RECONCILE_TIMESTAMP in the reconciliation message)-dz.dat.success.
After that, the force storage platform performs subsequent audit processing according to the flag file, finally generates a difference reconciliation file (i.e., the reconciliation difference file) and a reconciliation end flag file, and then notifies the metadata side to confirm. The storage vendor collects the reconciliation difference file upon the presence of the reconciliation end flag file (or upon receiving the reconciliation completion notification message) to prepare for subsequent metadata repair. The format of the reconciliation difference file is basically consistent with that of the reconciliation file, and differs in that the first field in the first-row field list is (DIFF_TYPE), a Y or N mark (i.e., the keyword) is added before the original record in each content row, and the fields are separated by '@'. Y indicates that the metadata exists on both sides (the storage vendor and the force storage platform) but the key values differ (the recorded value is the storage vendor side's metadata); N indicates that the metadata does not exist on one side, either the force storage platform or the storage vendor side. The storage vendor resends messages to the force storage platform according to the difference category indicated in the reconciliation difference file, and after uploading, performs a secondary check on the metadata corresponding to the difference repair message to ensure that the errors do not recur.
Through this embodiment, after the metadata change data is reported, it is checked by a strict audit-based reconciliation, and any difference data found is repaired in time, which ensures the timeliness and validity of the metadata change data stored by the force storage platform.
Optionally, the screening the target file corresponding to the reconciliation file by the search system includes one of the following steps: recursively traversing the force storage platform in a multithreading mode to screen out the target file; and screening the target file through the metadata index carried by the reconciliation file.
The metadata retrieval system screens out the target file in one of the following 2 ways: retrieval by recursive traversal using multithreaded stat, or retrieval through a metadata index provided by the MDS itself.
Optionally, the generating module 64 is further configured to execute the persistence operation on a preset number of metadata change data in the metadata log according to a third preset period, where the preset number of metadata change data are the metadata change data with the earliest generation times in the metadata log.
When the MDS persists metadata to the MDLOG, a certain number of entries are accumulated before being persisted to disk in order to reduce the number of disk operations, and persistence in turn triggers metadata reporting. To avoid delayed reporting of important data changes, the embodiment of the application enables parameter control so that a fixed number of MDLOG log entries (i.e., the preset number) are persisted every 5 seconds (i.e., the third preset period; the value is only an example), instead of starting reporting only when the accumulated amount reaches the maximum value as in the prior art.
Through this embodiment, after the metadata of the file system changes, the changed content is sent to the force storage platform via kafka within a short time, which ensures the timeliness of data reporting.
It should be noted that each of the above modules may be implemented by software or hardware; for the latter, this may be done in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform, when run, the steps of any one of the above method embodiments.
In one exemplary embodiment, the above-described computer-readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
An embodiment of the application also provides an electronic device comprising a memory and a processor, the memory having a computer program stored therein, and the processor being arranged to run the computer program to perform the steps of any one of the above method embodiments.
In an exemplary embodiment, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described in the various embodiments of the application.
For specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, which are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented on a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of multiple computing devices; they may be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described may be performed in an order different from that shown or described herein; or they may be fabricated separately into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present application and is not intended to limit the application; various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent substitution, improvement, etc. made within the principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A reporting method of metadata change data is characterized in that,
Comprising the following steps:
generating metadata change data under the condition that the metadata is detected to change, and storing the metadata change data into a metadata log;
Generating a synchronous log asynchronously according to the metadata change data under the condition that the metadata log is detected to execute a persistence operation, wherein the persistence operation is used for persistence of the metadata change data;
And reporting the metadata change data in the queue to be processed of the synchronous log to a target server according to a first preset period, so as to report the metadata change data to a force storage platform through the target server, wherein the force storage platform is used for managing the metadata change data.
2. The method of claim 1, characterized in that,
the asynchronously generating a synchronous log according to the metadata change data under the condition that the metadata log is detected to execute the persistence operation comprises:
Determining whether the persistence operation is successfully executed and whether the synchronization log is successfully generated;
Under the condition that the persistent operation is determined to be failed to be executed, determining that the synchronous log is failed to be generated, and determining that the metadata change data record is failed;
under the condition that the generation of the synchronous log is determined to be failed, determining that the execution of the persistence operation is failed, and determining that the metadata change data record is failed;
And determining that the metadata change data record is successful in the case that the persistence operation is successfully executed and the synchronous log generation is successful.
3. The method of claim 1, characterized in that,
the reporting the metadata change data in the queue to be processed of the synchronous log to a target server according to a first preset period, so as to report the metadata change data to a force storage platform through the target server, comprises:
determining a target partition in a plurality of partitions of the target server according to the data sources of the metadata change data;
Reporting the metadata change data to the target partition according to the queue sequence indicated by the queue to be processed;
and respectively reporting the metadata change data in the multiple partitions to multiple metadata databases of the force storage platform through the target server according to a second preset period, wherein the multiple partitions are in one-to-one correspondence with the multiple metadata databases.
4. The method of claim 1, characterized in that,
after the reporting the metadata change data in the queue to be processed of the synchronous log to a target server according to a first preset period, so as to report the metadata change data to a force storage platform through the target server, the method further comprises:
receiving a checking request of the force storage platform, wherein the checking request is used for checking whether metadata change data received by the force storage platform are correct or not;
and responding to the reconciliation request by sending a reconciliation file corresponding to the reconciliation request to the force storage platform, so as to perform reconciliation through the reconciliation file.
5. The method of claim 4, characterized in that,
the responding to the reconciliation request by sending a reconciliation file corresponding to the reconciliation request to the force storage platform, so as to perform reconciliation through the reconciliation file, comprises:
querying, according to path information carried in the reconciliation request, the metadata under a target directory indicated by the path information to obtain the reconciliation file, wherein the reconciliation file comprises one of the following: the metadata under the target directory, or a data list corresponding to the metadata under the target directory;
sending the reconciliation file to the force storage platform, and instructing the force storage platform to screen out, through a retrieval system, a target file corresponding to the reconciliation file and to perform reconciliation according to the reconciliation file and the target file, wherein the target file stores the metadata change data held by the force storage platform;
in a case where a reconciliation end message sent by the force storage platform is received, acquiring a reconciliation difference file from the force storage platform, wherein the reconciliation difference file indicates target metadata that differs between the reconciliation file and the target file;
determining a difference category corresponding to the target metadata according to keywords in the reconciliation difference file, wherein the difference category comprises: the data values of the metadata differ, and the metadata does not exist;
determining a repair mode for the target metadata according to the difference category, and uploading a difference repair message to the force storage platform according to the repair mode, so as to repair the target metadata;
in a case where uploading of the difference repair message is determined to be complete, verifying the repaired target metadata.
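The reconciliation steps of claim 5 can be illustrated with a minimal Python sketch. It is not the claimed implementation: in the claim the comparison runs on the force storage platform and the storage side reads a difference file, whereas the sketch collapses both sides into one process. The names `diff_metadata`, `repair_message`, the category constants, and the `create`/`update` repair actions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical labels for the two difference categories named in the claim.
VALUE_MISMATCH = "value_mismatch"  # data values of the metadata differ
MISSING = "missing"                # metadata does not exist on the platform


@dataclass(frozen=True)
class Difference:
    path: str
    category: str


def diff_metadata(reconciliation: Dict[str, dict],
                  target: Dict[str, dict]) -> List[Difference]:
    """Compare the reconciliation file with the platform's target file."""
    diffs = []
    for path, meta in sorted(reconciliation.items()):
        if path not in target:
            diffs.append(Difference(path, MISSING))
        elif target[path] != meta:
            diffs.append(Difference(path, VALUE_MISMATCH))
    return diffs


def repair_message(diff: Difference,
                   reconciliation: Dict[str, dict]) -> dict:
    """Pick a repair mode from the difference category (assumed mapping)."""
    action = "create" if diff.category == MISSING else "update"
    return {"action": action, "path": diff.path,
            "metadata": reconciliation[diff.path]}
```

For example, if the local metadata holds `/a`, `/b`, and `/c` while the platform holds a matching `/a` and a stale `/b`, `diff_metadata` yields a value mismatch for `/b` and a missing entry for `/c`, and `repair_message` turns them into an `update` and a `create` message respectively.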
6. The method of claim 5, characterized in that,
screening out, through the retrieval system, the target file corresponding to the reconciliation file comprises one of the following:
recursively traversing the force storage platform in a multithreaded manner to screen out the target file;
screening out the target file through a metadata index carried in the reconciliation file.
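The two screening options of claim 6 can be sketched as follows, treating the platform's storage as a local directory tree purely for illustration. The function names, the choice of one thread per top-level subdirectory, and the dictionary used as a metadata index are assumptions, not details from the claim.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


def _walk_for(root: str, name: str) -> List[str]:
    """Recursively collect every file called `name` under `root`."""
    return [os.path.join(d, name)
            for d, _dirs, files in os.walk(root) if name in files]


def find_by_traversal(root: str, name: str, workers: int = 4) -> List[str]:
    """Option 1: recursive traversal, one worker task per top-level subdirectory."""
    hits: List[str] = []
    subdirs: List[str] = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            elif entry.name == name:
                hits.append(entry.path)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(lambda d: _walk_for(d, name), subdirs):
            hits.extend(found)
    return sorted(hits)


def find_by_index(index: Dict[str, str], key: str) -> Optional[str]:
    """Option 2: direct lookup through a metadata index (here a plain dict)."""
    return index.get(key)
```

The traversal needs no prior bookkeeping but touches every directory, while the index lookup is a single dictionary access that depends on the index being kept current.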
7. The method of claim 1, characterized in that,
before asynchronously generating the synchronization log according to the metadata change data, the method further comprises:
executing, according to a third preset period, the persistence operation on a preset number of metadata change data in the metadata log, wherein the preset number of metadata change data are the metadata change data generated earliest in the metadata log.
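A minimal sketch of the batching in claim 7, assuming the metadata log is an in-memory FIFO and that some external scheduler calls `flush_oldest` once per third preset period. The class name `MetadataLog`, the `persist` callback, and the method names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class MetadataLog:
    """In-memory log whose oldest entries are persisted in fixed-size batches."""

    def __init__(self, persist: Callable[[List[dict]], None]) -> None:
        self._pending: Deque[dict] = deque()
        self._persist = persist  # assumed callback that writes a batch durably

    def append(self, change: dict) -> None:
        """Record a new metadata change at the tail of the log."""
        self._pending.append(change)

    def flush_oldest(self, count: int) -> List[dict]:
        """Persist up to `count` of the earliest-generated changes."""
        size = min(count, len(self._pending))
        batch = [self._pending.popleft() for _ in range(size)]
        if batch:
            self._persist(batch)
        return batch
```

Because entries are appended at the tail and removed from the head, each flush takes the changes with the earliest generation time, which matches the ordering the claim describes.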
8. A device for reporting metadata change data, characterized by
comprising:
a storage module, configured to generate metadata change data and store the metadata change data in a metadata log in a case where a metadata change is detected;
a generation module, configured to asynchronously generate a synchronization log according to the metadata change data in a case where the metadata log is detected to have executed a persistence operation, wherein the persistence operation is used to persist the metadata change data;
and a reporting module, configured to report, according to a first preset period, the plurality of metadata change data in the queue to be processed of the synchronization log to a target server, so that the plurality of metadata change data are reported to a force storage platform through the target server, wherein the force storage platform is used to manage the metadata change data.
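The hand-off between the generation module and the reporting module in claim 8 might look like the sketch below. It assumes the queue to be processed is a thread-safe `queue.Queue`, that `on_persisted` is invoked once a persistence operation completes, and that a timer calls `report_once` every first preset period. `SyncLog`, `on_persisted`, `report_once`, and the `send` callback standing in for the target server are hypothetical names.

```python
import queue
from typing import Callable, List


class SyncLog:
    """Holds persisted metadata changes that still await reporting."""

    def __init__(self) -> None:
        self.pending: "queue.Queue[dict]" = queue.Queue()

    def on_persisted(self, changes: List[dict]) -> None:
        """Enqueue changes after the metadata log has persisted them."""
        for change in changes:
            self.pending.put(change)


def report_once(sync_log: SyncLog,
                send: Callable[[List[dict]], None]) -> int:
    """Drain the pending queue and send one batch to the target server."""
    batch: List[dict] = []
    while True:
        try:
            batch.append(sync_log.pending.get_nowait())
        except queue.Empty:
            break
    if batch:
        send(batch)
    return len(batch)
```

Sending one batch per period rather than one message per change keeps the number of requests to the target server bounded by the reporting period instead of by the rate of metadata changes.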
9. A computer-readable storage medium, characterized in that,
the computer-readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the steps of the method as claimed in any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that,
The processor, when executing the computer program, implements the steps of the method as claimed in any one of claims 1 to 7.
CN202410424502.0A 2024-04-09 2024-04-09 Metadata change data reporting method and device, storage medium and electronic device Pending CN118227572A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410424502.0A CN118227572A (en) 2024-04-09 2024-04-09 Metadata change data reporting method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410424502.0A CN118227572A (en) 2024-04-09 2024-04-09 Metadata change data reporting method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN118227572A (en) 2024-06-21

Family

ID=91497801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410424502.0A Pending CN118227572A (en) 2024-04-09 2024-04-09 Metadata change data reporting method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN118227572A (en)

Similar Documents

Publication Publication Date Title
US11899684B2 (en) System and method for maintaining a master replica for reads and writes in a data store
US11388043B2 (en) System and method for data replication using a single master failover protocol
US10929240B2 (en) System and method for adjusting membership of a data replication group
US10248704B2 (en) System and method for log conflict detection and resolution in a data store
US20150120658A1 (en) System and method for splitting a replicated data partition
US11188423B2 (en) Data processing apparatus and method
EP1654683A1 (en) Automatic and dynamic provisioning of databases
CN112506870B (en) Data warehouse increment updating method and device and computer equipment
CN104584524A (en) Aggregating data in a mediation system
US20180018363A1 (en) Time series data processing method and apparatus
CN108228432A (en) A kind of distributed link tracking, analysis method and server, global scheduler
CN112650629B (en) Block chain index data recovery method, device, equipment and computer storage medium
CN112417050A (en) Data synchronization method and device, system, storage medium and electronic device
CN108829735B (en) Synchronization method, device, server and storage medium for parallel execution plan
CN118227572A (en) Metadata change data reporting method and device, storage medium and electronic device
CN104765748A (en) Method and device for converting copying table into slicing table
CN113468154A (en) MySQL-based large table volume reduction method and device, electronic device and storage medium
CN114691700A (en) Kafaka cluster-based intelligent park retrieval method
CN113032477A (en) Long-distance data synchronization method and device based on GTID and computing equipment
CN110677497A (en) Network medium distribution method and device
CN111143280B (en) Data scheduling method, system, device and storage medium
CN113472808B (en) Log processing method and device, storage medium and electronic device
CN103856359A (en) Method and system for obtaining information
CN112564953B (en) Method, device and equipment for managing remote equipment of office
CN115086414A (en) Message processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination