CN106919679B - Log replay method, device and terminal applied to distributed file system - Google Patents

Info

Publication number
CN106919679B
Authority
CN
China
Prior art keywords
record
records
log
threads
replay
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710109052.6A
Other languages
Chinese (zh)
Other versions
CN106919679A (en)
Inventor
张震
周应超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201710109052.6A priority Critical patent/CN106919679B/en
Publication of CN106919679A publication Critical patent/CN106919679A/en
Publication of CN106919679B publication Critical patent/CN106919679B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a log replay method, a log replay device and a terminal applied to a distributed file system. The method comprises: dividing the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; determining the number of threads for log replay according to the number of classes; and performing log replay with the determined number of threads, each thread replaying the class of records that corresponds to it one-to-one. The records can thus be classified according to their file path prefixes, which preserves the dependency relationship among the records within each class. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, without breaking the ordering among dependent records in the edit log.

Description

Log replay method, device and terminal applied to distributed file system
Technical Field
The present disclosure relates to the field of distributed file processing technologies, and in particular to a log replay method, a log replay device and a terminal applied to a distributed file system.
Background
Distributed file systems typically guarantee the consistency of file system metadata through a transaction mechanism. The mechanism is usually implemented by journaling: a modification to file system metadata is first recorded in an edit log, and the actual (in-place) metadata modification is performed afterwards. This ensures that any modification to the metadata is either completed in full or not performed at all, which keeps the file system metadata consistent. When the system goes down (for example because of a power failure or a software error) and is later restarted, the records in the edit log that were never applied, or were only partially applied, must be applied in place again (redo log). This process is called recovery of the file system, that is, log replay.
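The log-first, apply-second ordering described above can be illustrated with a minimal Python sketch. The class `JournaledMetadata`, its `applied` counter and the `crash_before_apply` flag are hypothetical names introduced only for this illustration; a real file system would keep the edit log and the applied position on durable storage.

```python
class JournaledMetadata:
    """Toy model of journaling: log the change first, then apply it in place."""

    def __init__(self):
        self.edit_log = []   # stands in for the durable edit log
        self.metadata = {}   # stands in for the in-place metadata
        self.applied = 0     # number of log records already applied in place

    def modify(self, key, value, crash_before_apply=False):
        self.edit_log.append((key, value))   # 1. record the change in the edit log
        if crash_before_apply:
            return                           # simulated crash: the in-place update is lost
        self._apply(key, value)              # 2. perform the in-place update

    def _apply(self, key, value):
        self.metadata[key] = value
        self.applied += 1

    def recover(self):
        """Log replay: redo every logged record that was not applied in place."""
        for key, value in self.edit_log[self.applied:]:
            self._apply(key, value)


fs = JournaledMetadata()
fs.modify("/tmp/file1", "created")
fs.modify("/tmp/file1", "mode=rw", crash_before_apply=True)
before_recovery = dict(fs.metadata)
fs.recover()
after_recovery = dict(fs.metadata)
```

In this sketch the second modification exists only in the log until `recover()` redoes it.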
In the related art, records in the edit log often depend on one another. For example, suppose the edit log contains four records: (1) create the file /tmp/file1; (2) allocate a data block for the file /user/xyz/file; (3) modify the read-write permission of the file /tmp/file1; (4) set the length and last modification time of the file /user/xyz/file. Of these four records, record 3 depends on record 1 and record 4 depends on record 2; that is, record 3 must wait until the in-place update of record 1 is complete, and record 4 must wait until the in-place update of record 2 is complete, before performing its own in-place update. Therefore, in the prior art, log replay is completed in a single-threaded manner: one thread performs the in-place update of each record in sequence.
However, because a single thread performs the in-place updates one record at a time in sequence, log replay in the related art is very slow, the replay process is long, and the time during which the distributed file system is unavailable is prolonged.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a log replay method, a log replay device and a terminal applied to a distributed file system, which address the problem that, in the prior art, a single thread performs in-place updates on the records one at a time in sequence, so that log replay is very slow, the replay process is long, and the unavailable time of the distributed file system is prolonged.
According to a first aspect of the embodiments of the present disclosure, there is provided a log replay method applied to a distributed file system, including:
dividing the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship;
determining the number of threads for log replay according to the number of classes into which the records are divided;
and performing log replay with the determined number of threads, each thread replaying the class of records that corresponds to it one-to-one.
Further, dividing the records into at least one class according to the file identifier of each record in the edit log includes:
determining each synchronization point in the edit log according to the records in the edit log;
determining the records between adjacent synchronization points as records to be processed simultaneously;
and dividing the records to be processed simultaneously into at least one class according to their file identifiers.
Further, determining the number of threads for log replay according to the number of classes includes:
determining, for each pair of adjacent synchronization points, the number of threads according to the classes of records between those synchronization points.
Further, performing log replay with the determined number of threads includes:
repeating the following process until log replay is completed for all records in the edit log:
performing log replay on the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying its corresponding class one-to-one;
and performing log replay on the synchronization point.
Further, the synchronization point is a rename operation in the edit log.
Further, the file identifier is a file path prefix.
The technical solution provided by the embodiments of the present disclosure may have the following beneficial effects. The records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; the number of threads for log replay is determined according to the number of classes; and log replay is performed with the determined number of threads, each thread replaying the class of records corresponding to it one-to-one. Because the records are classified according to their file path prefixes, the dependency relationship among the records within each class is preserved. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, while the ordering among dependent records in the edit log is not broken.
According to a second aspect of the embodiments of the present disclosure, there is provided a log replay device applied to a distributed file system, including:
a classification module configured to divide the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship;
a determining module configured to determine the number of threads for log replay according to the number of classes into which the records are divided;
and a replay module configured to perform log replay with the determined number of threads, each thread replaying the class of records that corresponds to it one-to-one.
Further, the classification module is specifically configured to:
determine each synchronization point in the edit log according to the records in the edit log;
determine the records between adjacent synchronization points as records to be processed simultaneously;
and divide the records to be processed simultaneously into at least one class according to their file identifiers.
Further, the determining module is specifically configured to:
determine, for each pair of adjacent synchronization points, the number of threads according to the classes of records between those synchronization points.
Further, the replay module is specifically configured to:
repeat the following process until log replay is completed for all records in the edit log:
perform log replay on the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying its corresponding class one-to-one;
and perform log replay on the synchronization point.
Further, the synchronization point is a rename operation in the edit log.
Further, the file identifier is a file path prefix.
The technical solution provided by the embodiments of the present disclosure may have the following beneficial effects. The records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; the number of threads for log replay is determined according to the number of classes; and log replay is performed with the determined number of threads, each thread replaying the class of records corresponding to it one-to-one. Because the records are classified according to their file path prefixes, the dependency relationship among the records within each class is preserved. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, while the ordering among dependent records in the edit log is not broken.
According to a third aspect of the embodiments of the present disclosure, there is provided a terminal, including:
a processor, and a memory for storing processor-executable instructions;
wherein the processor is configured to: divide the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; determine the number of threads for log replay according to the number of classes into which the records are divided; and perform log replay with the determined number of threads, each thread replaying the class of records that corresponds to it one-to-one.
The technical solution provided by the embodiments of the present disclosure may have the following beneficial effects. The records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; the number of threads for log replay is determined according to the number of classes; and log replay is performed with the determined number of threads, each thread replaying the class of records corresponding to it one-to-one. Because the records are classified according to their file path prefixes, the dependency relationship among the records within each class is preserved. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, while the ordering among dependent records in the edit log is not broken.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flowchart illustrating a first embodiment of a log replay method applied to a distributed file system, according to an illustrative embodiment;
FIG. 2 is a flowchart illustrating a second embodiment of a log replay method applied to a distributed file system according to an exemplary embodiment;
FIG. 3 is a block diagram illustrating a third embodiment of a log replay device applied to a distributed file system, according to an illustrative embodiment;
FIG. 4 is a block diagram illustrating entities of a terminal in accordance with an exemplary embodiment;
Fig. 5 is a block diagram illustrating a terminal device 800 according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
FIG. 1 is a flowchart illustrating a first embodiment of a log replay method applied to a distributed file system according to an exemplary embodiment. As shown in FIG. 1, the method is used in a terminal and includes the following steps.
In step S11, the records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship.
The file identifier is a file path prefix.
In this step, the file identifier of each record in the edit log is determined; a file path prefix may be used as the file identifier. The records can then be divided into several classes according to their file identifiers, and the records within each class have a dependency relationship.
Specifically, the edit log contains records that depend on one another, and when the log is replayed concurrently by multiple threads, the replay order of the dependent records must still be guaranteed. Suppose the edit log contains four records: (1) create the file /tmp/file1; (2) allocate a data block for the file /user/xyz/file; (3) modify the read-write permission of the file /tmp/file1; (4) set the length and last modification time of the file /user/xyz/file. Record 3 must be replayed after record 1, and record 4 must be replayed after record 2, so log replay has to guarantee both orderings. With a single thread, replaying in the order record 1, record 2, record 3, record 4 guarantees both. However, when the log is naively replayed with two or more threads and records 1 and 3 are assigned to different threads, the scheduling and execution order of those threads cannot be controlled, so there is no guarantee that record 1 is always replayed before record 3; likewise, there is no guarantee that record 2 is always replayed before record 4.
Records with the same file path prefix depend on one another. The records in the edit log can therefore be classified by file path prefix: records with the same prefix are placed in the same class, records in classes with different prefixes do not depend on one another, and different classes can be replayed by different threads.
For example, suppose a log contains the following records: (1) create the directory /user/a; (2) allocate a data block to the file /tmp/file; (3) create the file /user/a/file; (4) modify the length of the file /tmp/file; (5) allocate a data block to the file /user/a/file; (6) modify the owner of the file /tmp/file; (7) modify the length of the file /user/a/file. There are two file path prefixes, "/user/a" and "/tmp/file", so the records can be divided into two classes: the first class comprises records 1, 3, 5 and 7, and the second class comprises records 2, 4 and 6. The records within each class depend on one another, and the two classes are independent of each other.
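The classification in this example can be sketched in Python as follows. The text does not say how a file path prefix is derived from a path; the helper `path_prefix`, which keeps the first two path components, is an assumption chosen only because it reproduces the two prefixes named in the example. The function and variable names are likewise hypothetical.

```python
from collections import defaultdict


def path_prefix(path, depth=2):
    """Assumed prefix rule: keep the first `depth` components of the path."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts[:depth])


def classify(records):
    """Group (record_number, path) pairs by path prefix, preserving log order."""
    classes = defaultdict(list)
    for number, path in records:
        classes[path_prefix(path)].append(number)
    return dict(classes)


# The seven records of the example, as (record number, file path).
records = [
    (1, "/user/a"), (2, "/tmp/file"), (3, "/user/a/file"), (4, "/tmp/file"),
    (5, "/user/a/file"), (6, "/tmp/file"), (7, "/user/a/file"),
]
classes = classify(records)
```

Because the records are appended in log order, each class keeps the relative order its records had in the edit log.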
In step S12, the number of threads for log replay is determined according to the number of classes into which the records are divided.
In this step, the number of threads for log replay is set according to the number of classes obtained in step S11.
For example, the seven-record log described in step S11 has two file path prefixes, "/user/a" and "/tmp/file", so its records are divided into two classes: records 1, 3, 5 and 7, and records 2, 4 and 6. The records within each class depend on one another, and the two classes are independent of each other. Since the records are divided into two classes, the number of threads for log replay can be determined to be 2.
In step S13, log replay is performed with the determined number of threads, each thread replaying the class of records that corresponds to it one-to-one.
In this step, the determined number of threads is used, and each thread replays the class of records corresponding to it. Within each thread, the records are replayed in the order in which they appear in the edit log, which preserves the ordering required by the dependency relationship.
For example, when the records are divided into two classes and the number of threads is determined to be 2, two different threads are used for replay. The first thread replays records 1, 3, 5 and 7 of the first class in sequence, while the second thread simultaneously replays records 2, 4 and 6 of the second class in sequence.
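A minimal sketch of this one-thread-per-class replay, using Python's standard `threading` module, is given below. The function `apply_record` is a hypothetical stand-in for the real in-place metadata update; here it only records the order in which records were applied.

```python
import threading


def replay_classes(classes, apply_record):
    """Start one thread per class; each thread replays its records in log order."""
    def worker(record_numbers):
        for number in record_numbers:
            apply_record(number)

    threads = [threading.Thread(target=worker, args=(numbers,))
               for numbers in classes.values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(threads)


applied = []
lock = threading.Lock()


def apply_record(number):
    # Stand-in for the in-place update (an assumption for this sketch).
    with lock:
        applied.append(number)


classes = {"/user/a": [1, 3, 5, 7], "/tmp/file": [2, 4, 6]}
thread_count = replay_classes(classes, apply_record)
```

The interleaving between the two threads is not deterministic, but the relative order inside each class is, which is the property the dependency relationship requires.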
In this embodiment, the records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have a dependency relationship; the number of threads for log replay is determined according to the number of classes; and log replay is performed with the determined number of threads, each thread replaying the class of records corresponding to it one-to-one. Because the records are classified according to their file path prefixes, the dependency relationship among the records within each class is preserved. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, while the ordering among dependent records in the edit log is not broken.
Based on the embodiment shown in FIG. 1, FIG. 2 is a flowchart of a second embodiment of a log replay method applied to a distributed file system according to an exemplary embodiment. As shown in FIG. 2, step S11 specifically includes:
determining each synchronization point in the edit log according to the records in the edit log; determining the records between adjacent synchronization points as records to be processed simultaneously; and dividing the records to be processed simultaneously into at least one class according to their file identifiers.
The synchronization point is a rename operation in the edit log.
In this step, each synchronization point in the edit log is determined according to the records in the edit log, where a synchronization point may be a rename operation. The records between adjacent synchronization points are then determined as records to be processed simultaneously, and those records are divided into at least one class according to their file identifiers.
Specifically, a rename operation in the file system involves two file paths, a source path and a target path, so records containing a rename cannot simply be handled by the scheme of the first embodiment. In this embodiment the rename operation is used as a synchronization point: the records between two rename operations are replayed in the manner of the first embodiment, then the rename operation is replayed, and then the following records are replayed in the same manner. For example, assume the edit log contains the following records: (1) create the file /user/xyz/file; (2) create the file /tmp/file; (3) rename /user/xyz/abc to /tmp/abc; (4) modify the access rights of /user/xyz/file; (5) modify the owner of the file /tmp/file; (6) edit the data of /user/xyz/file; (7) rename /a/b/c to /d/e/f; (8) allocate a new data block to the file /user/xyc/file; (9) delete the file /tmp/file. Records 3 and 7 are identified as synchronization points; records 1 and 2 are processed simultaneously, records 4, 5 and 6 are processed simultaneously, and records 8 and 9 are processed simultaneously. Based on the file path prefix, records 4 and 6 fall into the same class.
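Splitting the edit log at rename operations can be sketched as follows. The record layout `(number, operation, path)` and the operation names are assumptions made for this illustration; for a rename, only the source path is stored because the sketch does not need the target.

```python
def split_at_sync_points(records):
    """Split the log into segments; every rename record forms its own segment."""
    segments, current = [], []
    for record in records:
        _, operation, _ = record
        if operation == "rename":
            if current:
                segments.append(current)
                current = []
            segments.append([record])   # the synchronization point, replayed alone
        else:
            current.append(record)
    if current:
        segments.append(current)
    return segments


# The nine records of the example, as (record number, operation, path).
edit_log = [
    (1, "create", "/user/xyz/file"),
    (2, "create", "/tmp/file"),
    (3, "rename", "/user/xyz/abc"),
    (4, "chmod", "/user/xyz/file"),
    (5, "chown", "/tmp/file"),
    (6, "edit", "/user/xyz/file"),
    (7, "rename", "/a/b/c"),
    (8, "allocate", "/user/xyc/file"),
    (9, "delete", "/tmp/file"),
]
segments = split_at_sync_points(edit_log)
segment_numbers = [[record[0] for record in segment] for segment in segments]
```

Each non-rename segment can then be classified by file path prefix exactly as in the first embodiment.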
Step S12 specifically includes:
determining, for each pair of adjacent synchronization points, the number of threads according to the classes of records between those synchronization points.
In this step, the number of threads used between each pair of adjacent synchronization points is determined from the classes of records that lie between those synchronization points.
For example, in the nine-record edit log above, records 3 and 7 are synchronization points; records 1 and 2, records 4, 5 and 6, and records 8 and 9 are each processed simultaneously; and records 4 and 6 fall into the same class based on their file path prefix. It can then be determined that: for records 1 and 2, two threads are used, one per record; for record 3, one thread is used; for records 4, 5 and 6, two threads are used, one processing records 4 and 6 and the other processing record 5; for record 7, one thread is used; and for records 8 and 9, two threads are used, one per record. A total of 8 threads can thus be determined.
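The per-segment thread counts in this example can be checked with a short Python sketch. The prefix rule in `path_prefix` (first two path components) is an assumption, since the text does not define how the prefix is derived, and the `("records" | "sync", paths)` segment layout is hypothetical.

```python
def path_prefix(path, depth=2):
    """Assumed prefix rule: keep the first `depth` components of the path."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts[:depth])


def threads_per_segment(segments):
    """One thread per class in an ordinary segment, one thread per sync point."""
    counts = []
    for kind, paths in segments:
        if kind == "sync":
            counts.append(1)
        else:
            counts.append(len({path_prefix(p) for p in paths}))
    return counts


segments = [
    ("records", ["/user/xyz/file", "/tmp/file"]),                    # records 1, 2
    ("sync", ["/user/xyz/abc"]),                                     # record 3 (rename)
    ("records", ["/user/xyz/file", "/tmp/file", "/user/xyz/file"]),  # records 4, 5, 6
    ("sync", ["/a/b/c"]),                                            # record 7 (rename)
    ("records", ["/user/xyc/file", "/tmp/file"]),                    # records 8, 9
]
counts = threads_per_segment(segments)
total_threads = sum(counts)
```

The total simply adds up the threads used in each segment; no more than two of them run at the same time in this example.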
Step S13 specifically includes:
repeating the following process until log replay is completed for all records in the edit log: performing log replay on the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying its corresponding class one-to-one; and performing log replay on the synchronization point.
In this step, the threads determined for a pair of adjacent synchronization points replay the classes of records between those synchronization points, one class per thread; the synchronization point is then replayed. This process is repeated until all records in the edit log have been replayed.
For example, take the nine-record edit log above, with synchronization points at records 3 and 7 and the thread assignment determined in step S12.
The replay then proceeds as follows. In the first step, since record 3 is a rename operation and therefore a synchronization point, the records before it, records 1 and 2, are replayed by two separate threads. In the second step, record 3 is replayed. In the third step, since record 7 is a rename operation and therefore a synchronization point, and records 4 and 6 belong to one class while record 5 belongs to another, the records before record 7 are replayed by two threads: one replays records 4 and 6 in order and the other replays record 5. In the fourth step, record 7 is replayed. In the fifth step, records 8 and 9 are replayed by two separate threads. The whole process takes 5 steps, whereas single-threaded replay would take 9.
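The five-step replay can be sketched end to end in Python with `concurrent.futures.ThreadPoolExecutor`. This is an illustration under stated assumptions rather than the patented implementation: the record layout `(number, operation, path)`, the prefix rule in `path_prefix` (first two path components) and the callback `apply_record` are all hypothetical.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def path_prefix(path, depth=2):
    """Assumed prefix rule: keep the first `depth` components of the path."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts[:depth])


def replay(edit_log, apply_record):
    """Replay segment by segment; return the number of steps taken."""
    steps = 0
    segment = []

    def replay_class(class_records):
        for record in class_records:      # log order within one class
            apply_record(record)

    def flush():
        nonlocal steps
        if not segment:
            return
        classes = defaultdict(list)
        for record in segment:
            classes[path_prefix(record[2])].append(record)
        # One worker per class; leaving the `with` block waits for all of them.
        with ThreadPoolExecutor(max_workers=len(classes)) as pool:
            futures = [pool.submit(replay_class, recs) for recs in classes.values()]
            for future in futures:
                future.result()           # re-raise any worker exception
        segment.clear()
        steps += 1

    for record in edit_log:
        if record[1] == "rename":
            flush()                       # finish everything before the sync point
            apply_record(record)          # the sync point is replayed on its own
            steps += 1
        else:
            segment.append(record)
    flush()
    return steps


applied, lock = [], threading.Lock()


def apply_record(record):
    with lock:                            # stand-in for the in-place update
        applied.append(record[0])


edit_log = [
    (1, "create", "/user/xyz/file"), (2, "create", "/tmp/file"),
    (3, "rename", "/user/xyz/abc"), (4, "chmod", "/user/xyz/file"),
    (5, "chown", "/tmp/file"), (6, "edit", "/user/xyz/file"),
    (7, "rename", "/a/b/c"), (8, "allocate", "/user/xyc/file"),
    (9, "delete", "/tmp/file"),
]
steps = replay(edit_log, apply_record)
```

Waiting for every worker before replaying a rename is what makes the rename act as a synchronization point in this sketch.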
In this embodiment, each synchronization point in the edit log is determined according to the records in the edit log; the records between adjacent synchronization points are determined as records to be processed simultaneously; and those records are divided into at least one class according to their file identifiers. For each pair of adjacent synchronization points, the number of threads is determined according to the classes of records between them. The following process is repeated until log replay is completed for all records in the edit log: the classes of records between adjacent synchronization points are replayed with the number of threads determined for those synchronization points, each thread replaying its corresponding class one-to-one, and the synchronization point is then replayed. Because the records between adjacent rename operations are treated as records to be processed simultaneously and are then classified according to their file path prefixes, the dependency relationship among the records within each class is preserved. The classes can then be replayed by multiple threads, which increases the log replay speed and shortens the time during which the distributed file system is unavailable, while the ordering among dependent records in the edit log is not broken.
Fig. 3 is a block diagram illustrating a third embodiment of a log replay apparatus applied to a distributed file system, according to an exemplary embodiment. Referring to Fig. 3, the apparatus includes:
a classification module 31 configured to divide the records in the edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another;
a determining module 32 configured to determine the number of threads used for log replay according to the number of classes obtained; and
a replay module 33 configured to replay the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one.
The file identifier is a file path prefix.
The classification module 31 is specifically configured to:
determine the synchronization points in the edit log according to the records in the edit log;
determine the records between adjacent synchronization points as records to be processed simultaneously; and
divide the records determined to be processed simultaneously into at least one class according to their file identifiers.
The determining module 32 is specifically configured to:
determine, according to the classes of records between each pair of adjacent synchronization points, the number of threads to be used between those synchronization points.
The replay module 33 is specifically configured to:
repeat the following process until all records in the edit log have been replayed:
replay the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying the class that corresponds to it one to one; and
replay the synchronization point.
The synchronization point is a rename operation in the edit log.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
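The division of work among the three modules can be pictured with the following sketch of the basic case without synchronization points. The class names, the `depth` parameter and the dictionary record layout are assumptions made for illustration only; the patent does not specify how the modules are implemented.

```python
import threading


class ClassificationModule:
    """Stand-in for classification module 31: groups records by file path prefix."""

    def __init__(self, depth=1):
        self.depth = depth              # assumed prefix length in path components

    def classify(self, records):
        classes = {}
        for record in records:
            parts = [p for p in record["path"].split("/") if p]
            prefix = "/" + "/".join(parts[:self.depth])
            classes.setdefault(prefix, []).append(record)
        return classes


class DeterminingModule:
    """Stand-in for determining module 32: one thread per class."""

    def thread_count(self, classes):
        return len(classes)


class ReplayModule:
    """Stand-in for replay module 33: each thread replays exactly one class."""

    def replay(self, classes, thread_count, apply):
        if thread_count != len(classes):
            raise ValueError("expected one thread per class")
        threads = [
            threading.Thread(target=self._replay_class, args=(records, apply))
            for records in classes.values()
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()

    @staticmethod
    def _replay_class(records, apply):
        for record in records:          # log order is kept within a class
            apply(record)
```

Keeping all records that share a prefix on one thread is what preserves their relative order; records with different prefixes are assumed to be independent, which is the premise of the classification step.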
In this embodiment, the synchronization points in the edit log are determined according to the records in the edit log; the records between adjacent synchronization points are determined as records to be processed simultaneously; and the records determined to be processed simultaneously are divided into at least one class according to their file identifiers. The number of threads to be used between each pair of adjacent synchronization points is determined according to the classes of records between those synchronization points. The following process is repeated until all records in the edit log have been replayed: the classes of records between adjacent synchronization points are replayed using the number of threads determined for those synchronization points, each thread replaying the class that corresponds to it one to one; and the synchronization point is then replayed. Because the synchronization points are rename operations, the records between adjacent rename operations are treated as records to be processed simultaneously, and these records can be classified according to their file path prefixes, which ensures that the dependency relationships among the records within each class are maintained. Different classes of records can then be replayed by multiple threads, which speeds up log replay and shortens the time during which the distributed file system is unavailable. In addition, the chronological relationships among the records in the edit log are not broken.
Fig. 4 is a physical block diagram of a terminal according to an exemplary embodiment. Referring to Fig. 4, the terminal may be implemented as a processor 71 and a memory 72 configured to store processor-executable instructions;
wherein the processor 71 is configured to: divide the records in the edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another; determine the number of threads used for log replay according to the number of classes obtained; and replay the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one.
In the above embodiments, it should be understood that the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. The general-purpose processor may be a microprocessor or any conventional processor. The aforementioned memory may be a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk, or a solid state disk. A SIM card, also known as a subscriber identity card or smart card, must be installed in a digital mobile phone before the phone can be used; the subscriber's information, the encryption key and the contents of the user's phone book are stored on its computer chip. The steps of a method disclosed in connection with the embodiments of the present invention may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor.
With regard to the terminal in the above-described embodiment, the specific manner in which each module performs operations has been described in detail in the embodiments related to the method and the apparatus, and will not be elaborated here.
In this embodiment, the records in the edit log are divided into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another; the number of threads used for log replay is determined according to the number of classes obtained; and the log is replayed using the determined number of threads, each thread replaying the class of records that corresponds to it one to one. The records can thus be classified according to their file path prefixes, which ensures that the dependency relationships among the records within each class are maintained. Different classes of records can then be replayed by multiple threads, which speeds up log replay and shortens the time during which the distributed file system is unavailable. In addition, the chronological relationships among the records in the edit log are not broken.
Fig. 5 is a block diagram illustrating a terminal device 800 according to an example embodiment. For example, the terminal device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to Fig. 5, terminal device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the terminal device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the terminal device 800. Examples of such data include instructions for any application or method operating on terminal device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power component 806 provides power to the various components of terminal device 800. Power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for terminal device 800.
The multimedia component 808 comprises a screen providing an output interface between the terminal device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. When the terminal device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive an external audio signal when the terminal device 800 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor component 814 includes one or more sensors for providing status assessments of various aspects of terminal device 800. For example, sensor component 814 may detect an open/closed status of terminal device 800 and the relative positioning of components, such as the display and keypad of terminal device 800. Sensor component 814 may also detect a change in the position of terminal device 800 or of a component of terminal device 800, the presence or absence of user contact with terminal device 800, the orientation or acceleration/deceleration of terminal device 800, and a change in the temperature of terminal device 800. Sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. Sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
Communication component 816 is configured to facilitate communications between terminal device 800 and other devices in a wired or wireless manner. The terminal device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 804 comprising instructions, where the instructions are executable by the processor 820 of the terminal device 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium has instructions stored therein which, when executed by a processor of a mobile terminal, enable the mobile terminal to perform a log replay method applied to a distributed file system, the method comprising:
dividing the records in the edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another;
determining the number of threads used for log replay according to the number of classes obtained; and
replaying the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (11)

1. A log replay method applied to a distributed file system, characterized by comprising:
dividing the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another;
determining the number of threads used for log replay according to the number of classes obtained; and
replaying the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one;
wherein dividing the records in the edit log into at least one class according to the file identifier of each record comprises:
determining the synchronization points in the edit log according to the records in the edit log;
determining the records between adjacent synchronization points as records to be processed simultaneously; and
dividing the records determined to be processed simultaneously into at least one class according to their file identifiers.
2. The method of claim 1, wherein determining the number of threads used for log replay according to the number of classes obtained comprises:
determining, according to the classes of records between each pair of adjacent synchronization points, the number of threads to be used between those synchronization points.
3. The method of claim 2, wherein replaying the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one, comprises:
repeating the following process until all records in the edit log have been replayed:
replaying the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying the class that corresponds to it one to one; and
replaying the synchronization point.
4. The method of claim 1, wherein the synchronization point is a rename operation in the edit log.
5. The method of any of claims 1-4, wherein the file identifier is a file path prefix.
6. A log replay apparatus applied to a distributed file system, comprising:
a classification module configured to divide the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another;
a determining module configured to determine the number of threads used for log replay according to the number of classes obtained; and
a replay module configured to replay the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one;
wherein the classification module is specifically configured to:
determine the synchronization points in the edit log according to the records in the edit log;
determine the records between adjacent synchronization points as records to be processed simultaneously; and
divide the records determined to be processed simultaneously into at least one class according to their file identifiers.
7. The apparatus of claim 6, wherein the determining module is specifically configured to:
determine, according to the classes of records between each pair of adjacent synchronization points, the number of threads to be used between those synchronization points.
8. The apparatus of claim 7, wherein the replay module is specifically configured to:
repeat the following process until all records in the edit log have been replayed:
replay the classes of records between adjacent synchronization points using the number of threads determined for those synchronization points, each thread replaying the class that corresponds to it one to one; and
replay the synchronization point.
9. The apparatus of claim 6, wherein the synchronization point is a rename operation in the edit log.
10. The apparatus of any of claims 6-9, wherein the file identifier is a file path prefix.
11. A terminal, comprising:
a processor, and a memory for storing processor-executable instructions;
wherein the processor is configured to: divide the records in an edit log into at least one class according to the file identifier of each record, wherein the records within each class have dependency relationships with one another; determine the number of threads used for log replay according to the number of classes obtained; and replay the log using the determined number of threads, each thread replaying the class of records that corresponds to it one to one;
wherein dividing the records in the edit log into at least one class according to the file identifier of each record comprises:
determining the synchronization points in the edit log according to the records in the edit log;
determining the records between adjacent synchronization points as records to be processed simultaneously; and
dividing the records determined to be processed simultaneously into at least one class according to their file identifiers.
CN201710109052.6A 2017-02-27 2017-02-27 Log replay method, device and terminal applied to distributed file system Active CN106919679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710109052.6A CN106919679B (en) 2017-02-27 2017-02-27 Log replay method, device and terminal applied to distributed file system

Publications (2)

Publication Number Publication Date
CN106919679A CN106919679A (en) 2017-07-04
CN106919679B true CN106919679B (en) 2019-12-13

Family

ID=59454430

Country Status (1)

Country Link
CN (1) CN106919679B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant