CN115470040A - Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot - Google Patents


Info

Publication number
CN115470040A
Authority
CN
China
Prior art keywords
deduplication
data file
deleted
snapshot
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211033617.4A
Other languages
Chinese (zh)
Inventor
苏宁宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202211033617.4A priority Critical patent/CN115470040A/en
Publication of CN115470040A publication Critical patent/CN115470040A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore
    • G06F11/1453: Management of the data involved in backup or backup restore using de-duplication of the data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data deduplication and provides a method, device, equipment and medium for testing a deduplication fingerprint threshold based on a snapshot. The method comprises the following steps: presetting a deduplication data file to be written to a data volume; copying the deduplication data file and writing the copy to the data volume, taking one ROW snapshot of the data volume for each write, and recording the number of writes; when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, judging whether the recorded number of writes has reached the deduplication fingerprint threshold; if so, copying the deduplication data file and writing it to the data volume once more, taking a ROW snapshot of the data volume for that write, and, when the ROW snapshot shows that the written file is consistent with the preset deduplication data file, analyzing the metadata in the ROW snapshot; and outputting that the test has passed when the analysis result shows that two physical addresses exist in the metadata.

Description

Method, device, equipment and medium for testing deduplication fingerprint threshold based on snapshot
Technical Field
The invention relates to the technical field of data deduplication, and in particular to a method, device, equipment and medium for testing a deduplication fingerprint threshold based on a snapshot.
Background
Snapshot technology is widely used in backup and disaster recovery. A snapshot is a complete, usable copy of a specified data set that contains a static image of the source data at the point in time of the copy; the snapshot may be a duplicate or a replica of the data. According to the SNIA definition, snapshots are of two types, full snapshots and incremental snapshots, each of which uses a different snapshot technique. COW stands for copy-on-write, also called copy-before-write. After a snapshot is created, if data on the source volume changes, the snapshot system first copies the original data to the corresponding data block on the snapshot volume and then overwrites the source volume. ROW stands for redirect-on-write, and its approach is the opposite of COW. After a snapshot is created, the snapshot system redirects write requests for the data volume to the storage space reserved for the snapshot, so new data is written directly to the snapshot volume. When an upper-layer service reads the source volume, data that existed before the snapshot was created is read from the source volume, and data generated after the snapshot was created is read from the snapshot volume. ROW thereby avoids the performance loss that COW incurs from performing two write operations.
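The difference between the two write paths can be sketched with a toy model. The classes `CowVolume` and `RowVolume` and the `device_writes` counter below are hypothetical names introduced only for illustration; they model a single snapshot over an in-memory block map and are not taken from any real storage system.

```python
class CowVolume:
    """Copy-on-write: save the old block to the snapshot area, then overwrite in place."""

    def __init__(self, blocks):
        self.source = dict(blocks)   # block index -> data
        self.snapshot = None         # preserved pre-snapshot blocks
        self.device_writes = 0       # number of block writes issued

    def take_snapshot(self):
        self.snapshot = {}

    def write(self, idx, data):
        if self.snapshot is not None and idx not in self.snapshot:
            self.snapshot[idx] = self.source.get(idx)  # first write: copy old data
            self.device_writes += 1
        self.source[idx] = data                        # second write: update source
        self.device_writes += 1

    def read_snapshot(self, idx):
        if self.snapshot is not None and idx in self.snapshot:
            return self.snapshot[idx]
        return self.source.get(idx)


class RowVolume:
    """Redirect-on-write: leave the source untouched, send new writes elsewhere."""

    def __init__(self, blocks):
        self.source = dict(blocks)
        self.redirect = None         # space reserved for post-snapshot writes
        self.device_writes = 0

    def take_snapshot(self):
        self.redirect = {}

    def write(self, idx, data):
        target = self.source if self.redirect is None else self.redirect
        target[idx] = data           # a single write in either case
        self.device_writes += 1

    def read_current(self, idx):
        if self.redirect is not None and idx in self.redirect:
            return self.redirect[idx]
        return self.source.get(idx)

    def read_snapshot(self, idx):
        return self.source.get(idx)  # pre-snapshot data stays on the source
```

In this sketch, overwriting one block after a snapshot costs two block writes under COW and one under ROW, while both variants still return the pre-snapshot data when the snapshot is read.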
When data is written to a deduplication volume, each write records a logical address (LBA), a physical address (PBA), and a fingerprint lock (HBA). Deduplication computes a fingerprint for the written data and keeps only one copy of the data (PBA), together with the corresponding metadata (that is, the L→P mapping from the logical address LBA to the physical address PBA at the time of the write) and the fingerprint lock (HBA) data. Duplicate data is deleted and only its metadata (L→P) is recorded. The number of duplicates that can be removed generally has an upper limit; that is, the deduplication ratio has a maximum specification. For example, the Inspur (Langchao) MCS system sets the deduplication fingerprint lock threshold to 32; beyond 32 duplicates, new fingerprint data is recorded. When deduplication is enabled on the source volume and a ROW snapshot is then taken, the processing involves deduplication of the data, the metadata and fingerprint lock data, and the reading and writing of metadata in the ROW snapshot.
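A minimal sketch of this bookkeeping is given below. `ToyDedupVolume` is a hypothetical in-memory model, not the MCS implementation: SHA-256 stands in for the fingerprint function, and the threshold is interpreted as the maximum number of logical references to one physical copy, with the first write counted as a reference. Both choices are assumptions, since the text does not specify them.

```python
import hashlib

FINGERPRINT_THRESHOLD = 32  # value cited in the text; its exact semantics are assumed


class ToyDedupVolume:
    """Hypothetical in-memory model of a deduplicated volume."""

    def __init__(self, threshold=FINGERPRINT_THRESHOLD):
        self.threshold = threshold
        self.physical = []       # PBA -> stored data
        self.l2p = {}            # metadata: LBA -> PBA
        self.fingerprints = {}   # fingerprint -> [PBA of current copy, reference count]

    def write(self, lba, data):
        fp = hashlib.sha256(data).hexdigest()
        entry = self.fingerprints.get(fp)
        if entry is not None and entry[1] < self.threshold:
            entry[1] += 1                   # duplicate: add a reference, store nothing
        else:
            self.physical.append(data)      # first copy, or threshold reached: new copy
            entry = [len(self.physical) - 1, 1]
            self.fingerprints[fp] = entry   # record the new fingerprint entry
        self.l2p[lba] = entry[0]
        return entry[0]

    def dedup_ratio(self):
        """Return (logical blocks, physical blocks), e.g. (2, 1) for a 2:1 ratio."""
        return len(self.l2p), len(self.physical)
```

Under these assumptions, writing the same block to 33 distinct logical addresses maps the first 32 to one physical address and the 33rd to a second one, which is the switch that the test described in this document is designed to hit deterministically.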
At present, ROW snapshots and deduplication volumes are generally tested with an I/O read-write tool such as vdbench, with the deduplication ratio parameter adjusted for the reads and writes. To improve test accuracy, a large amount of data is usually written, and the write duration and write volume are increased to raise the probability of triggering the threshold during writing. In the ROW snapshot scenario, therefore, the handling that occurs when deduplicated data reaches the deduplication threshold cannot be tested precisely: the only option is to increase the test intensity and repeat the test many times to raise the probability, and there is still no guarantee that this handling is actually exercised.
Disclosure of Invention
As described in the Background, existing tests of ROW snapshots and deduplication volumes rely on I/O read-write tools such as vdbench with an adjusted deduplication ratio parameter, and on writing large amounts of data over a long time to raise the triggering probability. The invention addresses the problem that, in the ROW snapshot scenario, the handling performed when deduplicated data reaches the deduplication threshold cannot be tested precisely, so that test intensity must be increased and tests repeated many times without any guarantee that this handling is exercised.
In a first aspect, the technical solution of the present invention provides a method for testing a deduplication fingerprint threshold based on a snapshot, comprising the following steps:
presetting a deduplication data file to be written to a data volume;
copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, judging whether the recorded number of writes has reached the deduplication fingerprint threshold;
if not, returning to the step of copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
if so, copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and, when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, analyzing the metadata in the ROW snapshot;
and, when the analysis result shows that two physical addresses exist in the metadata, outputting that the test has passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
In this method, each time data (the deduplication data file) is written to the data volume, one ROW snapshot is taken and one copy of the metadata mapping is made. Writing duplicate data triggers deduplication, and once the number of duplicate writes reaches the deduplication fingerprint threshold, writing the duplicate data again triggers the switching process of the deduplication fingerprint threshold.
Further, the step of judging whether the recorded number of writes has reached the deduplication fingerprint threshold when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file comprises:
reading the deduplication data file written to the data volume according to the metadata address mapping in the ROW snapshot;
performing a consistency check between the deduplication data file thus read and the preset deduplication data file;
and, when the check is consistent, judging whether the recorded number of writes has reached the deduplication fingerprint threshold.
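The read-back and consistency check above might look like the following sketch. The function names, the representation of the snapshot metadata as a plain LBA→PBA dictionary, and the use of SHA-256 digests for the comparison are all assumptions made for illustration; the text does not state how the check is implemented.

```python
import hashlib


def read_via_snapshot(snapshot_l2p, physical_blocks, lbas):
    """Reassemble a file by following the snapshot's LBA -> PBA mapping."""
    return b"".join(physical_blocks[snapshot_l2p[lba]] for lba in lbas)


def is_consistent(read_back, preset):
    """Compare the data read through the snapshot with the preset file by digest."""
    return hashlib.sha256(read_back).digest() == hashlib.sha256(preset).digest()
```

Only when `is_consistent` returns true would the test go on to compare the recorded number of writes with the threshold; otherwise it reports a write exception.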
Further, in the step of copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and analyzing the metadata in the ROW snapshot when the ROW snapshot shows that the written deduplication data file is consistent with the preset deduplication data file, analyzing the metadata in the ROW snapshot comprises:
acquiring the mapping between logical addresses and physical addresses in the metadata;
judging whether the logical addresses and the physical address are in a many-to-one relationship;
if so, one physical address exists in the metadata;
if not, two physical addresses exist in the metadata.
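The metadata analysis can be sketched as a check on the set of physical addresses referenced by the mapping. The function `analyse_mapping` and its result keys are hypothetical, and the metadata is assumed to be available as an LBA→PBA dictionary covering only the copies of the test file.

```python
def analyse_mapping(l2p):
    """Classify an LBA -> PBA mapping taken from ROW snapshot metadata."""
    pbas = sorted(set(l2p.values()))
    return {
        # Several logical addresses sharing one physical address: still deduplicated.
        "many_to_one": len(l2p) > 1 and len(pbas) == 1,
        "physical_addresses": pbas,
        # Exactly two physical addresses: the threshold switch produced a new copy.
        "test_passed": len(pbas) == 2,
    }
```

For example, a mapping `{0: 7, 1: 7, 2: 7}` is many-to-one with a single physical address, whereas `{0: 7, 1: 7, 2: 9}` contains two physical addresses and would count as a pass.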
Further, the step of performing a consistency check between the deduplication data file thus read and the preset deduplication data file further comprises:
outputting a write exception prompt when the check is inconsistent.
When ROW snapshots are taken of the data volume, the deduplication fingerprint threshold of the data volume is tested by coordinating data writes with snapshots, so there is no need to raise the triggering probability by continuously writing large amounts of data. The method precisely tests the switching and handling mechanism of the deduplication fingerprint threshold, so as to judge whether deduplication fingerprint handling works normally under a ROW snapshot.
In a second aspect, the technical solution of the present invention further provides a device for testing a deduplication fingerprint threshold based on a snapshot, comprising a preset module, a write processing module, a first judgment module, an analysis module, and a test result output module;
the preset module is used for presetting a deduplication data file to be written to a data volume;
the write processing module is used for copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
the first judgment module is used for judging, when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, whether the recorded number of writes has reached the deduplication fingerprint threshold, and for triggering the write processing module;
the analysis module is used for analyzing the metadata in the ROW snapshot when, after the first judgment module has judged that the number of writes has reached the deduplication fingerprint threshold and has triggered the write processing module, and after the write processing module has finished executing, the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file;
and the test result output module is used for outputting that the test has passed when the analysis result shows that two physical addresses exist in the metadata, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
As in the method, each time data (the deduplication data file) is written to the data volume, one ROW snapshot is taken and one copy of the metadata mapping is made. Writing duplicate data triggers deduplication, and once the number of duplicate writes reaches the deduplication fingerprint threshold, writing the duplicate data again triggers the switching process of the deduplication fingerprint threshold.
Further, the device also comprises a consistency check module, which comprises a reading unit and a check unit;
the reading unit is used for reading the deduplication data file written to the data volume according to the metadata address mapping in the ROW snapshot;
the check unit is used for performing a consistency check between the deduplication data file thus read and the preset deduplication data file;
and the first judgment module is specifically used for judging whether the recorded number of writes has reached the deduplication fingerprint threshold when the check unit outputs that the check is consistent.
Furthermore, the analysis module comprises an acquisition unit, an address judgment unit and an analysis result output unit;
the acquisition unit is used for acquiring the mapping between logical addresses and physical addresses in the metadata;
the address judgment unit is used for judging whether the logical addresses and the physical address are in a many-to-one relationship, and for triggering the analysis result output unit;
the analysis result output unit is used for outputting, according to the judgment result of the address judgment unit, either that one physical address exists in the metadata or that two physical addresses exist in the metadata.
Furthermore, the test result output module is also used for outputting a write exception prompt when the consistency check module outputs that the check is inconsistent.
When ROW snapshots are taken of the data volume, the deduplication fingerprint threshold of the data volume is tested by coordinating data writes with snapshots, so there is no need to raise the triggering probability by continuously writing large amounts of data. The device precisely tests the switching and handling mechanism of the deduplication fingerprint threshold, so as to judge whether deduplication fingerprint handling works normally under a ROW snapshot.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor, to enable the at least one processor to perform the method for testing a deduplication fingerprint threshold based on a snapshot as described in the first aspect.
In a fourth aspect, the present invention provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method for testing a deduplication fingerprint threshold based on a snapshot as described in the first aspect.
According to the above technical solution, the invention has the following advantages: the test method tests the deduplication fingerprint, verifying both that data consistency is preserved after deduplication and that the processing flow is accurate and normal after the deduplication fingerprint reaches the threshold. Unlike conventional test methods, which continuously write large amounts of data and test probabilistically, the method precisely exercises the processing flow from the moment deduplication reaches the threshold and verifies the handling of ROW snapshots together with deduplicated data, thereby improving product quality and stability.
In addition, the invention is reliable in design principle and simple in structure, and has very broad application prospects.
Therefore, compared with the prior art, the invention has prominent substantive features and represents notable progress, and the beneficial effects of its implementation are also obvious.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. It will be apparent to those skilled in the art that other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the present invention.
FIG. 2 is a schematic flow diagram of a method of another embodiment of the invention.
FIG. 3 is a schematic block diagram of a device of one embodiment of the present invention.
Detailed Description
When data is written to a deduplication volume, each write records a logical address (LBA), a physical address (PBA), and a fingerprint lock (HBA). Deduplication computes a fingerprint for the written data and keeps only one copy of the data (PBA), together with the corresponding metadata (that is, the L→P mapping from the logical address LBA to the physical address PBA at the time of the write) and the fingerprint lock (HBA) data. Duplicate data is deleted and only its metadata (L→P) is recorded. The number of duplicates that can be removed generally has an upper limit; that is, the deduplication ratio has a maximum specification. For example, the Inspur (Langchao) MCS system sets the deduplication fingerprint lock threshold to 32; beyond 32 duplicates, new fingerprint data is recorded. When deduplication is enabled on the source volume and a ROW snapshot is then taken, the processing involves deduplication of the data, the metadata and fingerprint lock data, and the reading and writing of metadata in the ROW snapshot.
At present, ROW snapshots and deduplication volumes are generally tested with an I/O read-write tool such as vdbench, with the deduplication ratio parameter adjusted for the reads and writes. To improve test accuracy, a large amount of data is usually written, and the write duration and write volume are increased to raise the probability of triggering the threshold during writing. In the ROW snapshot scenario, therefore, the handling that occurs when deduplicated data reaches the deduplication threshold cannot be tested precisely: the only option is to increase the test intensity and repeat the test many times to raise the probability, and there is still no guarantee that this handling is actually exercised. To enable those skilled in the art to better understand the technical solution of the present invention, the technical solution in the embodiments is described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
As shown in FIG. 1, an embodiment of the present invention provides a method for testing a deduplication fingerprint threshold based on a snapshot, comprising the following steps:
Step 1: presetting a deduplication data file to be written to a data volume;
Step 2: copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
Step 3: when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, judging whether the recorded number of writes has reached the deduplication fingerprint threshold;
if not, executing step 2;
if so, executing step 4;
Step 4: copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and, when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, analyzing the metadata in the ROW snapshot;
Step 5: when the analysis result shows that two physical addresses exist in the metadata, outputting that the test has passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
In this method, each time data (the deduplication data file) is written to the data volume, one ROW snapshot is taken and one copy of the metadata mapping is made. Writing duplicate data triggers deduplication, and once the number of duplicate writes reaches the deduplication fingerprint threshold, writing the duplicate data again triggers the switching process of the deduplication fingerprint threshold.
As shown in FIG. 2, an embodiment of the present invention provides a method for testing a deduplication fingerprint threshold based on a snapshot, comprising the following steps:
S1: presetting a deduplication data file to be written to a data volume;
S2: copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
S3: reading the deduplication data file written to the data volume according to the metadata address mapping in the ROW snapshot;
S4: performing a consistency check between the deduplication data file thus read and the preset deduplication data file;
when the check is consistent, executing step S5; when the check is inconsistent, executing step S8;
S5: judging whether the recorded number of writes has reached the deduplication fingerprint threshold;
if not, executing step S2;
if so, executing step S6;
S6: copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and, when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, analyzing the metadata in the ROW snapshot; in this step, analyzing the metadata in the ROW snapshot comprises: acquiring the mapping between logical addresses and physical addresses in the metadata; judging whether the logical addresses and the physical address are in a many-to-one relationship; if so, one physical address exists in the metadata; if not, two physical addresses exist in the metadata;
S7: when the analysis result shows that two physical addresses exist in the metadata, outputting that the test has passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
S8: outputting a write exception prompt.
When ROW snapshots are taken of the data volume, the deduplication fingerprint threshold of the data volume is tested by coordinating data writes with snapshots, so there is no need to raise the triggering probability by continuously writing large amounts of data. The method precisely tests the switching and handling mechanism of the deduplication fingerprint threshold, so as to judge whether deduplication fingerprint handling works normally under a ROW snapshot.
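Steps S1 to S8 can be sketched as a small driver loop. Everything named below is hypothetical: `FakeDedupVolume` is an in-memory stand-in so that the sketch runs without a storage array, its `write`, `row_snapshot` and `read` methods are invented for illustration, and the threshold is assumed to count the first write as one reference. A real test would replace the fake with calls to the storage system under test.

```python
import hashlib


class FakeDedupVolume:
    """In-memory stand-in for a deduplicated volume with ROW snapshots (hypothetical)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.blocks = []   # PBA -> data
        self.l2p = {}      # LBA -> PBA
        self.refs = {}     # fingerprint -> [PBA, reference count]

    def write(self, lba, data):
        fp = hashlib.sha256(data).digest()
        entry = self.refs.get(fp)
        if entry is not None and entry[1] < self.threshold:
            entry[1] += 1                        # deduplicated against the existing copy
        else:
            self.blocks.append(data)             # new physical copy, new fingerprint entry
            entry = self.refs[fp] = [len(self.blocks) - 1, 1]
        self.l2p[lba] = entry[0]

    def row_snapshot(self):
        return dict(self.l2p)                    # a ROW snapshot copies the metadata mapping

    def read(self, snapshot, lba):
        return self.blocks[snapshot[lba]]


def run_threshold_test(volume, preset, threshold):
    """Drive S2-S8 for a preset file (S1) and return 'passed', 'failed' or 'write exception'."""
    writes = 0
    while writes < threshold:                    # S5 decides whether to loop back to S2
        volume.write(writes, preset)             # S2: write one copy to a fresh LBA
        snap = volume.row_snapshot()             # S2: one ROW snapshot per write
        writes += 1                              # S2: record the number of writes
        if volume.read(snap, writes - 1) != preset:   # S3/S4: read back and check
            return "write exception"             # S8
    volume.write(writes, preset)                 # S6: one more write past the threshold
    snap = volume.row_snapshot()
    if volume.read(snap, writes) != preset:
        return "write exception"                 # S8
    physical = set(snap.values())                # S6: analyse the LBA -> PBA metadata
    return "passed" if len(physical) == 2 else "failed"   # S7
```

With `FakeDedupVolume(32)` and a threshold of 32 the driver performs exactly 33 writes and finds two physical addresses, so the switch is exercised deterministically rather than by chance.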
Deduplication technology is generally divided into source-side deduplication and sink-side deduplication. Source-side deduplication first computes fingerprints of the data to be transmitted at the client, finds and eliminates duplicate content by comparing fingerprints with the server, and sends only non-duplicate data to the server, thereby saving both network bandwidth and storage resources. Sink-side deduplication transfers the client's data directly to the server, and duplicate content is detected and eliminated on the server. Both deployment modes improve storage space efficiency; the main difference is that source-side deduplication spends client computing resources in exchange for better network transmission efficiency.
The deduplication technology discussed here is sink-side deduplication. It is based on the Inspur (Langchao) MCS storage software system. Online deduplication is a pool-based duplicate-data detection and reduction technique implemented in software, with no hardware accelerator card required. Before data is written to the hard disk, duplicate data is detected and deleted in real time, which reduces the number of writes and the amount of data written to the SSDs, thereby saving storage space, reducing SSD wear, and prolonging SSD service life. The embodiment of the invention also provides a method for testing the deduplication fingerprint threshold under a ROW snapshot: when ROW snapshots are taken of a deduplication volume, the fingerprint threshold of the deduplication volume is tested by coordinating data writes with snapshots, without relying on the continuous writing of large amounts of data to raise the triggering probability. The method precisely tests the switching and handling mechanism of the deduplication fingerprint threshold, so as to judge whether deduplication fingerprint handling works normally under a ROW snapshot. In the Inspur MCS system, ROW snapshots and deduplication are supported only on the all-flash stack. The all-flash stack writes data in a log-structured manner, so random incoming data is converted into a sequential layout when written to disk, and is flushed to disk once a full RAID stripe has been filled. After a storage pool is created, its storage space is managed by an LSA (log-structured) volume; data is issued to and stored in the LSA volume, and the mapping from the reduced-volume address (LBA) of the data to the LSA volume address (PBA) is stored in the metadata.
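The log-structured write path described above can be illustrated with a toy model. The class `LogStructuredArray`, the stripe size of four blocks, and the list-based "disk" are assumptions chosen for brevity; the sketch shows only how randomly addressed writes become sequential log positions with an LBA→PBA mapping, and how data is flushed one full stripe at a time.

```python
class LogStructuredArray:
    """Toy log-structured volume: random LBAs are appended in arrival order."""

    def __init__(self, stripe_blocks=4):
        self.stripe_blocks = stripe_blocks
        self.buffer = []   # blocks waiting for a full stripe
        self.disk = []     # blocks already flushed, in sequential order
        self.l2p = {}      # metadata: LBA -> PBA (position in the log)

    def write(self, lba, data):
        pba = len(self.disk) + len(self.buffer)   # next sequential log position
        self.buffer.append(data)
        self.l2p[lba] = pba                       # overwrites simply remap the LBA
        if len(self.buffer) == self.stripe_blocks:
            self.disk.extend(self.buffer)         # flush one full stripe
            self.buffer.clear()

    def read(self, lba):
        pba = self.l2p[lba]
        if pba < len(self.disk):
            return self.disk[pba]
        return self.buffer[pba - len(self.disk)]
```

Writing to logical addresses 90, 7, 42 and 13 in that order places the blocks at log positions 0 to 3 and flushes them together, regardless of how scattered the logical addresses are.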
The specific implementation process is as follows:
1. Prepare the deduplication data to be written to the data volume, for example file1, and make multiple copies of file1;
2. After file1 is written to the data volume for the first time, take a ROW snapshot; at this point the metadata is copied once;
3. Write a copy of file1 to the data volume and then take a ROW snapshot; the metadata copied is the metadata after the first deduplication, and the deduplication ratio is 2:1;
4. Write a copy of file1 to the data volume a second time and take a ROW snapshot again; the metadata copied is the metadata after the second deduplication, and the deduplication ratio is now 3:1;
5. And so on: after a copy of file1 has been written for the 32nd time, the deduplication fingerprint reaches the threshold;
6. When the 33rd copy of file1 is written, the threshold has been reached, and the switch of the deduplication fingerprint threshold is triggered;
7. When duplicate data is written after the deduplication fingerprint threshold has been reached, a new fingerprint is recorded;
8. After each data write, take a ROW snapshot of the source volume, which copies the metadata content once. Check data consistency by rolling back the ROW snapshot, to determine whether the deduplicated data is correct. After the duplicate data is written for the 33rd time, take a ROW snapshot and check the consistency of the snapshot data; this verifies whether the switching of the fingerprint threshold is handled correctly.
As shown in FIG. 3, an embodiment of the present invention further provides a device for testing a deduplication fingerprint threshold based on a snapshot, comprising a preset module, a write processing module, a first judgment module, an analysis module, and a test result output module;
the preset module is used for presetting a deduplication data file to be written to a data volume;
the write processing module is used for copying the deduplication data file and writing the copy to the data volume, taking a ROW snapshot of the data volume for the write, and recording the number of writes;
the first judgment module is used for judging, when the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file, whether the recorded number of writes has reached the deduplication fingerprint threshold, and for triggering the write processing module;
the analysis module is used for analyzing the metadata in the ROW snapshot when, after the first judgment module has judged that the number of writes has reached the deduplication fingerprint threshold and has triggered the write processing module, and after the write processing module has finished executing, the ROW snapshot shows that the deduplication data file written to the data volume is consistent with the preset deduplication data file;
and the test result output module is used for outputting that the test has passed when the analysis result shows that two physical addresses exist in the metadata, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
As in the method, each time data (the deduplication data file) is written to the data volume, one ROW snapshot is taken and one copy of the metadata mapping is made. Writing duplicate data triggers deduplication, and once the number of duplicate writes reaches the deduplication fingerprint threshold, writing the duplicate data again triggers the switching process of the deduplication fingerprint threshold.
The device also comprises a consistency check module, which specifically comprises a reading unit and a check unit;
the reading unit is used for reading the deduplication data file written into the data volume according to the metadata address mapping relation in the ROW snapshot;
the check unit is used for performing a consistency check between the deduplication data file that was read and the preset deduplication data file;
and the first judgment module is specifically used for judging whether the recorded number of writes has reached the deduplication fingerprint threshold when the check unit outputs that the check is consistent.
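The reading unit and check unit can be illustrated with a short sketch. It assumes, purely for illustration, that a snapshot exposes its metadata as a dictionary from logical block index to physical address and that consistency is decided by comparing SHA-256 digests; the function names `read_through_snapshot` and `is_consistent` are hypothetical, and the source does not say which comparison the check unit uses.

```python
import hashlib


def read_through_snapshot(snapshot_mapping, physical_blocks, logical_addresses):
    """Reassemble a file by following the logical -> physical mapping frozen in a snapshot."""
    return b"".join(physical_blocks[snapshot_mapping[addr]] for addr in logical_addresses)


def is_consistent(read_back, preset):
    """Compare the data read back with the preset file by digest (an assumed choice)."""
    return hashlib.sha256(read_back).digest() == hashlib.sha256(preset).digest()


preset_file = b"ABCDEFGH"
mapping = {0: 7, 1: 9}                    # logical block -> physical address (made-up values)
good_blocks = {7: b"ABCD", 9: b"EFGH"}
bad_blocks = {7: b"ABCD", 9: b"EFGX"}     # simulates a corrupted write

ok = is_consistent(read_through_snapshot(mapping, good_blocks, [0, 1]), preset_file)
corrupted = is_consistent(read_through_snapshot(mapping, bad_blocks, [0, 1]), preset_file)
print(ok, corrupted)
```

Only when the check reports consistency would the first judgment module go on to compare the write count with the threshold.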
The parsing module comprises an acquiring unit, an address judgment unit, and a parsing result output unit;
the acquiring unit is used for acquiring the mapping relation between logical addresses and physical addresses in the metadata;
the address judgment unit is used for judging whether the logical addresses and the physical address are in a many-to-one relationship, and for triggering the parsing result output unit;
the parsing result output unit is used for outputting, according to the judgment result of the address judgment unit, either that one physical address exists in the metadata or that two physical addresses exist in the metadata.
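A minimal sketch of the parsing logic follows, assuming the metadata has already been extracted into a dictionary from logical address to physical address. The function `parse_metadata`, the result keys, and the address values are hypothetical; "many-to-one" is interpreted here as all logical addresses sharing a single physical address.

```python
from collections import defaultdict


def parse_metadata(mapping):
    """Group logical addresses by physical address and report whether they are many-to-one."""
    groups = defaultdict(list)
    for logical, physical in mapping.items():
        groups[physical].append(logical)
    # Many-to-one: more than one logical address, all pointing at the same physical address.
    many_to_one = len(groups) == 1 and len(mapping) > 1
    return {"many_to_one": many_to_one, "physical_addresses": sorted(groups)}


# 32 logical addresses deduplicated onto one physical address (made-up values).
before_switch = {logical: 0x100 for logical in range(32)}
# The 33rd write lands on a second physical address after the threshold switch.
after_switch = dict(before_switch)
after_switch[32] = 0x200

result_before = parse_metadata(before_switch)
result_after = parse_metadata(after_switch)
print(result_before["many_to_one"], len(result_after["physical_addresses"]))
```

Under these assumptions the mapping before the switch is many-to-one with one physical address, and the mapping after the switch reports two physical addresses, which is the condition the test result output module treats as a pass.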
The test result output module is also used for outputting a write exception prompt when the consistency check module outputs that the check is inconsistent.
Because each data write is paired with a ROW snapshot of the data volume, the deduplication fingerprint threshold of the data volume can be tested without continuously writing large amounts of data to raise the probability of triggering it. The method can accurately test the deduplication fingerprint threshold switching and processing mechanism, and so determine whether the deduplication fingerprint mechanism works normally under ROW snapshots.
An embodiment of the present invention further provides an electronic device, which includes a processor, a communication interface, a memory, and a bus, where the processor, the communication interface, and the memory communicate with one another through the bus. The bus may be used for information transfer between the electronic device and the sensor. The processor may call logic instructions in the memory to perform the following method:
Step 1: presetting a deduplication data file to be written into a data volume;
Step 2: copying the deduplication data file and writing the copy into the data volume, and, at each write, taking a ROW snapshot of the data volume and recording the number of writes;
Step 3: when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, judging whether the recorded number of writes has reached the deduplication fingerprint threshold; if not, executing Step 2; if yes, executing Step 4;
Step 4: copying the deduplication data file and writing the copy into the data volume, taking a ROW snapshot of the data volume at the write, and, when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, parsing the metadata in the ROW snapshot;
Step 5: when it is determined from the parsing result that two physical addresses exist in the metadata, outputting that the test is passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
In addition, when sold or used as an independent product, the logic instructions in the memory may be implemented as software functional units and stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied as a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
Embodiments of the present invention provide a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method provided by the above method embodiments, for example including:
S1: presetting a deduplication data file to be written into a data volume;
S2: copying the deduplication data file and writing the copy into the data volume, and, at each write, taking a ROW snapshot of the data volume and recording the number of writes;
S3: reading the deduplication data file written into the data volume according to the metadata address mapping relation in the ROW snapshot;
S4: performing a consistency check between the deduplication data file that was read and the preset deduplication data file; when the check is consistent, executing step S5; when the check is inconsistent, executing step S8;
S5: judging whether the recorded number of writes has reached the deduplication fingerprint threshold; if not, executing step S2; if yes, executing step S6;
S6: copying the deduplication data file and writing the copy into the data volume, taking a ROW snapshot of the data volume at the write, and, when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, parsing the metadata in the ROW snapshot; in this step, parsing the metadata in the ROW snapshot includes: acquiring the mapping relation between logical addresses and physical addresses in the metadata; judging whether the logical addresses and the physical address are in a many-to-one relationship; if yes, one physical address exists in the metadata; if not, two physical addresses exist in the metadata;
S7: when it is determined from the parsing result that two physical addresses exist in the metadata, outputting that the test is passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal;
S8: outputting a write exception prompt.
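The control flow of steps S1 to S8 can be sketched as a small test driver. Everything here is an assumption made for illustration: the driver `run_threshold_test`, the `StubVolume` stand-in with its `write`, `snapshot`, and `read` methods, and the result strings are hypothetical. Two details go beyond the steps as written and are labelled in the comments: an inconsistent check after the extra write in S6 is also routed to S8, and a "fail" result is returned when two physical addresses are not found.

```python
def run_threshold_test(volume, preset, threshold):
    """Drive S2-S8 against any object offering write(), snapshot() and read(snapshot)."""
    writes = 0
    while writes < threshold:              # S5: loop back to S2 until the threshold is reached
        volume.write(preset)               # S2: write a copy of the preset file
        snap = volume.snapshot()           # S2: take a ROW snapshot at the write
        writes += 1                        # S2: record the number of writes
        if volume.read(snap) != preset:    # S3-S4: read through the snapshot and compare
            return "write exception"       # S8
    volume.write(preset)                   # S6: one more write after the threshold
    snap = volume.snapshot()
    if volume.read(snap) != preset:
        return "write exception"           # assumption: inconsistency in S6 also goes to S8
    physical = set(snap.values())          # S6: parse logical -> physical mapping
    return "pass" if len(physical) == 2 else "fail"  # S7 ("fail" is an assumed outcome)


class StubVolume:
    """In-memory stand-in: starts a new physical copy after every `switch_after` writes."""

    def __init__(self, switch_after, corrupt_on=None):
        self.switch_after, self.corrupt_on = switch_after, corrupt_on
        self.mapping, self.blocks = {}, {}

    def write(self, data):
        n = len(self.mapping)                  # index of this write, starting at 0
        physical = n // self.switch_after      # which physical copy this write maps to
        self.blocks[physical] = b"garbage" if n + 1 == self.corrupt_on else data
        self.mapping[n] = physical

    def snapshot(self):
        return dict(self.mapping)              # snapshot modelled as a metadata copy

    def read(self, snap):
        return self.blocks[snap[max(snap)]]    # read the most recently written copy


print(run_threshold_test(StubVolume(32), b"file", 32))
print(run_threshold_test(StubVolume(32, corrupt_on=5), b"file", 32))
print(run_threshold_test(StubVolume(1000), b"file", 32))
```

The three calls exercise a volume that switches correctly at the threshold, a volume whose fifth write is corrupted, and a volume that never switches within the test.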
Although the present invention has been described in detail with reference to the drawings and the preferred embodiments, it is not limited to them. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments without departing from the spirit and scope of the present invention, and any such modification or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein falls within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A method for testing a deduplication fingerprint threshold based on a snapshot, characterized by comprising the following steps:
presetting a deduplication data file to be written into a data volume;
copying the deduplication data file and writing the copy into the data volume, and, at each write, taking a ROW snapshot of the data volume and recording the number of writes;
when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, judging whether the recorded number of writes has reached a deduplication fingerprint threshold;
if not, executing again the step of copying the deduplication data file and writing the copy into the data volume, and, at each write, taking a ROW snapshot of the data volume and recording the number of writes;
if so, copying the deduplication data file and writing the copy into the data volume, taking a ROW snapshot of the data volume at the write, and, when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, parsing the metadata in the ROW snapshot;
and when it is determined from the parsing result that two physical addresses exist in the metadata, outputting that the test is passed, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
2. The method for testing a deduplication fingerprint threshold based on a snapshot according to claim 1, wherein the step of judging whether the recorded number of writes has reached the deduplication fingerprint threshold when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file comprises:
reading the deduplication data file written into the data volume according to the metadata address mapping relation in the ROW snapshot;
performing a consistency check between the deduplication data file that was read and the preset deduplication data file;
and when the check is consistent, judging whether the recorded number of writes has reached the deduplication fingerprint threshold.
3. The method for testing a deduplication fingerprint threshold based on a snapshot according to claim 2, wherein the step of copying the deduplication data file and writing the copy into the data volume, taking a ROW snapshot of the data volume at the write, and parsing the metadata in the ROW snapshot when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file comprises:
acquiring the mapping relation between logical addresses and physical addresses in the metadata;
judging whether the logical addresses and the physical address are in a many-to-one relationship;
if yes, determining that one physical address exists in the metadata;
if not, determining that two physical addresses exist in the metadata.
4. The method for testing a deduplication fingerprint threshold based on a snapshot according to claim 3, wherein the step of performing a consistency check between the deduplication data file that was read and the preset deduplication data file further comprises:
outputting a write exception prompt when the check is inconsistent.
5. A device for testing a deduplication fingerprint threshold based on a snapshot, characterized by comprising a presetting module, a write processing module, a first judgment module, a parsing module, and a test result output module;
the presetting module is used for presetting a deduplication data file to be written into a data volume;
the write processing module is used for copying the deduplication data file and writing the copy into the data volume, and, at each write, taking a ROW snapshot of the data volume and recording the number of writes;
the first judgment module is used for judging, when it is determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file, whether the recorded number of writes has reached a deduplication fingerprint threshold, and for triggering the write processing module;
the parsing module is used for parsing the metadata in the ROW snapshot once the first judgment module has determined that the number of writes has reached the deduplication fingerprint threshold and has triggered the write processing module, the write processing module has finished executing, and it has been determined from the ROW snapshot that the deduplication data file written into the data volume is consistent with the preset deduplication data file;
and the test result output module is used for outputting that the test is passed when it is determined from the parsing result that two physical addresses exist in the metadata, that is, the processing flow after the deduplication fingerprint reaches the deduplication fingerprint threshold is accurate and normal.
6. The device for testing a deduplication fingerprint threshold based on a snapshot according to claim 5, further comprising a consistency check module, which specifically comprises a reading unit and a check unit;
the reading unit is used for reading the deduplication data file written into the data volume according to the metadata address mapping relation in the ROW snapshot;
the check unit is used for performing a consistency check between the deduplication data file that was read and the preset deduplication data file;
and the first judgment module is specifically used for judging whether the recorded number of writes has reached the deduplication fingerprint threshold when the check unit outputs that the check is consistent.
7. The device for testing a deduplication fingerprint threshold based on a snapshot according to claim 6, wherein the parsing module comprises an acquiring unit, an address judgment unit, and a parsing result output unit;
the acquiring unit is used for acquiring the mapping relation between logical addresses and physical addresses in the metadata;
the address judgment unit is used for judging whether the logical addresses and the physical address are in a many-to-one relationship, and for triggering the parsing result output unit;
the parsing result output unit is used for outputting, according to the judgment result of the address judgment unit, either that one physical address exists in the metadata or that two physical addresses exist in the metadata.
8. The device for testing a deduplication fingerprint threshold based on a snapshot according to claim 7, wherein the test result output module is further used for outputting a write exception prompt when the consistency check module outputs that the check is inconsistent.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor to cause the at least one processor to perform the method for testing a deduplication fingerprint threshold based on a snapshot of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method for testing a deduplication fingerprint threshold based on a snapshot of any one of claims 1-4.
CN202211033617.4A 2022-08-26 2022-08-26 Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot Pending CN115470040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211033617.4A CN115470040A (en) 2022-08-26 2022-08-26 Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211033617.4A CN115470040A (en) 2022-08-26 2022-08-26 Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot

Publications (1)

Publication Number Publication Date
CN115470040A (en) 2022-12-13

Family

ID=84371040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211033617.4A Pending CN115470040A (en) 2022-08-26 2022-08-26 Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot

Country Status (1)

Country Link
CN (1) CN115470040A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116204124A (en) * 2023-02-23 2023-06-02 安超云软件有限公司 Data processing method and system based on conflict lock and electronic equipment
CN116204124B (en) * 2023-02-23 2024-03-22 安超云软件有限公司 Data processing method and system based on conflict lock and electronic equipment

Similar Documents

Publication Publication Date Title
US7103811B2 (en) Mechanisms for detecting silent errors in streaming media devices
US10261705B2 (en) Efficient data consistency verification for flash storage
CN111158948B (en) Data storage and verification method and device based on deduplication and storage medium
CN110727597B (en) Method for checking invalid code completion case based on log
WO2023000674A1 (en) Method and apparatus for data compression, backup and recovery of cloud hard disk, device and storage medium
CN116107516B (en) Data writing method and device, solid state disk, electronic equipment and storage medium
CN109918226A (en) A kind of silence error-detecting method, device and storage medium
CN110674145B (en) Data consistency detection method, device, computer equipment and storage medium
CN115470040A (en) Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot
US10346610B1 (en) Data protection object store
US20120158652A1 (en) System and method for ensuring consistency in raid storage array metadata
CN110222035A (en) A kind of efficient fault-tolerance approach of database page based on exclusive or check and journal recovery
KR101889222B1 (en) Portable storage device perfoming a malignant code detection and method for the same
CN115658404A (en) Test method and system
US11416330B2 (en) Lifecycle of handling faults in next generation storage systems
CN114155906A (en) Data block repairing method, device, equipment and storage medium
CN114415970A (en) Disk fault processing method and device for distributed storage system and server
CN114064361A (en) Data writing method executed in backup related operation and backup gateway system
CN112486717A (en) Method, system, terminal and storage medium for verifying consistency of disk data
CN110008227B (en) Consistency group reliability verification method and related device
CN115390764A (en) Automatic verification method, device, equipment and storage medium for deduplication rate
CN114281246B (en) Cloud hard disk online migration method, device and equipment based on cloud management platform
CN115016988B (en) CDP backup recovery method, system and storage medium based on binary tree log
CN115599589B (en) Data recovery method and related device
CN114153647B (en) Rapid data verification method, device and system for cloud storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination