CN109831487B - Fragmented file verification method and terminal equipment - Google Patents

Fragmented file verification method and terminal equipment

Info

Publication number
CN109831487B
Authority
CN
China
Prior art keywords
data
file
target
fragment
fragmented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910014713.6A
Other languages
Chinese (zh)
Other versions
CN109831487A (en)
Inventor
雷琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910014713.6A
Publication of CN109831487A
Priority to PCT/CN2019/118145 (WO2020143317A1)
Application granted
Publication of CN109831487B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention is applicable to the technical field of computer applications and provides a fragmented file verification method, a terminal device, and a computer-readable storage medium. The method comprises: determining original data to be stored in fragments, fragmenting the original data according to a preset fragmentation mode to obtain fragment data, and sending the fragment data to the corresponding storage nodes; selecting data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests; combining all the target file positions to generate a challenge code, and sending the challenge code to the storage node that stores the fragment file; and receiving verification data sent by the storage node, generating target data according to the target fragment digests and the data generation algorithm, and comparing the verification data with the target data to verify possession of the fragment data, thereby improving the efficiency of data possession verification.

Description

Fragmented file verification method and terminal equipment
Technical Field
The invention belongs to the technical field of computer application, and particularly relates to a fragmented file verification method, terminal equipment and a computer readable storage medium.
Background
A traditional network storage system uses a centralized storage server to store all data. The storage server becomes the bottleneck of system performance and the single point on which reliability and security depend, and it cannot meet the requirements of large-scale storage applications. In the prior art, a distributed network storage system with an expandable architecture uses multiple storage servers to share the storage load and a location server to locate stored information, which improves the reliability, availability, and access efficiency of the system and makes it easy to expand. However, the storage nodes of a distributed network often cannot guarantee the persistence of the data stored on cloud server nodes.
Disclosure of Invention
In view of this, embodiments of the present invention provide a fragmented file verification method, a terminal device, and a computer-readable storage medium, to solve the problem in the prior art that the storage nodes of a distributed network cannot guarantee the persistence of the data stored on cloud server nodes.
A first aspect of an embodiment of the present invention provides a fragmented file verification method, including:
selecting data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests;
randomly combining all the target file positions to obtain a challenge code in the form of a character string, and sending the challenge code to the storage node that stores the fragment file;
receiving verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm;
and generating target data according to the target fragment digests and the data generation algorithm, comparing the verification data with the target data, and determining that the fragment file is correct if the target data is consistent with the verification data.
A second aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:
selecting data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests;
randomly combining all the target file positions to obtain a challenge code in the form of a character string, and sending the challenge code to the storage node that stores the fragment file;
receiving verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm;
and generating target data according to the target fragment digests and the data generation algorithm, comparing the verification data with the target data, and determining that the fragment file is correct if the target data is consistent with the verification data.
A third aspect of an embodiment of the present invention provides a terminal device, including:
a selection unit, configured to select data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests;
a sending unit, configured to randomly combine all the target file positions to obtain a challenge code in the form of a character string, and to send the challenge code to the storage node that stores the fragment file;
a receiving unit, configured to receive verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm;
and a comparison unit, configured to generate target data according to the target fragment digests and the data generation algorithm, compare the verification data with the target data, and determine that the fragment file is correct if the target data is consistent with the verification data.
A fourth aspect of embodiments of the present invention provides a computer-readable storage medium having stored thereon a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of the first aspect described above.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
An embodiment of the invention selects data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests; randomly combines all the target file positions to obtain a challenge code in the form of a character string, and sends the challenge code to the storage node that stores the fragment file; receives verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm; and generates target data according to the target fragment digests and the data generation algorithm, compares the verification data with the target data, and determines that the fragment file is correct if the two are consistent. Because the challenge code is generated from the positions of the fragment digests of the fragment file, the amount of data to be processed and transmitted is reduced and the efficiency of data possession verification is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a fragmented file verification method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a fragmented file verification method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of a terminal device according to a third embodiment of the present invention;
Fig. 4 is a schematic diagram of a terminal device according to a fourth embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to Fig. 1, Fig. 1 is a flowchart of a fragmented file verification method provided in an embodiment of the present invention. The method in this embodiment is executed by a terminal. The terminal includes but is not limited to mobile terminals such as smartphones, tablet computers, and wearable devices, and can also be a desktop computer or the like. The fragmented file verification method shown in the figure can include the following steps:
S101: selecting data of a preset size at each of at least two target file positions in the fragment file to be verified, as target fragment digests.
Fragmentation divides a complete set of data into parts according to certain conditions, and different servers store the resulting parts, each of which is called a fragment. An application does not want to know where the data comes from or how many fragments it was divided into: it needs to see one complete copy of the data, and the question of where to fetch the data, which is irrelevant to the business, should not be mixed into its business logic. Fragmented data is therefore stored and processed separately at the physical level while remaining one complete copy at the logical level.
In this embodiment, a complete source file consists of multiple pieces of fragment data after fragmentation, and these pieces are stored in different storage nodes to reduce the load on the source storage node. However, a storage node may delete or modify the fragment data it stores locally in order to improve its storage efficiency, which causes data errors and prevents normal data processing.
This embodiment verifies the correctness of one fragment file stored in one storage node; multiple fragment digests may be taken from one fragment file. In this scheme, data of a preset size is extracted as a fragment digest according to the storage address of each piece of data in the fragment file. To keep digest extraction and verification objective and comprehensive, at least two fragment digests can be extracted in advance, and each digest is stored together with its position in the fragment file.
When data verification is performed, the fragment file to be verified is determined first. For example, a complete source file can be divided into multiple fragment files, each stored in a different storage node, and data verification is performed on one fragment file in one storage node. The fragment file to be verified may be chosen by random selection over the file identifiers of the fragment files. Alternatively, based on the last verification time of each fragment file, the fragment file whose last verification lies furthest in the past may be selected.
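The two selection strategies described above can be sketched as follows. This is only an illustration: the function names and the representation of last verification times as a dictionary of timestamps are assumptions, not part of the patent text.

```python
import random

def pick_random(fragment_ids, rng):
    """Strategy 1: choose a fragment file at random by its identifier."""
    return rng.choice(list(fragment_ids))

def pick_least_recently_verified(last_verified):
    """Strategy 2: choose the fragment whose last verification is oldest.

    `last_verified` maps a fragment identifier to a timestamp (assumed
    representation); the smallest timestamp is furthest in the past.
    """
    return min(last_verified, key=last_verified.get)

# Hypothetical identifiers and timestamps, for illustration only.
history = {"frag-a": 300.0, "frag-b": 100.0, "frag-c": 200.0}
print(pick_random(history, random.Random()))
print(pick_least_recently_verified(history))
```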
Multiple fragment digests may be taken from one fragment file, and at least two of them are randomly chosen as target fragment digests. In this embodiment, every piece of data in a fragment file has a file position within that file. At least two target file positions are selected from the file positions of the fragment digests, and the fragment digests at those target file positions are the target fragment digests. Note that a fragment digest is data of a preset size at the target file position rather than a single byte, so that one isolated value does not make the digest verification one-sided and the verification result inaccurate.
Furthermore, the target file positions can be selected, and the target fragment digests extracted, in real time for the fragment file. Alternatively, a preset number of file positions and their fragment digests can be extracted first and stored as digest information, so that the required number of target file positions and target fragment digests can later be selected at random from the pre-stored digest information whenever data verification is needed.
For example, a large file is fragmented so that each fragment file holds 16 MB of data on average. We randomly and uniformly choose 20 positions in the fragment file at which to extract fragment digests, take 2 bytes of data at each position as a fragment digest, and store the 20 addresses together with their digests. Each digest occupies 2 bytes and each address occupies 3 bytes, so the total storage is 20 × 2 + 20 × 3 = 100 bytes. These 100 bytes are stored as digest information on a management node or on the terminal of the data holder.
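The digest extraction and the 100-byte storage figure in this example can be sketched as follows. The function names, the use of a seeded random generator, and the small stand-in fragment are assumptions made for illustration; only the sizes (20 positions, 2-byte digests, 3-byte addresses) come from the example above.

```python
import random

DIGEST_SIZE = 2    # bytes taken at each position (from the example)
ADDRESS_SIZE = 3   # bytes used to store each position (from the example)

def extract_digests(fragment, count=20, seed=None):
    """Pick `count` distinct random positions and keep DIGEST_SIZE bytes at each."""
    rng = random.Random(seed)
    # Stay far enough from the end that a full digest can always be read.
    positions = rng.sample(range(len(fragment) - DIGEST_SIZE + 1), count)
    return {pos: fragment[pos:pos + DIGEST_SIZE] for pos in positions}

def summary_storage_bytes(digests):
    """Bytes needed to keep every position together with its digest."""
    return len(digests) * (DIGEST_SIZE + ADDRESS_SIZE)

# A 64 KiB stand-in for a 16 MB fragment file, to keep the sketch fast.
fragment = bytes(random.Random(0).randrange(256) for _ in range(64 * 1024))
digests = extract_digests(fragment, count=20, seed=1)
print(summary_storage_bytes(digests))  # 100
```

A 3-byte address can index 2^24 byte offsets, which covers a fragment of 16 MiB, so the address width in the example matches the fragment size.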
S102: randomly combining all the target file positions to obtain a challenge code in the form of a character string, and sending the challenge code to the storage node that stores the fragment file.
After the target file positions and the target fragment digests are determined, a challenge code is generated by combining the target file positions, and the challenge code is sent to the storage node that stores the fragment file, thereby issuing a verification challenge to that storage node.
Optionally, when the challenge code is generated, all target file positions may be combined in random order and the name of the algorithm to be applied to the combined data appended, yielding a character string. This character string is sent to the storage node as the challenge code for data integrity checking.
For example, the target fragment digests and their target file positions are stored as a lookup table (map): h(k1) = v1, h(k2) = v2, …, h(k20) = v20, where k1 denotes the first file position and v1 denotes the fragment digest at that position. We randomly select 5 file positions as target file positions: k3, k6, k7, k8, k20. These addresses can be combined in any order, for example k6 + k8 + k7 + k20 + k3, and the name of the algorithm used to process the data at the combined addresses, for example the hash algorithm "SHA-1", is appended in one byte. Each target file position occupies 3 bytes, so the length of the challenge code is 3 × 5 + 1 = 16 bytes. The challenge is therefore only 16 bytes long before encoding. After the challenged storage node receives it, the node reads the first 15 bytes as the address field, reads the stored data at those addresses, generates verification data according to the specified algorithm, and replies to the challenger's terminal.
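A minimal sketch of this 16-byte challenge layout follows. The patent does not specify the byte order or how the algorithm name is encoded in one byte, so the big-endian packing and the identifier table below are assumptions, as are the function names.

```python
import random

# Hypothetical one-byte identifiers for the algorithm name.
ALGORITHMS = {1: "sha1", 2: "sha256"}

def build_challenge(positions, algorithm_id, rng):
    """Shuffle the positions, pack each into 3 bytes, append the algorithm byte."""
    shuffled = list(positions)
    rng.shuffle(shuffled)
    body = b"".join(pos.to_bytes(3, "big") for pos in shuffled)
    return body + bytes([algorithm_id])

def parse_challenge(challenge):
    """Split a challenge into its ordered positions and the algorithm name."""
    body, algorithm_id = challenge[:-1], challenge[-1]
    positions = [int.from_bytes(body[i:i + 3], "big")
                 for i in range(0, len(body), 3)]
    return positions, ALGORITHMS[algorithm_id]

# Five illustrative positions standing in for k3, k6, k7, k8, k20.
challenge = build_challenge([3, 6, 7, 8, 20], 1, random.Random())
print(len(challenge))  # 16
```

The order of the positions matters: both sides must hash the digests in the order carried by the challenge, which is why `parse_challenge` preserves it.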
Verifying the fragment data held by a storage node with a challenge code built from randomly selected target file positions avoids the drawbacks of traditional data verification, namely the large amount of computation, the large digest data to be stored, and the slow transmission. It improves the speed and accuracy of fragment data verification without disturbing the normal operation of the storage node.
S103: receiving verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm.
After receiving the challenge code, the storage node parses it according to the preset challenge code format to obtain the target file positions, reads the corresponding target fragment digests locally at those positions, and processes the digests it has read with the algorithm named in the challenge code to generate the verification data.
Illustratively, continuing the example in step S102, the challenge code built from the target file positions k3, k6, k7, k8, k20 and the algorithm name "SHA-1" is sent to the storage node. When the storage node receives the challenge code, the first 15 bytes are the target file positions and the last byte is the algorithm name. From the number of target file positions agreed in advance, the node determines that the 15 bytes contain 5 target file positions. It then reads the digest data at each of the 5 positions from local storage and processes all the digest data with the algorithm named in the challenge code to obtain the verification data.
After the storage node generates the verification data, the verification data is sent to the terminal of the data owner to trigger the terminal of the data owner to check according to the verification data, and a conclusion whether the fragment data stored in the storage node is correct is obtained.
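The storage node's side of the exchange can be sketched as below. It assumes the same hypothetical layout as the example (3-byte big-endian positions followed by one algorithm byte) and assumes that the digests are concatenated in challenge order before hashing; the patent says only that all digest data are processed by the named algorithm.

```python
import hashlib

DIGEST_SIZE = 2                           # bytes read at each position (from the example)
ALGORITHMS = {1: "sha1", 2: "sha256"}     # hypothetical one-byte identifiers

def answer_challenge(fragment, challenge):
    """Read DIGEST_SIZE bytes at every challenged position and hash them in order."""
    body, algorithm_id = challenge[:-1], challenge[-1]
    hasher = hashlib.new(ALGORITHMS[algorithm_id])
    for i in range(0, len(body), 3):
        pos = int.from_bytes(body[i:i + 3], "big")
        hasher.update(fragment[pos:pos + DIGEST_SIZE])
    return hasher.digest()

# Illustrative fragment and a two-position challenge using SHA-1.
fragment = bytes(range(256)) * 4
challenge = (3).to_bytes(3, "big") + (10).to_bytes(3, "big") + bytes([1])
print(answer_challenge(fragment, challenge).hex())
```

Because the node must read the actual bytes at positions it cannot predict, it can answer correctly only if it still holds those parts of the fragment file unmodified.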
S104: generating target data according to the target fragment digests and the data generation algorithm, comparing the verification data with the target data, and determining that the fragment file is correct if the target data is consistent with the verification data.
The terminal of the data owner generates the target data from the target fragment digests and the data generation algorithm in the same way that the storage node generates the verification data: the target fragment digests are first combined, and the combined data is then processed by the data generation algorithm to produce the target data. Note that the data generation algorithm must be the same as the algorithm used to generate the verification data, but it is not limited to the Secure Hash Algorithm SHA-1 of the example above. Other digest algorithms, such as SHA-256 or the Message-Digest Algorithm MD5, may also be used, and no limitation is imposed here.
After the target data are generated, comparing the verification data with the target data, and if the target data are consistent with the verification data, judging that the fragmented files are correct; and if the two data are inconsistent, judging that the fragment file is not correct.
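The comparison on the data owner's terminal might look like the sketch below. The function names are illustrative, the stored digests are assumed to be kept as a position-to-bytes map, and the use of `hmac.compare_digest` for a constant-time comparison is a design choice of this sketch rather than something the patent requires.

```python
import hashlib
import hmac

def expected_response(stored, order, algorithm="sha1"):
    """Hash the locally stored digests in the order used by the challenge."""
    hasher = hashlib.new(algorithm)
    for pos in order:
        hasher.update(stored[pos])
    return hasher.digest()

def fragment_is_correct(stored, order, verification_data, algorithm="sha1"):
    """True if the node's verification data matches the locally computed target data."""
    target = expected_response(stored, order, algorithm)
    return hmac.compare_digest(target, verification_data)

# Illustrative stored digests keyed by file position.
stored = {3: b"ab", 9: b"cd"}
honest_reply = hashlib.sha1(b"cd" + b"ab").digest()
print(fragment_is_correct(stored, [9, 3], honest_reply))  # True
```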
Furthermore, to make the verification result more reliable, more target file positions and their target fragment digests can be used, so that the target data and verification data cover more of the fragment file and the file is verified more comprehensively. Note, however, that although selecting more digest data improves the accuracy of the verification result, it also increases the amount of data to be computed and transmitted. A trade-off between the two is therefore required, and it is preferable to keep the computation and transmission volume low while still ensuring accurate data verification.
In the above scheme, data of a preset size at each of at least two target file positions in the fragment file to be verified is selected as target fragment digests; all the target file positions are randomly combined to obtain a challenge code in the form of a character string, and the challenge code is sent to the storage node that stores the fragment file; verification data sent by the storage node is received, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm; and target data is generated according to the target fragment digests and the data generation algorithm and compared with the verification data, the fragment file being determined to be correct if the two are consistent. Because the challenge code is generated from the positions of the fragment digests of the fragment file, the amount of data to be processed and transmitted is reduced and the efficiency of data possession verification is improved.
Referring to Fig. 2, Fig. 2 is a flowchart of a fragmented file verification method according to a second embodiment of the present invention. The method in this embodiment is executed by a terminal. The terminal includes but is not limited to mobile terminals such as smartphones, tablet computers, and wearable devices, and can also be a desktop computer or the like. The fragmented file verification method shown in the figure can include the following steps:
S201: acquiring original data to be stored in fragments.
In conventional distributed systems, it is common to set up servers in different regions and store data on them. This solves some of the problems of centralized storage, but problems remain, such as the servers becoming a bottleneck and access being hampered by limited bandwidth. P2P distributed storage arose in response. Peer-to-peer (P2P) distributed storage lets each client also act as a server: while storing its own data, it provides space for others to store theirs. This is a good solution to the bottleneck caused by a small number of servers and also improves speed. But it brings many problems of its own: data stability, consistency, security, privacy, and resistance to attack are all affected to some degree.
This embodiment mainly addresses data integrity. In many cases a large volume of data is stored in a P2P node, and the node cannot guarantee the security, privacy, and integrity of that data. Integrity is a more important data attribute than privacy: if integrity is threatened, the data processing system no longer has a complete and safe basis for operating on the data. Because a P2P node is easily attacked and may suffer storage or processing failures, the integrity of the data in the P2P node that currently stores the source data needs to be checked promptly. By adding a salt value to the data in the P2P node, it is verified whether the data stored by the current node is the same as the source data and whether all the original data is stored completely. In this scheme, the source data denotes the initial data, that is, the reference copy for data storage. It is stored on a local server and compared with the data stored in the P2P node to check the correctness of the data in the P2P node.
In this embodiment, the data owner denotes the owner and user of the source data. The data owner can process and send data but, because the source data is large, may send its data to other storage nodes. A storage node in this embodiment is therefore a node used to store fragment data of the data owner's original data.
The original data in this embodiment denotes a large file before fragmentation. Because the file is large, processing, storing, and transmitting it is costly, so it is stored in fragments on different storage nodes.
In this scheme, the original data can be acquired through wired or wireless transmission. It can be acquired after it has been generated, or while it is being generated, in which case acquisition finishes when generation finishes. The method and timing of acquiring the original data to be stored in fragments are not limited here.
S202: fragmenting the original data according to a preset fragmentation mode to obtain fragment files, and sending the fragment files to storage nodes for storage.
After the original data is acquired, because the occupied space of the original data storage is large, the original data is subjected to fragmentation processing, so that fragmented files are stored through different storage nodes.
Further, step S202 in this embodiment may specifically include S2021 to S2024:
S2021: determining data information of the original data, and determining, according to the data information, the storage space occupied by each fragment file of the original data.
After the original data is obtained, according to data information of the original data, storage space capacity occupied by each fragment file corresponding to the original data is determined. Fragmentation is the basic unit of distributed data distribution, and different types of applications have different fragmentation modes and strategies. With the fragmentation distributing data over different physical devices, more devices are introduced to serve the same data. Regardless of the probability, each device has a certain chance to fail.
Assuming that every machine has the same chance of failing, the chance that some device in the cluster fails grows as more machines take part in the service. For the cluster as a whole, when the per-device failure probability is small, the probability of a failure is approximately the per-device failure probability multiplied by the number of devices. That is, if any device in the cluster fails, the whole cluster is in an unhealthy state, which affects the services it provides. Redundancy, also referred to here as replication, is required to address this problem. Its purpose is that, when a device in the cluster fails, a device holding the same data takes its place and continues to provide the service, so that the cluster still appears healthy from the outside. The existence of replicas also has an additional use, namely read-write separation.
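The relationship between per-device and cluster-wide failure probability can be made concrete with a short sketch. It assumes that devices fail independently with the same probability p, which the text implies but does not state; under that assumption the exact value is 1 − (1 − p)^n, and the product n · p is a close upper estimate when p is small.

```python
def cluster_failure_probability(p, n):
    """Probability that at least one of n independent devices fails."""
    return 1 - (1 - p) ** n

# Illustrative numbers: 1% per-device failure probability.
for n in (1, 10, 50):
    exact = cluster_failure_probability(0.01, n)
    print(n, round(exact, 4), "approximation n*p =", round(0.01 * n, 4))
```

The approximation degrades as n · p grows, which is one way to see why redundancy becomes more important as fragments are spread over more devices.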
Further, step S2021 may specifically include steps S20211 to S20212:
S20211: determining the data importance, the data freshness, and the data volume of the original data.
The data information in this embodiment may include data importance, data freshness, and data volume, all of which are defined in advance. The data importance measures how important the original data is; for example, data can be classified as first level, second level, or third level by importance. The data freshness measures the time difference between the moment the original data was generated and the current time. Newer client data can be expected to be used more often, whereas older data is stale; because stale data is called very rarely, it can be divided into larger fragment files. The data volume indicates the storage space occupied by a piece of original data; the larger the data volume, the more fragments may be needed to fragment and store the data in a balanced way.
S20212: determining the space occupied by a fragment file of the original data according to the data importance, the data freshness, and the data volume.
After determining the data importance, the data freshness and the data volume of the original data, determining the occupied space of the fragment files of the original data according to the data importance, the data freshness and the data volume.
Specifically, assuming that the data importance, the data freshness, and the data volume are v, t, and d respectively, the space occupied by a fragment file can be determined with a simple formula: occ = d / (v · t).
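The formula can be written as a small function. The patent gives no units or value ranges for v, t, and d, so the numbers below are purely illustrative. The sketch also assumes that t is a score that is larger for fresher data; that reading makes stale data (small t) produce larger fragment files, as the description of data freshness suggests, but it is an interpretation rather than something the text states.

```python
def fragment_occupied_space(d, v, t):
    """occ = d / (v * t): space occupied by one fragment file.

    d: data volume, v: data importance, t: data freshness
    (assumed here to be positive scores; larger t means fresher data).
    """
    if v <= 0 or t <= 0:
        raise ValueError("importance and freshness must be positive")
    return d / (v * t)

# Illustrative values only: the same volume and importance, fresh versus stale.
print(fragment_occupied_space(d=1024, v=2, t=4))  # 128.0
print(fragment_occupied_space(d=1024, v=2, t=1))  # 512.0
```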
S2022: fragmenting the original data according to the storage space to obtain the fragment files and the number of fragments.
After the occupied space of the fragment file is determined, the original data is fragmented according to the occupied space, and the fragmentation mode can be that the original data is divided evenly according to the size of the occupied space to obtain the fragment file and the corresponding number of fragments.
Alternatively, the original data can be fragmented at randomly chosen file positions, taking into account its file format and the space occupied by a fragment file; after the data has been split at those file positions, any remaining data is merged into a fragment file.
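A minimal sketch of the even split follows. The function name is illustrative, and the handling of the remainder (kept as a shorter final fragment) is an assumption of this sketch; the text also allows the remaining data to be merged into another fragment file.

```python
def split_into_fragments(data, fragment_size):
    """Cut `data` into consecutive pieces of at most `fragment_size` bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size]
            for i in range(0, len(data), fragment_size)]

fragments = split_into_fragments(b"abcdefghij", 4)
print(fragments)       # [b'abcd', b'efgh', b'ij']
print(len(fragments))  # 3 fragments
```

Joining the fragments in order reproduces the original data, which is what allows the source file to be reassembled from the storage nodes.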
S2023: acquiring the operating parameters of each storage node in the network, and determining, according to the operating parameters, as many target storage nodes as there are fragments.
After the fragment file is generated, the fragment file needs to be stored in the corresponding storage node, but the operation conditions of the storage nodes in the network are different, and the target storage nodes with the same number as the number of the fragments are determined according to the operation parameters by acquiring the operation parameters of each storage node in the network. The operation parameters may include data such as memory occupancy rate and storage space occupancy rate of the storage node at the current time, and the storage node with the smaller memory occupancy rate and storage space occupancy rate is selected as the target storage node.
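The node selection of S2023 can be sketched as follows. The `NodeStatus` type, its field names, and the ranking by the sum of the two occupancy rates are assumptions for illustration; the description only states that nodes with lower memory and storage occupancy are preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    node_id: str
    memory_occupancy: float   # fraction in [0, 1], assumed
    storage_occupancy: float  # fraction in [0, 1], assumed


def select_target_nodes(nodes: list[NodeStatus], fragment_count: int) -> list[NodeStatus]:
    """Pick the fragment_count least-loaded nodes (lowest combined occupancy)."""
    if fragment_count > len(nodes):
        raise ValueError("not enough storage nodes for one fragment per node")
    ranked = sorted(nodes, key=lambda n: n.memory_occupancy + n.storage_occupancy)
    return ranked[:fragment_count]


nodes = [
    NodeStatus("node-a", 0.80, 0.60),
    NodeStatus("node-b", 0.20, 0.30),
    NodeStatus("node-c", 0.40, 0.10),
]
targets = select_target_nodes(nodes, 2)
```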
S2024: Send each fragment file to its corresponding target storage node.
After the target storage nodes are determined, each fragment file is sent to its corresponding target storage node. It should be noted that, in this solution, the number of target storage nodes equals the number of fragment files, so that the fragment files are sent to and stored in the corresponding storage nodes efficiently and in order.
In addition, the number of storage nodes may be smaller than the number of fragment files, in which case two or more fragment files are stored in one storage node. This applies when the network contains few storage nodes, or when few storage nodes are available for storing fragment files; storing at least two fragment files in one storage node improves the utilization rate of the storage nodes.
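When there are fewer storage nodes than fragment files, one possible assignment is round-robin, sketched below. The round-robin policy and the identifiers are assumptions; the description only requires that a node may hold two or more fragment files.

```python
def assign_fragments(fragment_ids: list[str], node_ids: list[str]) -> dict[str, list[str]]:
    """Distribute fragments over nodes in round-robin order."""
    if not node_ids:
        raise ValueError("at least one storage node is required")
    assignment: dict[str, list[str]] = {node_id: [] for node_id in node_ids}
    for index, fragment_id in enumerate(fragment_ids):
        assignment[node_ids[index % len(node_ids)]].append(fragment_id)
    return assignment


plan = assign_fragments(["f0", "f1", "f2", "f3", "f4"], ["n0", "n1"])
```

With equal numbers of fragments and nodes, the same function yields one fragment per node, which matches the default case of S2024.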
S203: Randomly extract, according to the file position of each piece of data in the fragment file, a preset number of pieces of data of a preset occupied space from the fragment file as the fragment digests of the fragment file.
After the fragment file is stored in the target storage node, a preset number of pieces of data are randomly extracted from the fragment file as fragment digests according to the file position of each piece of data in the fragment file. It should be noted that each extracted piece is not a single byte but data of a preset occupied space.
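The random extraction of S203 can be illustrated with the sketch below. The function name, the parameter names `count` (preset number) and `span` (preset occupied space, in bytes), and the decision to allow overlapping samples are assumptions for the example.

```python
import random
from typing import Optional


def extract_fragment_digests(
    fragment: bytes,
    count: int,
    span: int,
    rng: Optional[random.Random] = None,
) -> list[tuple[int, bytes]]:
    """Return count (position, data) pairs, each data piece span bytes long."""
    rng = rng or random.Random()
    last_start = len(fragment) - span
    if span <= 0 or last_start < 0 or count > last_start + 1:
        raise ValueError("fragment too small for the requested samples")
    # Distinct start positions; the sampled windows themselves may overlap.
    positions = sorted(rng.sample(range(last_start + 1), count))
    return [(pos, fragment[pos:pos + span]) for pos in positions]


fragment = bytes(range(256))
samples = extract_fragment_digests(fragment, count=20, span=8, rng=random.Random(7))
```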
S204: Store the fragment digests in a lookup-table (MAP) storage mode, according to the file position of each fragment digest in the fragment file.
After the fragment digests at the preset number of file positions are obtained, the result is a combination of file addresses and the data of the preset data volume at the corresponding positions. The fragment digests are then stored, keyed by the file position of each fragment digest in the fragment file, in a lookup-table (MAP) storage mode.
For example, the target fragment digests and their target file positions are stored in MAP form: h(k1) = v1, h(k2) = v2, …, h(k20) = v20, where k1 denotes the first file position and v1 denotes the fragment digest corresponding to the first file position.
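A lookup table of the form h(k) = v can be sketched with nested dictionaries, as below. The `DigestStore` class, its method names, and the extra level keyed by a fragment identifier are assumptions added so that digests of several fragment files can share one store.

```python
class DigestStore:
    """Maps fragment id -> {file position k: fragment digest v}."""

    def __init__(self) -> None:
        self._table: dict[str, dict[int, bytes]] = {}

    def put(self, fragment_id: str, position: int, digest: bytes) -> None:
        self._table.setdefault(fragment_id, {})[position] = digest

    def get(self, fragment_id: str, position: int) -> bytes:
        return self._table[fragment_id][position]

    def positions(self, fragment_id: str) -> list[int]:
        return sorted(self._table.get(fragment_id, {}))


store = DigestStore()
store.put("frag-0", 16, b"abcd")
store.put("frag-0", 4, b"wxyz")
```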
S205: Select data of a preset data volume at at least two target file positions in the fragment file to be verified as target fragment digests.
In this embodiment, S205 is implemented in exactly the same way as S101 in the embodiment corresponding to fig. 1; refer to the description of S101 in that embodiment, which is not repeated here.
S206: Randomly combine all the target file positions to obtain a challenge code in character string form, and send the challenge code to the storage node that stores the fragment file.
In this embodiment, S206 is implemented in exactly the same way as S102 in the embodiment corresponding to fig. 1; refer to the description of S102 in that embodiment, which is not repeated here.
S207: Receive verification data sent by the storage node, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm.
In this embodiment, S207 is implemented in exactly the same way as S103 in the embodiment corresponding to fig. 1; refer to the description of S103 in that embodiment, which is not repeated here.
S208: Generate target data according to the target fragment digests and the data generation algorithm, compare the verification data with the target data, and determine that the fragment file is correct if the target data is consistent with the verification data.
The terminal of the data owner generates the target data according to the target fragment digests and the data generation algorithm, in the same way that the storage node generates the verification data: the target fragment digests are first combined, and the combined data is then turned into the target data by the data generation algorithm. It should be noted that the algorithm used to generate the target data and the algorithm used to generate the verification data are the same. The algorithm is not limited to the SHA algorithm in the above example and may be another digest algorithm, such as SHA256 or MD5; it is not limited here. After the target data is generated, the verification data is compared with the target data, and if the two are consistent, the fragment file is determined to be correct.
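The exchange in S205 to S208 can be sketched end to end as follows. Several details are assumptions for illustration: the challenge code is encoded as comma-separated positions in shuffled order, the digests are combined by concatenation in challenge order, SHA-256 from Python's `hashlib` stands in for the data generation algorithm, and all function names are hypothetical.

```python
import hashlib
import random
from typing import Iterable


def make_challenge(positions: Iterable[int], rng: random.Random) -> str:
    """Randomly combine the target file positions into a string challenge code."""
    shuffled = list(positions)
    rng.shuffle(shuffled)
    return ",".join(str(p) for p in shuffled)


def parse_challenge(challenge: str) -> list[int]:
    return [int(p) for p in challenge.split(",")]


def node_verification_data(stored: bytes, challenge: str, span: int) -> str:
    """Storage node side: read span bytes at each position, combine, hash."""
    combined = b"".join(stored[p:p + span] for p in parse_challenge(challenge))
    return hashlib.sha256(combined).hexdigest()


def owner_target_data(digests: dict[int, bytes], challenge: str) -> str:
    """Data owner side: combine the stored target digests in the same order, hash."""
    combined = b"".join(digests[p] for p in parse_challenge(challenge))
    return hashlib.sha256(combined).hexdigest()


def fragment_is_correct(digests: dict[int, bytes], challenge: str, verification: str) -> bool:
    return owner_target_data(digests, challenge) == verification
```

Because only the challenge string and one hash value cross the network, the amount of transmitted data does not grow with the size of the fragment file.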
Further, the present embodiment may further include steps S2081 to S2083:
S2081: If the target data is inconsistent with the verification data, determine that the fragment file is incorrect.
After the target data is generated, the verification data is compared with the target data, and if the two are inconsistent, the fragment file is determined to be incorrect. The inconsistency may arise because the storage node modified or deleted the stored fragment data itself, or because the storage node failed to store the data.
S2082: Determine, according to the verification data and the target data, the inconsistent data and its file position in the fragment file.
After the fragment file is determined to be incorrect, the inconsistent data and its file position in the fragment file are determined according to the verification data and the target data.
Specifically, the verification data is compared with the target data, with the target data taken as the reference. The comparison identifies the data in the fragment file that corresponds to the inconsistent digest in the verification data, together with the file position of that data in the fragment file.
Further, since pinpointing only the inconsistent data demands high accuracy, the file region containing the inconsistent data can be determined instead. Locating inconsistent file positions by region reduces the difficulty of finding errors and improves the efficiency of data error correction.
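The description does not say how a mismatch is traced back to a file position. One possible approach, sketched below, assumes the storage node can be asked to return the data (or a per-position digest) for each challenged position, so that each position can be compared individually and then mapped to a fixed-size file region. The function names and the region size are hypothetical.

```python
def find_inconsistent_positions(
    expected: dict[int, bytes], returned: dict[int, bytes]
) -> list[int]:
    """Positions whose returned data differs from (or is missing versus) the reference."""
    return sorted(p for p, sample in expected.items() if returned.get(p) != sample)


def to_regions(positions: list[int], region_size: int) -> list[tuple[int, int]]:
    """Map positions to the half-open file regions [start, end) that contain them."""
    starts = sorted({(p // region_size) * region_size for p in positions})
    return [(start, start + region_size) for start in starts]


expected = {0: b"aa", 10: b"bb", 130: b"cc"}
returned = {0: b"aa", 10: b"bX", 130: b"cY"}
bad_positions = find_inconsistent_positions(expected, returned)
bad_regions = to_regions(bad_positions, region_size=64)
```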
S2083: According to the file position of the inconsistent data in the fragment file, determine from the original data the correct data corresponding to that file position, and send the correct data to the storage node for data replacement.
After the file position of the inconsistent data in the fragment file is determined, the correct data corresponding to that file position is determined from the original data and sent to the storage node for data replacement, so that the storage node stores the correct, up-to-date data.
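The replacement step of S2083 can be sketched as follows. It assumes the data owner knows the offset at which each fragment file begins within the original data, and that the storage node overwrites bytes in place; both function names are hypothetical.

```python
def correct_data_for(original: bytes, fragment_offset: int, position: int, span: int) -> bytes:
    """Look up the correct bytes for a fragment-relative position in the original data."""
    start = fragment_offset + position
    return original[start:start + span]


def apply_replacement(stored: bytearray, position: int, correct: bytes) -> None:
    """Storage node side: overwrite the inconsistent bytes with the correct ones."""
    stored[position:position + len(correct)] = correct


original = b"0123456789abcdef"
stored_fragment = bytearray(b"89aXXdef")  # fragment starting at offset 8, corrupted at 3..4
patch = correct_data_for(original, fragment_offset=8, position=3, span=2)
apply_replacement(stored_fragment, 3, patch)
```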
Further, since the fragment data in the storage node contains an error, the data owner has the right to investigate the cause of the data error at the storage node and to handle the storage node accordingly, for example by limiting the authority of the storage node.
According to the scheme, original data to be stored in fragments is obtained; the original data is fragmented according to a preset fragmentation mode to obtain a fragment file, and the fragment file is sent to a storage node for storage; a preset number of pieces of data of a preset occupied space are randomly extracted from the fragment file as its fragment digests, according to the file position of each piece of data in the fragment file; and the fragment digests are stored in a lookup-table (MAP) storage mode according to the file position of each fragment digest in the fragment file. Data of a preset data volume at at least two target file positions in the fragment file to be verified is then selected as target fragment digests; all the target file positions are randomly combined to obtain a challenge code in character string form, and the challenge code is sent to the storage node that stores the fragment file; verification data sent by the storage node is received, the verification data being generated according to the target file positions in the challenge code and a preset data generation algorithm; and target data is generated according to the target fragment digests and the data generation algorithm, the verification data is compared with the target data, and the fragment file is determined to be correct if the two are consistent.
Original data to be stored in fragments is determined and fragmented according to a preset fragmentation mode to obtain fragment data, which is sent to the corresponding storage nodes. A fragment digest is generated for each piece of fragment data, all the fragment digests are stored, and holding verification is performed on the fragment data according to the fragment digests, which improves the efficiency of data holding verification.
Referring to fig. 3, fig. 3 is a schematic diagram of a terminal device according to a third embodiment of the present invention. The terminal device includes units for executing the steps in the embodiments corresponding to fig. 1 to fig. 2. Please refer to the related description of the embodiments in fig. 1-2. For convenience of explanation, only the portions related to the present embodiment are shown. The terminal device 300 of the present embodiment includes:
a selecting unit 301, configured to select data of a preset data size at least two target file positions in a fragmented file to be verified as a target fragmented summary;
a sending unit 302, configured to randomly combine the positions of all the target files to obtain a challenge code in a character string form, and send the challenge code to a storage node that stores the fragmented file;
a receiving unit 303, configured to receive verification data sent by the storage node; the verification data is generated according to the target file position in the challenge code and a preset data generation algorithm;
a comparing unit 304, configured to generate target data according to the target fragment digest and the data generating algorithm, compare the verification data with the target data, and determine that the fragment file is correct if the target data is consistent with the verification data.
Further, the terminal device may further include:
an acquisition unit, configured to acquire original data to be stored in fragments;
a fragmentation unit, configured to fragment the original data according to a preset fragmentation mode to obtain a fragment file, and send the fragment file to a storage node for storage;
an extraction unit, configured to randomly extract, according to the file position of each piece of data in the fragment file, a preset number of pieces of data of a preset occupied space from the fragment file as the fragment digests of the fragment file;
a storage unit, configured to store the fragment digests in a lookup-table (MAP) storage mode according to the file position of each fragment digest in the fragment file.
Further, the fragmentation unit may include:
an information determining unit, configured to determine data information of the original data, and determine, according to the data information, the storage space capacity occupied by each fragment file corresponding to the original data;
a file fragmentation unit, configured to fragment the original data according to the storage space capacity to obtain the fragment files and the number of fragments;
a node determining unit, configured to acquire the operating parameters of each storage node in the network, and determine, according to the operating parameters, as many target storage nodes as there are fragments;
a file sending unit, configured to send each fragment file to its corresponding target storage node.
Further, the terminal device may further include:
a determination unit, configured to determine that the fragment file is incorrect if the target data is inconsistent with the verification data;
a position determining unit, configured to determine, according to the verification data and the target data, the inconsistent data and its file position in the fragment file;
a data replacement unit, configured to determine from the original data, according to the file position of the inconsistent data in the fragment file, the correct data corresponding to that file position, and send the correct data to the storage node for data replacement.
Further, the information determining unit may include:
a first determining unit, configured to determine the data importance, the data freshness and the data volume of the original data;
a second determining unit, configured to determine the occupied space of the fragment file of the original data according to the data importance, the data freshness and the data volume.
According to the scheme, data with preset data volume at the positions of at least two target files in the fragmented files to be verified are selected as target fragmented summaries; randomly combining all the target file positions to obtain a challenge code in a character string form, and sending the challenge code to a storage node for storing the fragment file; receiving verification data sent by the storage node; the verification data is generated according to the target file position in the challenge code and a preset data generation algorithm; and generating target data according to the target fragment abstract and the data generation algorithm, comparing the verification data with the target data, and judging that the fragment file is correct if the target data is consistent with the verification data. The challenge code is generated by the fragment abstract of the fragment file to carry out fragment verification, so that the data processing amount and the transmission amount are reduced, and the data holding verification efficiency is improved.
Fig. 4 is a schematic diagram of a terminal device according to a fourth embodiment of the present invention. As shown in fig. 4, the terminal device 4 of this embodiment includes: a processor 40, a memory 41 and a computer program 42 stored in said memory 41 and executable on said processor 40. The processor 40, when executing the computer program 42, implements the steps in the above-described embodiments of the fragmented file authentication method, such as the steps 101 to 104 shown in fig. 1. Alternatively, the processor 40, when executing the computer program 42, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the units 301 to 304 shown in fig. 3.
Illustratively, the computer program 42 may be partitioned into one or more modules/units that are stored in the memory 41 and executed by the processor 40 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 42 in the terminal device 4.
The terminal device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The terminal device may include, but is not limited to, a processor 40, a memory 41. Those skilled in the art will appreciate that fig. 4 is merely an example of a terminal device 4 and does not constitute a limitation of terminal device 4 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the terminal device may also include input-output devices, network access devices, buses, etc.
The processor 40 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or a memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card, FC), and the like provided on the terminal device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, to instruct related hardware.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A method for verifying fragmented files, comprising:
selecting data with preset data size at the positions of at least two target files in the fragmented files to be verified as target fragmented summaries;
randomly combining all the target file positions to obtain a challenge code in a character string form, and sending the challenge code to a storage node for storing the fragment file;
receiving verification data sent by the storage node; the verification data is generated according to target fragment abstract data at the position of a target file in the challenge code and a preset data generation algorithm;
and generating target data according to the target fragment abstract and the data generation algorithm, comparing the verification data with the target data, and judging that the fragment file is correct if the target data is consistent with the verification data.
2. The method for validating fragmented files according to claim 1, wherein before selecting the data of the preset data size at the positions of at least two target files in the fragmented file to be validated as the target fragment digest, the method further comprises:
acquiring original data to be stored in a fragmentation mode;
fragmenting the original data according to a preset fragmentation mode to obtain a fragmentation file, and sending the fragmentation file to a storage node for storage;
randomly extracting data with preset quantity and preset occupied space from the fragment files as fragment abstracts of the fragment files according to the file position of each data in the fragment files;
and storing the fragment summaries according to the file positions of the fragment summaries in the fragment files and the mode of storing MAP by a lookup table.
3. The method for verifying the fragmented file according to claim 2, wherein the fragmenting the original data according to a preset fragmentation mode to obtain a fragmented file, and sending the fragmented file to a storage node for storage includes:
determining data information of the original data, and determining the storage space capacity occupied by each fragment file corresponding to the original data according to the data information;
fragmenting the original data according to the capacity of the storage space to obtain fragmented files and the number of fragments;
acquiring operation parameters of each storage node in a network, and determining target storage nodes with the same number as the number of the fragments according to the operation parameters;
and sending each fragment file to a corresponding target storage node.
4. The fragmented file validation method of any of claims 1-3, wherein the method further comprises:
if the target data is inconsistent with the verification data, judging that the fragment file is incorrect;
according to verification data and the target data, determining inconsistent data and file positions of the inconsistent data in the fragmented files;
and according to the file position of the data with the inconsistency in the fragment file, determining correct data corresponding to the data with the inconsistency at the file position in the fragment file from original data, and sending the correct data to the storage node for data replacement.
5. The method for verifying the fragmented file according to claim 3, wherein the determining the data information of the original data and determining the storage space capacity occupied by the fragmented file corresponding to the original data according to the data information includes:
determining the data importance, the data freshness and the data volume of the original data;
and determining the occupied space of the fragment file of the original data according to the data importance, the data freshness and the data volume.
6. A terminal device comprising a memory and a processor, the memory having stored therein a computer program operable on the processor, wherein the processor, when executing the computer program, implements the steps of:
selecting data with preset data size at the positions of at least two target files in the fragmented files to be verified as target fragmented summaries;
randomly combining all the target file positions to obtain a challenge code in a character string form, and sending the challenge code to a storage node for storing the fragment file;
receiving verification data sent by the storage node; the verification data is generated according to target fragment abstract data at the position of a target file in the challenge code and a preset data generation algorithm;
and generating target data according to the target fragment abstract and the data generation algorithm, comparing the verification data with the target data, and judging that the fragment file is correct if the target data is consistent with the verification data.
7. The terminal device of claim 6, wherein before selecting the data of the preset data size at the positions of at least two target files in the fragmented file to be verified as the target fragment digest, the method further comprises:
acquiring original data to be stored in a fragmentation mode;
fragmenting the original data according to a preset fragmentation mode to obtain a fragmentation file, and sending the fragmentation file to a storage node for storage;
randomly extracting data with preset quantity and preset occupied space from the fragment files as fragment abstracts of the fragment files according to the file position of each data in the fragment files;
and storing the fragment summaries according to the file positions of the fragment summaries in the fragment files and the mode of storing MAP by a lookup table.
8. The terminal device of claim 7, wherein the fragmenting the original data in a preset fragmentation manner to obtain a fragmented file, and sending the fragmented file to a storage node for storage includes:
determining data information of the original data, and determining the storage space capacity occupied by each fragment file corresponding to the original data according to the data information;
fragmenting the original data according to the capacity of the storage space to obtain fragmented files and the number of fragments;
acquiring operation parameters of each storage node in a network, and determining target storage nodes with the same number as the number of the fragments according to the operation parameters;
and sending each fragment file to a corresponding target storage node.
9. A terminal device, comprising:
a selection unit, configured to select data of a preset data volume at at least two target file positions in the fragmented file to be verified as a target fragment digest;
a sending unit, configured to randomly combine all the target file positions to obtain a challenge code in character string form, and send the challenge code to a storage node that stores the fragmented file;
a receiving unit, configured to receive verification data sent by the storage node, the verification data being generated according to the target fragment digest data at the target file positions in the challenge code and a preset data generation algorithm;
and a comparison unit, configured to generate target data according to the target fragment digest and the data generation algorithm, compare the verification data with the target data, and determine that the fragmented file is correct if the target data is consistent with the verification data.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201910014713.6A 2019-01-08 2019-01-08 Fragmented file verification method and terminal equipment Active CN109831487B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910014713.6A CN109831487B (en) 2019-01-08 2019-01-08 Fragmented file verification method and terminal equipment
PCT/CN2019/118145 WO2020143317A1 (en) 2019-01-08 2019-11-13 Fragmented file verification method and terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910014713.6A CN109831487B (en) 2019-01-08 2019-01-08 Fragmented file verification method and terminal equipment

Publications (2)

Publication Number Publication Date
CN109831487A CN109831487A (en) 2019-05-31
CN109831487B true CN109831487B (en) 2022-05-13

Family

ID=66860116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910014713.6A Active CN109831487B (en) 2019-01-08 2019-01-08 Fragmented file verification method and terminal equipment

Country Status (2)

Country Link
CN (1) CN109831487B (en)
WO (1) WO2020143317A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109831487B (en) * 2019-01-08 2022-05-13 平安科技(深圳)有限公司 Fragmented file verification method and terminal equipment
CN111031110B (en) * 2019-11-29 2023-01-24 山东英信计算机技术有限公司 File uploading method and device, electronic equipment and storage medium
CN111176567B (en) * 2019-12-25 2023-11-03 上海新沄信息科技有限公司 Storage supply verification method and device for distributed cloud storage
CN112667623A (en) * 2021-01-13 2021-04-16 张立旭 Random algorithm-based distributed storage data error correction method and system
CN113076283B (en) * 2021-04-06 2022-02-18 中移(上海)信息通信科技有限公司 File consistency verification method and device and electronic equipment
CN113726838B (en) * 2021-06-17 2023-09-19 武汉理工数字传播工程有限公司 File transmission method, device, equipment and storage medium
CN113407492B (en) * 2021-06-18 2024-03-26 中国人民银行清算总中心 Method and device for storing file fragments and reorganizing file fragments and file protection system
CN113590994A (en) * 2021-08-02 2021-11-02 北京金山云网络技术有限公司 Data processing method, data processing device, computer equipment and storage medium
CN115729918A (en) * 2021-08-31 2023-03-03 华为技术有限公司 Data fragmentation method and device and electronic equipment
CN114567496B (en) * 2022-03-03 2024-02-20 浪潮云信息技术股份公司 Method and system for checking integrity of cloud server mirror image
CN114760068A (en) * 2022-04-08 2022-07-15 中国银行股份有限公司 User identity authentication method, system, electronic device and storage medium
CN115811411A (en) * 2022-05-16 2023-03-17 浪潮软件股份有限公司 Tamper-proof information transmission method, system, device and computer readable medium
CN116233120B (en) * 2023-05-10 2023-07-14 深圳普菲特信息科技股份有限公司 Large file fragment transmission method, system and medium based on data processing

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106033427A (en) * 2015-03-11 2016-10-19 阿里巴巴集团控股有限公司 A sampling data verification method and device
CN108664221A (en) * 2018-05-11 2018-10-16 北京奇虎科技有限公司 Data proof of possession method, apparatus and readable storage medium
CN109104449A (en) * 2017-06-21 2018-12-28 北京大学 Method for proving possession of multi-backup data in a cloud storage environment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140141348A (en) * 2013-05-31 2014-12-10 삼성전자주식회사 Storage system and Method for performing deduplication in conjunction with host device and storage device
US20170134162A1 (en) * 2015-11-10 2017-05-11 Shannon Code System and process for verifying digital media content authenticity
US20180219871A1 (en) * 2017-02-01 2018-08-02 Futurewei Technologies, Inc. Verification of fragmented information centric network chunks
CN108737109A (en) * 2018-05-11 2018-11-02 北京奇虎科技有限公司 Data proof of possession method, apparatus and system
CN109831487B (en) * 2019-01-08 2022-05-13 平安科技(深圳)有限公司 Fragmented file verification method and terminal equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106033427A (en) * 2015-03-11 2016-10-19 阿里巴巴集团控股有限公司 A sampling data verification method and device
CN109104449A (en) * 2017-06-21 2018-12-28 北京大学 Method for proving possession of multi-backup data in a cloud storage environment
CN108664221A (en) * 2018-05-11 2018-10-16 北京奇虎科技有限公司 Data proof of possession method, apparatus and readable storage medium

Also Published As

Publication number Publication date
CN109831487A (en) 2019-05-31
WO2020143317A1 (en) 2020-07-16

Similar Documents

Publication Publication Date Title
CN109831487B (en) Fragmented file verification method and terminal equipment
CN108446407B (en) Database auditing method and device based on block chain
CN109889505B (en) Data consistency verification method and terminal equipment
US10747721B2 (en) File management/search system and file management/search method based on block chain
CN108319719B (en) Database data verification method and device, computer equipment and storage medium
CN111163182B (en) Block chain-based device registration method and apparatus, electronic device, and storage medium
US20180004852A1 (en) Method and system for facilitating terminal identifiers
CN110543448A (en) Data synchronization method, device, equipment and computer readable storage medium
CN112907369B (en) Block chain-based data consensus method and device, electronic equipment and storage medium
US20210158353A1 (en) Methods, systems, apparatuses, and devices for processing request in consortium blockchain
CN111262822B (en) File storage method, device, blockchain node and system
CN109145651B (en) Data processing method and device
CN111597567A (en) Data processing method, data processing device, node equipment and storage medium
CN111367923A (en) Data processing method, data processing device, node equipment and storage medium
CN112069169A (en) Block data storage method and device, electronic equipment and readable storage medium
CN111339551B (en) Data verification method and related device and equipment
US20140279946A1 (en) System and Method for Automatic Integrity Checks in a Key/Value Store
WO2021174882A1 (en) Data fragment verification method, apparatus, computer device, and readable storage medium
CN113542405A (en) Block chain-based network communication system, method, device and storage medium
CN111835871A (en) Method and device for transmitting data file and method and device for receiving data file
CN110209347B (en) Traceable data storage method
CN107395772B (en) Management method and management system for repeated data
CN112579591A (en) Data verification method and device, electronic equipment and computer readable storage medium
CN115883533A (en) File synchronization method and device, computer equipment and storage medium
CN115935414A (en) Block chain based data verification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant