CN110989922A - Distributed data storage method and system - Google Patents

Info

Publication number
CN110989922A
Authority
CN
China
Prior art keywords
data
storage
data block
verification
marking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911033427.0A
Other languages
Chinese (zh)
Other versions
CN110989922B (en)
Inventor
李俊波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fiberhome Telecommunication Technologies Co Ltd
Original Assignee
Fiberhome Telecommunication Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fiberhome Telecommunication Technologies Co Ltd
Priority to CN201911033427.0A
Publication of CN110989922A
Application granted
Publication of CN110989922B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a distributed data storage method and system. A data storage request sent by a user side is received, and a data block distribution strategy in one-to-one correspondence with the data storage identifier is generated according to the data storage size and a preset distributed algorithm; first marking data in one-to-one correspondence with the data blocks is generated according to a preset key generation algorithm, and each data block is stored together with its first marking data at the same storage position on each of a plurality of storage nodes according to the data block distribution strategy. When a data reading request sent by the user side is received, the data block distribution strategy is acquired according to the data storage identifier, and a data block and its first marking data are read from a storage node and verified. If the verification succeeds, the read data block is sent to the user side; if it fails, the data block and the first marking data at the same storage position on the next storage node are read and the verification is repeated. The storage nodes in the data block distribution strategy are traversed according to the storage node reading sequence until the verification succeeds; if no storage node passes the verification, user data damage information is sent to the user side.

Description

Distributed data storage method and system
Technical Field
The invention belongs to the field of distributed data storage, and particularly relates to a distributed data storage method and system.
Background
Distributed storage is a storage system that generally comprises a plurality of independent devices connected through a network. When data is written, an algorithm distributes the data blocks evenly across the independent storage devices; when data is read, the algorithm locates the data blocks on the different nodes and returns them to the client. Depending on the application scenario, the system must meet requirements for reliability, availability and scalability.
Generally, to store data, a client uses an algorithm to address the data to a specific storage node over the network and then stores it there. The data block size is typically 64 KB. To read data, the client addresses a specific storage node through the algorithm over the network, and a 64 KB data block is read and returned to the client. The integrity and consistency of the data on the storage nodes are checked by a dedicated polling process, but this check runs only at intervals, and a 64 KB data block undergoes no verification on its way from the storage node to the client. If a data block is tampered with between two checks, the block that is read may therefore contain erroneous data.
Disclosure of Invention
To overcome the above defects of, or meet the needs for improvement in, the prior art, the invention provides a distributed data storage method and system. A data block distribution strategy is generated upon receiving a data storage request sent by a user side, and each data block is stored together with its first marking data at the same storage position on each of a plurality of storage nodes according to that strategy. When a data reading request sent by the user side is received, a data block and its first marking data are read from a storage node according to the data block distribution strategy and verified; if the verification fails, the copy on the next storage node is read and verified, and if the verification succeeds, the read data block is sent to the user side. Thus, when a data block on a storage node is erroneous, the same data block can be found on another node by polling the nodes under the verification mechanism, and the user's correct data is ultimately returned.
To achieve the above object, according to an aspect of the present invention, there is provided a distributed data storage method, including the steps of:
S1, receiving a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier, and generating a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; receiving a data stream sent by the user side, dividing the data stream into a plurality of data blocks according to a preset data block size, generating first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and storing each data block together with its first marking data at the same storage position on each of a plurality of storage nodes according to the data block distribution strategy;
S2, receiving a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and acquiring the data block distribution strategy according to the data storage identifier; reading a data block and its first marking data from a storage node according to the data block distribution strategy and performing data verification; if the verification succeeds, sending the read data block to the user side; if the verification fails, reading the data block and the first marking data at the same storage position on the next storage node and repeating the data verification; and traversing the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification succeeds, and, if no storage node passes the verification, sending user data damage information to the user side.
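The data block distribution strategy named in steps S1 and S2 can be pictured as a small record keyed by the data storage identifier. The following is a minimal sketch only: the class name, field names and the in-memory registry are hypothetical and are not specified by the text, which only lists what the strategy contains.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DistributionPolicy:
    """One policy per data storage identifier (all names are illustrative)."""
    storage_id: str             # data storage identifier from the request
    block_positions: List[int]  # storage position of each data block
    nodes: Dict[str, str]       # storage node information: node name -> address
    read_order: List[str]       # storage node reading sequence


# Hypothetical in-memory registry: identifier -> policy, kept one-to-one.
policies: Dict[str, DistributionPolicy] = {}


def register_policy(policy: DistributionPolicy) -> None:
    """Record the policy produced in S1; refuse a second policy for the same id."""
    if policy.storage_id in policies:
        raise ValueError("a policy already exists for this identifier")
    policies[policy.storage_id] = policy


def lookup_policy(storage_id: str) -> DistributionPolicy:
    """Fetch the policy in S2 using the identifier carried by the read request."""
    return policies[storage_id]
```

Keeping the reading sequence inside the policy means the read path needs nothing but the identifier to know which node to try first and which to fall back to.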
As a further improvement of the invention, the preset distributed algorithm comprises consistent hash, DHT and CRUSH algorithms.
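As an illustration of one of the listed algorithms, the sketch below uses a consistent-hash ring to derive a storage node reading sequence from a data storage identifier. It is an assumption-laden example rather than the patented procedure: the class `HashRing`, the virtual-node count and the use of SHA-256 for ring positions are all choices made here for illustration.

```python
import bisect
import hashlib
from typing import List


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        # Each physical node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def read_order(self, storage_id: str, replicas: int) -> List[str]:
        """Walk clockwise from the identifier's position, collecting distinct nodes."""
        start = bisect.bisect(self._keys, self._hash(storage_id))
        order: List[str] = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in order:
                order.append(node)
                if len(order) == replicas:
                    break
        return order
```

For example, `HashRing(["node1", "node2", "node3", "node4"]).read_order("obj-A", 3)` returns three distinct node names, and the same identifier always maps to the same sequence as long as the node set is unchanged.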
As a further improvement of the invention, the data stream sent by the user terminal is received through storage system protocols, wherein the storage system protocols comprise TCP, UDP and HTTP protocols.
As a further improvement of the present invention, the preset key generation algorithms include MD5, SHA and HMAC algorithms.
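The three listed algorithm families are all available in the Python standard library, so generating marking data for a block can be sketched as follows. The function names are hypothetical, and SHA-256 is used here as one concrete member of the SHA family; the text does not say which variant is intended.

```python
import hashlib
import hmac


def marking_md5(block: bytes) -> bytes:
    """Unkeyed MD5 digest of the block (16 bytes)."""
    return hashlib.md5(block).digest()


def marking_sha256(block: bytes) -> bytes:
    """Unkeyed SHA-256 digest of the block (32 bytes)."""
    return hashlib.sha256(block).digest()


def marking_hmac(block: bytes, key: bytes) -> bytes:
    """Keyed HMAC-SHA-256 of the block under a user-supplied key (32 bytes)."""
    return hmac.new(key, block, hashlib.sha256).digest()
```

An unkeyed digest detects accidental corruption, but anyone who can rewrite a block can also recompute its digest; a keyed HMAC additionally requires knowledge of the key, which matters if deliberate tampering is a concern.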
As a further improvement of the present invention, the data verification specifically comprises: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification succeeds if the two are identical.
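The comparison of first and second marking data can be sketched in a few lines. This example assumes MD5 as the preset algorithm and uses a hypothetical function name; `hmac.compare_digest` is a standard-library helper for comparing digests in constant time.

```python
import hashlib
import hmac


def verify_block(block: bytes, first_marking: bytes) -> bool:
    """Recompute the marking for the block that was read and compare with the stored one."""
    second_marking = hashlib.md5(block).digest()  # assumed algorithm: MD5
    return hmac.compare_digest(first_marking, second_marking)
```

A block whose contents changed after it was written yields different second marking data, so `verify_block` returns `False` and the caller can move on to the next storage node.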
To achieve the above object, according to another aspect of the present invention, there is provided a distributed data storage system, which includes an interaction module and a plurality of storage nodes,
the interaction module is configured to receive a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier; to generate a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; and to receive a data stream sent by the user side, divide the data stream into a plurality of data blocks according to a preset data block size, generate first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and store each data block together with its first marking data at the same storage position on each of the plurality of storage nodes according to the data block distribution strategy;
the interaction module is further configured to receive a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and to acquire the data block distribution strategy according to the data storage identifier; to read a data block and its first marking data from a storage node according to the data block distribution strategy and perform data verification; if the verification succeeds, to send the read data block to the user side; if the verification fails, to read the data block and the first marking data at the same storage position on the next storage node and repeat the data verification; and to traverse the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification succeeds, and, if no storage node passes the verification, to send user data damage information to the user side.
As a further improvement of the invention, the preset distributed algorithm comprises consistent hash, DHT and CRUSH algorithms.
As a further improvement of the invention, the data stream sent by the user terminal is received through storage system protocols, wherein the storage system protocols comprise TCP, UDP and HTTP protocols.
As a further improvement of the present invention, the preset key generation algorithms include MD5, SHA and HMAC algorithms.
As a further improvement of the present invention, the data verification specifically comprises: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification succeeds if the two are identical.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
The distributed data storage method and system of the invention generate a data block distribution strategy upon receiving a data storage request sent by a user side, and store each data block together with its first marking data at the same storage position on each of a plurality of storage nodes according to that strategy. When a data reading request sent by the user side is received, a data block and its first marking data are read from a storage node according to the data block distribution strategy and verified; if the verification fails, the copy on the next storage node is read and verified, and if the verification succeeds, the read data block is sent to the user side. Thus, when a data block on a storage node is erroneous, the same data block can be found on another node by polling the nodes under the verification mechanism, and the user's correct data is ultimately returned. This solves the problem that a client reading data from the storage system obtains an erroneous data block with a small probability.
In the distributed data storage method and system of the invention, the first marking data of a stored data block and the second marking data of a read data block are both produced by a high-performance encryption algorithm, and data verification is performed by comparing the first marking data with the second marking data, which further ensures the reliability of data block reads.
Drawings
Fig. 1 is a schematic flowchart of a distributed data storage method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The present invention will be described in further detail with reference to specific embodiments.
Fig. 1 is a schematic flowchart of a distributed data storage method according to an embodiment of the present invention. As shown in fig. 1, a distributed data storage method includes the following steps:
S1, receiving a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier, and generating a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; receiving a data stream sent by the user side, dividing the data stream into a plurality of data blocks according to a preset data block size, generating first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and storing each data block together with its first marking data at the same storage position on each of a plurality of storage nodes according to the data block distribution strategy.
As a preferred embodiment, the distributed data storage system may define a preset distributed algorithm that distributes the user's data blocks evenly across the distributed storage nodes and keeps the data blocks evenly distributed when nodes are added or removed. Common preset distributed algorithms include consistent hashing, DHT and CRUSH. The distribution information of the user's data blocks is calculated according to the preset distributed algorithm, yielding a data block distribution strategy in one-to-one correspondence with the data storage identifier.
The data stream sent by the user side is received through a storage system protocol; common storage system protocols include TCP, UDP and HTTP. The user data transmitted into the distributed data storage system through the storage system protocol is partitioned into data blocks (as an example, 64 KB each). The distributed data storage system generates corresponding first marking data for each of the user's data blocks according to a preset key generation algorithm. As an example, the preset key generation algorithm includes the MD5, SHA and HMAC algorithms, which are one-way, irreversible algorithms; the user may define a custom encryption key with which the user data is encrypted, and the encryption result is data of a fixed length such as 16 bits, 32 bits or 64 bits. Each data block is then stored together with its first marking data at the same storage position on each of a plurality of storage nodes according to the data block distribution strategy.
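The write path of step S1 can be sketched as follows, under several stated assumptions: storage nodes are modelled as plain dictionaries mapping a storage position to a stored record, the block index doubles as the storage position, and MD5 is the marking algorithm. None of these names or choices come from the text, which only fixes the 64 KB example block size.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Iterator, List

BLOCK_SIZE = 64 * 1024  # example data block size given in the text

# Hypothetical storage node: a dict mapping storage position -> stored record.
Node = Dict[int, bytes]


def split_blocks(stream: BinaryIO, size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive blocks; the final block may be shorter than `size`."""
    while True:
        block = stream.read(size)
        if not block:
            return
        yield block


def store_stream(stream: BinaryIO, nodes: List[Node]) -> int:
    """Write each block plus its first marking data to every node; return the block count."""
    count = 0
    for position, block in enumerate(split_blocks(stream)):
        record = block + hashlib.md5(block).digest()  # block followed by 16-byte marking
        for node in nodes:
            node[position] = record  # same storage position on each node
        count += 1
    return count


if __name__ == "__main__":
    replicas: List[Node] = [{}, {}, {}]
    print(store_stream(io.BytesIO(b"x" * 200_000), replicas))
```

Appending the marking data to the block keeps both in a single record, so one read from a node returns everything needed for verification. A stream backed by a network socket may return short reads, so a production splitter would need to buffer until a full block is available.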
S2, receiving a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and looking up the data block distribution strategy according to the data storage identifier; reading a data block and its first marking data from a storage node according to the data block distribution strategy and performing data verification; if the verification succeeds, sending the read data block to the user side; if the verification fails, reading the data block and the first marking data at the same storage position on the next storage node and repeating the data verification; and traversing the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification succeeds, and, if no storage node passes the verification, sending user data damage information to the user side.
As a preferred embodiment, the data verification specifically includes: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification succeeds if the two are identical.
A user sends a data reading request, and the distributed data storage system receives it and locates a specific storage node according to the data block distribution strategy corresponding to the data storage identifier. The storage node reads from disk; each read returns a data block together with its first marking data (as an example, 64 KB + 16 bytes), which is split into the data block (64 KB) and the first marking data (16 bytes). Corresponding second marking data (16 bytes) is generated for the data block (64 KB) and compared with the first marking data (16 bytes). If the two are identical, the data block (64 KB) is sent to the user side. If they differ, the next node is selected according to the data block distribution strategy (including the storage node information and the storage node reading sequence), the data block and first marking data at the same storage position on that node are acquired, and the data block is verified in the same way. These steps are repeated; if the data block at that position has been tried on all storage nodes and no correct data is found, user data damage information is returned.
As an example, in the case of replica storage, assume the configured number of copies is 3, user data A comprises data blocks 1 to 5, each of which is stored on storage nodes 1 to 3, and data block 2 on storage node 1 is damaged. The distributed data storage system selects one of the storage nodes according to the data reading request sent by the user and the data block distribution strategy. If storage node 1 is selected, data block 1 is read first; when data block 2 is read, verification shows that the block is faulty, so data block 2 is read from storage node 2 instead, and the remaining data blocks are then read. If a data block inconsistency occurs again, the read switches to storage node 3, and so on until all the data has been read. In this way, using the existing storage strategy, data block encryption technology and data distribution algorithm, when a data block on a storage node is erroneous, the same data block can be found on another node by polling the nodes under the verification mechanism, and the user's correct data is ultimately returned. If the bad blocks reach the maximum limit, the user's data is damaged; otherwise, as long as the user's bad blocks have not reached the maximum limit, the user can still obtain reliable data by this method.
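The three-copy scenario above can be reproduced in a short simulation. This is a sketch under assumptions: nodes are hypothetical dictionaries from storage position to record, the record layout is the block followed by a 16-byte MD5 marking, small byte strings stand in for 64 KB blocks, and every block read starts again from the first node in the reading sequence (the text leaves open whether the read stays on the fallback node afterwards).

```python
import hashlib
from typing import Dict, List, Optional

MARK_LEN = 16  # length of the MD5-based marking data assumed here

Node = Dict[int, bytes]  # hypothetical node: storage position -> block + marking


def make_record(block: bytes) -> bytes:
    """Build the stored record: the block followed by its first marking data."""
    return block + hashlib.md5(block).digest()


def read_block(nodes: List[Node], position: int) -> Optional[bytes]:
    """Try nodes in reading order; return the first block that verifies, else None."""
    for node in nodes:
        record = node.get(position)
        if record is None or len(record) < MARK_LEN:
            continue
        block, first_marking = record[:-MARK_LEN], record[-MARK_LEN:]
        if hashlib.md5(block).digest() == first_marking:
            return block
    return None  # caller reports user data damage information


# Three copies of user data A (blocks 1 to 5, stored at positions 0 to 4).
blocks = [f"block-{i}".encode() for i in range(1, 6)]
node1, node2, node3 = ({i: make_record(b) for i, b in enumerate(blocks)} for _ in range(3))

# Damage data block 2 on storage node 1 by flipping one payload byte.
bad = bytearray(node1[1])
bad[0] ^= 0xFF
node1[1] = bytes(bad)

recovered = [read_block([node1, node2, node3], pos) for pos in range(5)]
```

Here `recovered` equals the original `blocks` list: the damaged copy on node 1 fails verification and the copy on node 2 is returned in its place. If the same position were damaged on all three nodes, `read_block` would return `None`.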
A distributed data storage system comprises an interaction module and a plurality of storage nodes, wherein,
the interaction module is configured to receive a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier; to generate a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; and to receive a data stream sent by the user side, divide the data stream into a plurality of data blocks according to a preset data block size, generate first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and store each data block together with its first marking data at the same storage position on each of the plurality of storage nodes according to the data block distribution strategy.
As a preferred embodiment, the distributed data storage system may define a preset distributed algorithm that distributes the user's data blocks evenly across the distributed storage nodes and keeps the data blocks evenly distributed when nodes are added or removed. Common preset distributed algorithms include consistent hashing, DHT and CRUSH. The distribution information of the user's data blocks is calculated according to the preset distributed algorithm, yielding a data block distribution strategy in one-to-one correspondence with the data storage identifier.
The data stream sent by the user side is received through a storage system protocol; common storage system protocols include TCP, UDP and HTTP. The user data transmitted into the distributed data storage system through the storage system protocol is partitioned into data blocks (as an example, 64 KB each). The distributed data storage system generates corresponding first marking data for each of the user's data blocks according to a preset key generation algorithm. As an example, the preset key generation algorithm includes the MD5, SHA and HMAC algorithms, which are one-way, irreversible algorithms; the user may define a custom encryption key with which the user data is encrypted, and the encryption result is data of a fixed length such as 16 bits, 32 bits or 64 bits. Each data block is then stored together with its first marking data at the same storage position on each of a plurality of storage nodes according to the data block distribution strategy.
The interaction module is further configured to receive a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and to look up the data block distribution strategy according to the data storage identifier; to read a data block and its first marking data from a storage node according to the data block distribution strategy and perform data verification; if the verification succeeds, to send the read data block to the user side; if the verification fails, to read the data block and the first marking data at the same storage position on the next storage node and repeat the data verification; and to traverse the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification succeeds, and, if no storage node passes the verification, to send user data damage information to the user side.
As a preferred embodiment, the data verification specifically includes: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification succeeds if the two are identical.
A user sends a data reading request, and the distributed data storage system receives it and locates a specific storage node according to the data block distribution strategy corresponding to the data storage identifier. The storage node reads from disk; each read returns a data block together with its first marking data (as an example, 64 KB + 16 bytes), which is split into the data block (64 KB) and the first marking data (16 bytes). Corresponding second marking data (16 bytes) is generated for the data block (64 KB) and compared with the first marking data (16 bytes). If the two are identical, the data block (64 KB) is sent to the user side. If they differ, the next node is selected according to the data block distribution strategy (including the storage node information and the storage node reading sequence), the data block and first marking data at the same storage position on that node are acquired, and the data block is verified in the same way. These steps are repeated; if the data block at that position has been tried on all storage nodes and no correct data is found, user data damage information is returned.
As an example, in the case of replica storage, assume the configured number of copies is 3, user data A comprises data blocks 1 to 5, each of which is stored on storage nodes 1 to 3, and data block 2 on storage node 1 is damaged. The distributed data storage system selects one of the storage nodes according to the data reading request sent by the user and the data block distribution strategy. If storage node 1 is selected, data block 1 is read first; when data block 2 is read, verification shows that the block is faulty, so data block 2 is read from storage node 2 instead, and the remaining data blocks are then read. If a data block inconsistency occurs again, the read switches to storage node 3, and so on until all the data has been read. In this way, using the existing storage strategy, data block encryption technology and data distribution algorithm, when a data block on a storage node is erroneous, the same data block can be found on another node by polling the nodes under the verification mechanism, and the user's correct data is ultimately returned. If the bad blocks reach the maximum limit, the user's data is damaged; otherwise, as long as the user's bad blocks have not reached the maximum limit, the user can still obtain reliable data by this method.
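The "maximum limit" on bad blocks is not quantified in the text. One plausible reading, assumed here, is that user data is lost only when some block position has no copy left that passes verification. The sketch below counts verifying copies per position under that assumption; the function names, the dictionary-based node model and the MD5 record layout (block followed by a 16-byte marking) are all hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List

MARK_LEN = 16  # assumed length of the marking data appended to each block

Node = Dict[int, bytes]  # hypothetical node: storage position -> block + marking


def good_copies(nodes: List[Node], position: int) -> int:
    """Count the nodes whose copy at `position` still passes verification."""
    count = 0
    for node in nodes:
        record = node.get(position, b"")
        if len(record) < MARK_LEN:
            continue  # missing or truncated record
        block, first_marking = record[:-MARK_LEN], record[-MARK_LEN:]
        if hashlib.md5(block).digest() == first_marking:
            count += 1
    return count


def is_recoverable(nodes: List[Node], positions: Iterable[int]) -> bool:
    """Assumed limit: data is readable while every position keeps at least one good copy."""
    return all(good_copies(nodes, p) >= 1 for p in positions)
```

With three copies, this reading tolerates up to two damaged copies of any given block; a third damaged copy of the same block makes `is_recoverable` return `False`.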
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A distributed data storage method, comprising the steps of:
S1, receiving a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier, and generating a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; receiving a data stream sent by the user side, dividing the data stream into a plurality of data blocks according to a preset data block size, generating first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and storing the data blocks and the first marking data at the same storage positions of a plurality of storage nodes according to the data block distribution strategy;
S2, receiving a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and acquiring the data block distribution strategy according to the data storage identifier; reading the data block and the first marking data of a storage node according to the data block distribution strategy and performing data verification; if the verification is successful, sending the read data block to the user side; if the verification is unsuccessful, reading the data block and the first marking data at the same storage position of the next storage node and repeating the data verification; and traversing the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification is successful, and otherwise sending user data damage information to the user side.
2. The distributed data storage method according to claim 1, wherein the preset distributed algorithm includes the consistent hash, DHT and CRUSH algorithms.
3. The distributed data storage method according to claim 1 or 2, wherein the data stream sent by the user side is received through a storage system protocol, and the storage system protocol includes the TCP, UDP and HTTP protocols.
4. The distributed data storage method according to claim 1 or 2, wherein the preset key generation algorithm includes the MD5, SHA and HMAC algorithms.
5. The distributed data storage method according to claim 1 or 2, wherein the data verification specifically comprises: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification is successful if the first marking data and the second marking data are the same.
6. A distributed data storage system comprising an interaction module and a plurality of storage nodes,
the interaction module is used for receiving a data storage request sent by a user side, wherein the data storage request comprises a data storage size and a data storage identifier; generating a data block distribution strategy in one-to-one correspondence with the data storage identifier according to the data storage size and a preset distributed algorithm, wherein the data block distribution strategy comprises data block storage positions, storage node information and a storage node reading sequence; receiving a data stream sent by the user side, dividing the data stream into a plurality of data blocks according to a preset data block size, generating first marking data in one-to-one correspondence with the data blocks according to a preset key generation algorithm, and storing the data blocks and the first marking data at the same storage positions of the plurality of storage nodes according to the data block distribution strategy;
the interaction module is further used for receiving a data reading request sent by the user side, wherein the data reading request comprises the data storage identifier, and acquiring the data block distribution strategy according to the data storage identifier; reading the data block and the first marking data of a storage node according to the data block distribution strategy and performing data verification; if the verification is successful, sending the read data block to the user side; if the verification is unsuccessful, reading the data block and the first marking data at the same storage position of the next storage node and repeating the data verification; and traversing the storage nodes in the data block distribution strategy according to the storage node reading sequence until the verification is successful, and otherwise sending user data damage information to the user side.
7. The distributed data storage system according to claim 6, wherein the preset distributed algorithm includes the consistent hash, DHT and CRUSH algorithms.
8. The distributed data storage system according to claim 6 or 7, wherein the data stream sent by the user side is received through a storage system protocol, and the storage system protocol includes the TCP, UDP and HTTP protocols.
9. The distributed data storage system according to claim 6 or 7, wherein the preset key generation algorithm includes the MD5, SHA and HMAC algorithms.
10. The distributed data storage system according to claim 6 or 7, wherein the data verification specifically comprises: generating second marking data in one-to-one correspondence with the read data blocks according to the preset key generation algorithm, and comparing the first marking data with the second marking data, wherein the verification is successful if the first marking data and the second marking data are the same.
CN201911033427.0A 2019-10-28 2019-10-28 Distributed data storage method and system Active CN110989922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911033427.0A CN110989922B (en) 2019-10-28 2019-10-28 Distributed data storage method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911033427.0A CN110989922B (en) 2019-10-28 2019-10-28 Distributed data storage method and system

Publications (2)

Publication Number Publication Date
CN110989922A true CN110989922A (en) 2020-04-10
CN110989922B CN110989922B (en) 2023-05-26

Family

ID=70082489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911033427.0A Active CN110989922B (en) 2019-10-28 2019-10-28 Distributed data storage method and system

Country Status (1)

Country Link
CN (1) CN110989922B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140122429A1 (en) * 2012-10-31 2014-05-01 International Business Machines Corporation Data processing method and apparatus for distributed systems
CN107844388A (en) * 2012-11-26 2018-03-27 亚马逊科技公司 Recover database from standby system streaming
CN105630808A (en) * 2014-10-31 2016-06-01 北京奇虎科技有限公司 Distributed file system based file reading and writing method and node server
CN107807792A (en) * 2017-10-27 2018-03-16 郑州云海信息技术有限公司 A kind of data processing method and relevant apparatus based on copy storage system
CN110163009A (en) * 2019-05-23 2019-08-23 北京交通大学 The method and apparatus of the safety check and reparation of HDFS storage platform

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307064A (en) * 2020-10-29 2021-02-02 上海达梦数据库有限公司 Data management system, method and storage medium
CN113220237A (en) * 2021-05-17 2021-08-06 北京青云科技股份有限公司 Distributed storage method, device, equipment and storage medium
CN113536356A (en) * 2021-07-30 2021-10-22 海宁奕斯伟集成电路设计有限公司 Data verification method and distributed storage system
CN113625972A (en) * 2021-08-26 2021-11-09 上海应用技术大学 Hierarchical data possession proving method capable of realizing public auditing
CN113794558A (en) * 2021-09-16 2021-12-14 烽火通信科技股份有限公司 L-tree calculation method, device and system in XMSS algorithm
CN113794558B (en) * 2021-09-16 2024-02-27 烽火通信科技股份有限公司 L-tree calculation method, device and system in XMSS algorithm
CN114138191A (en) * 2021-11-21 2022-03-04 苏州浪潮智能科技有限公司 Storage pool data verification method, system, device and medium
CN114138191B (en) * 2021-11-21 2023-09-01 苏州浪潮智能科技有限公司 Storage pool data verification method, system, equipment and medium

Also Published As

Publication number Publication date
CN110989922B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN110989922A (en) Distributed data storage method and system
US9332430B2 (en) Method of identifying and authenticating a radio tag by a reader
WO2019161774A1 (en) Methods, application server, block chain node and media for logistics tracking and source tracing
CN110597824A (en) Data storage method and device based on block chain network
CN109492049B (en) Data processing, block generation and synchronization method for block chain network
CN101547184A (en) Method and device for authenticating data block transmitted in network
CN115225409B (en) Cloud data safety duplicate removal method based on multi-backup joint verification
CN111367923A (en) Data processing method, data processing device, node equipment and storage medium
CN112839003A (en) Data verification method and system
CN113055176A (en) Terminal authentication method and system, terminal device, P2P verification platform and medium
CN110619022B (en) Node detection method, device, equipment and storage medium based on block chain network
CN113961908B (en) Data storage method and device, computer equipment and storage medium
WO2021196463A1 (en) Blockchain data synchronization method and apparatus, and electronic device and storage medium
CN103888424A (en) Cluster-type data encryption system and data processing method thereof
CN110737725A (en) Electronic information inspection method, device, equipment, medium and system
CN108133026B (en) Multi-data processing method, system and storage medium
CN111526165B (en) Consensus method and system in alliance chain
CN116866031A (en) Industrial Internet data transmission processing method and system
CN114143098B (en) Data storage method and data storage device
CN112637855B (en) Machine-card binding method based on block chain and server
CN115392927A (en) Data tracing system and data tracing method based on block chain
CN111884818A (en) Data file processing method, system, server and storage medium
CN111241005A (en) Key value pair-based safe partition storage method and system
CN112836228B (en) Distributed management system of data ownership based on block chain
CN117349860B (en) File storage system and method based on matrix change and data segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant