CN111008181A - Method, system, terminal and storage medium for switching storage strategies of distributed file system - Google Patents


Info

Publication number
CN111008181A
Authority
CN
China
Prior art keywords
data
switching
storage
storage strategy
file system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911048544.4A
Other languages
Chinese (zh)
Inventor
张东东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201911048544.4A
Publication of CN111008181A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/185 Hierarchical storage management [HSM] systems, e.g. file migration or policies thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/13 Linear codes
    • H03M13/15 Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
    • H03M13/151 Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
    • H03M13/154 Error and erasure correction, e.g. by using the error and erasure locator or Forney polynomial

Abstract

The invention provides a method, a system, a terminal and a storage medium for switching storage strategies of a distributed file system. The method comprises the following steps: configuring both a copy storage strategy and an erasure code storage strategy for a distributed file system, and setting one of them as the initial storage strategy; setting a storage strategy switching condition; and, if a data block in the distributed file system meets the switching condition, executing storage strategy switching on that data block. The invention switches between the two storage strategies automatically, which adapts storage to the dynamic change between cold and hot data and improves cluster storage efficiency.

Description

Method, system, terminal and storage medium for switching storage strategies of distributed file system
Technical Field
The invention relates to the technical field of distributed file system storage management, in particular to a method, a system, a terminal and a storage medium for switching storage strategies of a distributed file system.
Background
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. HDFS exposes a file namespace and allows user data to be stored as files. Internally, a file is divided into one or more blocks, and under the default HDFS copy storage strategy each block has several copies (three by default, that is, the original plus 2 backup copies) stored on a group of data nodes. HDFS currently provides the following storage modes for data of different access heat: Hot, Cold, Warm, All_SSD, One_SSD, Lazy_Persist and Provided. The storage strategy itself defaults to three-copy storage, so 3 TB of space is needed to store 1 TB of data. This scheme has an overhead of 200% in storage space and other resources (e.g. network bandwidth). For warm and cold data sets with relatively low I/O activity, the additional block copies are rarely accessed during normal operation, yet they still consume the same amount of resources as the first copy.
Erasure coding (EC) is a data protection technology that was first used in the communications industry to recover data during transmission. It is a fault-tolerant coding technique: new check data is added to the original data so that the parts become correlated, and the data can be recovered through erasure coding as long as the errors stay within a certain range. Erasure coding for HDFS was implemented in HDFS-7285 and released in Hadoop 3.0.0, where it is disabled by default.
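The overhead figures for three-copy storage follow from simple arithmetic, and the same arithmetic shows why erasure coding is attractive for rarely accessed data. The sketch below is illustrative only: the function names are hypothetical, and the Reed-Solomon layout with 6 data units and 3 parity units is an assumed example configuration, not one prescribed by this document.

```python
def storage_overhead(data_units: int, redundancy_units: int) -> float:
    """Extra storage consumed per unit of user data (2.0 means 200%)."""
    return redundancy_units / data_units


def raw_capacity_tb(logical_tb: float, data_units: int, redundancy_units: int) -> float:
    """Raw capacity needed to hold `logical_tb` of user data."""
    return logical_tb * (data_units + redundancy_units) / data_units


# Three-copy storage: 1 original block plus 2 backup copies.
replica_overhead = storage_overhead(1, 2)
# Reed-Solomon with 6 data units and 3 parity units (assumed example layout).
ec_overhead = storage_overhead(6, 3)

print(f"three-copy: {replica_overhead:.0%} overhead, {raw_capacity_tb(1, 1, 2)} TB raw per TB")
print(f"RS(6,3):    {ec_overhead:.0%} overhead, {raw_capacity_tb(1, 6, 3)} TB raw per TB")
```

Under these assumptions, the erasure-coded layout needs half the raw capacity of three-copy storage for the same user data, at the cost of encoding work and slower reads when a unit is lost.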
In a real-time monitoring system, usually only the business data from the most recent period is of interest. Depending on business requirements, this may be the data from the last 30 minutes, the last hour or even the last several hours. Such data is called hot data; once it loses its timeliness it is called cold data. Over time, hot data becomes cold data and newly received data becomes hot data. Given these characteristics, the optimal storage strategy for cold data is erasure code storage, and the optimal storage strategy for hot data is three-copy storage.
The prior art does not describe a technical scheme that converts between the traditional copy storage strategy and the erasure code storage strategy as data changes from hot to cold in order to improve cluster storage efficiency. The invention therefore provides an HDFS-based system for converting between the copy storage strategy and the erasure code storage strategy, so that the storage strategy stays optimal as data changes between hot and cold.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, the present invention provides a method, a system, a terminal and a storage medium for switching storage policies of a distributed file system, so as to solve the above-mentioned technical problems.
In a first aspect, the present invention provides a method for switching storage policies of a distributed file system, including:
configuring a copy storage strategy and an erasure code storage strategy for the distributed file system, and setting one of them as the initial storage strategy;
setting a storage strategy switching condition;
and if a data block in the distributed file system meets the switching condition, executing storage strategy switching on the data block.
Further, setting the storage strategy switching condition includes:
setting a switching time threshold;
and if the existence time of a data block reaches the switching time threshold, the storage strategy switching condition is met.
Further, executing storage strategy switching on a data block that meets the switching condition includes:
acquiring information on the data blocks whose storage strategy is to be switched;
selecting data node positions from the node list where the data block is located;
reading the data of the data block at the selected data node positions, putting the read data into a buffer, opening an interface to the HDFS cluster verification program DataBlockScanner, and generating a verification ack;
encoding the data in the buffer, and writing the encoded data to the selected data nodes;
and deleting the redundant blocks of the copy storage.
In a second aspect, the present invention provides a distributed file system storage policy switching system, including:
the strategy setting unit is configured for setting a storage strategy of the distributed file system into a copy storage strategy and an erasure code storage strategy and setting one of the copy storage strategy and the erasure code storage strategy as an initial storage strategy;
a condition setting unit configured to set a storage policy switching condition;
and the switching execution unit is configured to execute storage strategy switching on the data blocks in the distributed file system if the data blocks meet the switching conditions.
Further, the condition setting unit includes:
a threshold setting module configured to set a switching time threshold;
and the rule setting module is configured to reach the storage strategy switching condition if the existence time of the data block reaches the switching time threshold.
Further, the switching execution unit includes:
the information acquisition module is configured to acquire information on the data blocks whose storage strategy is to be switched;
the position selection module is configured to select data node positions from the node list where the data block is located;
the data reading module is configured to read the data of the data block at the selected data node positions, place the read data into a buffer, open an interface to the HDFS cluster verification program DataBlockScanner, and generate a verification ack;
the data writing module is configured to encode the data in the buffer and write the encoded data to the selected data nodes;
and the storage deleting module is configured to delete the redundant blocks of the copy storage.
In a third aspect, a terminal is provided, including:
a processor, a memory, wherein,
the memory is used for storing a computer program which,
the processor is used for calling and running the computer program from the memory, so that the terminal executes the method described above.
In a fourth aspect, a computer storage medium is provided having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.
The beneficial effects of the invention are as follows.
With the storage strategy switching method, system, terminal and storage medium provided by the invention, two storage strategies, namely a copy storage strategy and an erasure code storage strategy, are configured in the distributed file system, and one of them is selected as required to be the initial storage strategy, that is, the strategy used when data blocks are first stored. A storage strategy switching condition is then set, and the storage strategy of every data block that meets the condition is switched. The invention thus switches between the two storage strategies automatically, which adapts storage to the dynamic change between cold and hot data and improves cluster storage efficiency.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.
FIG. 2 is a schematic block diagram of a system of one embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the drawings in the embodiment of the present invention, and it is obvious that the described embodiment is only a part of the embodiment of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention. The execution subject in fig. 1 may be a distributed file system storage policy switching system.
As shown in fig. 1, the method 100 includes:
step 110, configuring a copy storage strategy and an erasure code storage strategy for the distributed file system, and setting one of them as the initial storage strategy;
step 120, setting a storage strategy switching condition;
and step 130, if a data block in the distributed file system meets the switching condition, executing storage strategy switching on the data block.
Optionally, as an embodiment of the present invention, setting the storage strategy switching condition includes:
setting a switching time threshold;
and if the existence time of a data block reaches the switching time threshold, the storage strategy switching condition is met.
Optionally, as an embodiment of the present invention, executing storage strategy switching on a data block that meets the switching condition includes:
acquiring information on the data blocks whose storage strategy is to be switched;
selecting data node positions from the node list where the data block is located;
reading the data of the data block at the selected data node positions, putting the read data into a buffer, opening an interface to the HDFS cluster verification program DataBlockScanner, and generating a verification ack;
encoding the data in the buffer, and writing the encoded data to the selected data nodes;
and deleting the redundant blocks of the copy storage.
To facilitate understanding, the storage strategy switching method provided by the present invention is further described below by combining its principle with the process of switching the storage strategy of a distributed file system in the embodiment.
Specifically, the method for switching the storage policy of the distributed file system includes:
S1, configuring a copy storage strategy and an erasure code storage strategy for the distributed file system, and setting one of them as the initial storage strategy.
The setStoragePolicyChangeAction() method is called to turn the storage strategy switching system on or off. Both the copy storage strategy and the erasure code storage strategy are set to the enabled state. In this embodiment, the copy storage strategy is set as the initial storage strategy.
S2, setting the storage strategy switching condition.
In this embodiment, the existence time of a data block is used as the switching criterion: a time threshold is preset to 2 h, and a data block whose existence time reaches 2 h meets the switching condition.
In other embodiments of the present invention, the number of accesses to a data block may also be used as a reference standard, and if the number of accesses to a certain data block in a period of time is lower than a preset number threshold, the switching condition is met.
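The two switching criteria described here, block age and access count, can each be expressed as a simple predicate. The following is a minimal sketch under stated assumptions: `BlockStats` and both function names are hypothetical bookkeeping introduced for illustration and are not HDFS types or methods named in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BlockStats:
    """Hypothetical per-block bookkeeping; not an HDFS type."""
    block_id: int
    created_at: datetime
    recent_accesses: int  # accesses observed in the monitoring window


def age_condition_met(block: BlockStats, now: datetime,
                      threshold: timedelta = timedelta(hours=2)) -> bool:
    """True once the block has existed for at least `threshold`."""
    return now - block.created_at >= threshold


def access_condition_met(block: BlockStats, min_accesses: int) -> bool:
    """True when the block was accessed fewer than `min_accesses` times."""
    return block.recent_accesses < min_accesses


now = datetime(2019, 10, 31, 12, 0)
old_block = BlockStats(1, now - timedelta(hours=3), recent_accesses=0)
new_block = BlockStats(2, now - timedelta(minutes=20), recent_accesses=40)

# Only the 3-hour-old block has reached the 2 h threshold.
cold = [b.block_id for b in (old_block, new_block) if age_condition_met(b, now)]
print(cold)
```

Keeping each criterion as a separate predicate makes it straightforward to swap one for the other, or to combine them, without touching the switching logic itself.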
S3, if a data block in the distributed file system meets the switching condition, executing storage strategy switching on the data block.
In this embodiment, a data block that meets the switching condition is switched from the copy storage strategy to the erasure code storage strategy. The specific steps are as follows:
(1) call the getLockInfo() method to traverse the information of the data blocks whose storage strategy is to be converted, including blockPoolId, blockId, numBytes, generationStamp and Availability, and return the data block information as an array;
(2) given that EC requires m data nodes, first call the selectBestNodesFromSourceBlockNodes() method to select the n best data node positions from the node list where the data block is located; if m-n is greater than 0, call the selectBestNodesFromOtherNode() method to select the remaining (m-n) nodes from the other data nodes of the cluster; if the selected nodes include bad or slow nodes, call the readBandDataNodesFromSelect() method to read the number of bad and slow nodes, and call the selectBestNodeFromOtherNode() method again to reselect once;
(3) call the addStrepidReaderForSelectdNuodes() method to build a corresponding stripReader for each data block position (data node) returned in step (2), read the data remotely and place it into a buffer, open an interface to the HDFS cluster verification program DataBlockScanner, and generate a verification ack; after reading, call IsStrepidReaderForSelectdSoesAll() to determine whether reading is complete and whether a new round of reading should be started;
(4) call the recornstrTargets() method to encode the data in the buffer;
(5) call the transferData2Targets() method to write the data encoded in step (4) to the selected nodes;
(6) call the deleterredanddatablock() method to delete the redundant blocks of the copy storage.
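The node selection in step (2) can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual implementation: `select_target_nodes`, the `score` ranking function and the `bad_or_slow` set are hypothetical names, and how nodes are ranked or judged bad or slow is not specified in this document.

```python
from typing import Callable, List, Set


def select_target_nodes(source_nodes: List[str], other_nodes: List[str],
                        required: int, score: Callable[[str], float],
                        bad_or_slow: Set[str]) -> List[str]:
    """Pick `required` nodes, preferring nodes that already hold the block."""
    def best(candidates: List[str], count: int, exclude: Set[str]) -> List[str]:
        usable = [n for n in candidates if n not in exclude]
        return sorted(usable, key=score, reverse=True)[:count]

    # First pass: take the best nodes from the block's own node list,
    # then fill the remaining slots from the rest of the cluster.
    chosen = best(source_nodes, required, set())
    chosen += best(other_nodes, required - len(chosen), set(chosen))

    # Reselect once: drop bad or slow nodes and replace them from other nodes.
    healthy = [n for n in chosen if n not in bad_or_slow]
    missing = required - len(healthy)
    if missing > 0:
        healthy += best(other_nodes, missing, set(healthy) | bad_or_slow)
    return healthy


# Hypothetical scores (higher is better); dn2 is treated as a slow node.
scores = {"dn1": 0.9, "dn2": 0.8, "dn3": 0.7, "dn4": 0.6,
          "dn5": 0.5, "dn6": 0.4, "dn7": 0.3}
targets = select_target_nodes(["dn1", "dn2", "dn3"],
                              ["dn4", "dn5", "dn6", "dn7"],
                              required=5, score=lambda n: scores[n],
                              bad_or_slow={"dn2"})
print(targets)
```

Preferring the nodes that already hold a copy of the block keeps part of the read local, so fewer bytes cross the network during conversion.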
In another embodiment of the present invention, if the initial storage policy is an erasure code storage policy and the data block meeting the condition needs to be switched from the erasure code storage policy to the copy storage policy, the switching step is as follows:
(a) call the getLockInfo() method to traverse the information of the data blocks whose storage strategy is to be converted, including blockPoolId, blockId, numBytes, generationStamp and Availability, and return the data block information as an array;
(b) given that copy storage requires a data nodes, first call the selectBestNodesFromSourceBlockNodes() method to select the b best data node positions from the node list where the data block is located; if a-b is greater than 0, call the selectBestNodesFromOtherNode() method to select the remaining (a-b) nodes from the other data nodes of the cluster; if the selected nodes include bad or slow nodes, call the readBandDataNodesFromSelect() method to read the number of bad and slow nodes, and call the selectBestNodeFromOtherNode() method again to reselect once;
(c) call readstripblockdata() to start multiple threads that read data simultaneously from the data nodes obtained in step (b), put the data into a buffer, open an interface to the HDFS cluster verification program DataBlockScanner, and generate a verification ack; after reading, call IsStrepidReaderForSelectdSoesAll() to determine whether reading is complete and whether a new round of reading should be started;
(d) call the recornstrTargets() method to decode the data in the buffer;
(e) call the transferData2Targets() method to write the data decoded in step (d) to the selected nodes;
(f) call the deleterredanddatablock() method to delete the redundant blocks of the copy storage.
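The encoding in step (4) and the decoding in step (d) are inverse operations on the buffered data. As a minimal stand-in, the sketch below uses a single XOR parity cell instead of a Reed-Solomon code, so it can repair only one lost cell. The function names and the cell layout are assumptions made for illustration and do not reflect the internals of the methods named in the steps.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized cells."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split `data` into k equal cells (zero-padded) and append one XOR parity cell."""
    cell = -(-len(data) // k)  # ceiling division
    padded = data.ljust(cell * k, b"\0")
    cells = [padded[i * cell:(i + 1) * cell] for i in range(k)]
    return cells + [reduce(xor_bytes, cells)]


def decode(cells: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one cell (marked None) is lost."""
    lost = [i for i, c in enumerate(cells) if c is None]
    if len(lost) > 1:
        raise ValueError("single XOR parity can repair only one lost cell")
    if lost:
        survivors = [c for c in cells if c is not None]
        cells = list(cells)
        # XOR of all surviving cells reproduces the missing one.
        cells[lost[0]] = reduce(xor_bytes, survivors)
    return b"".join(cells[:-1])[:length]


payload = b"hot data turns cold"
stripes = encode(payload, 3)   # 3 data cells + 1 parity cell
stripes[1] = None              # simulate losing one data cell
restored = decode(stripes, len(payload))
print(restored == payload)
```

The original length is passed to `decode` so that the zero padding added during encoding can be stripped again.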
As shown in fig. 2, the system 200 includes:
a policy setting unit 210 configured to set a storage policy of the distributed file system as a copy storage policy and an erasure code storage policy, and set one of the storage policies as an initial storage policy;
a condition setting unit 220 configured to set a storage policy switching condition;
and a switching execution unit 230 configured to execute storage policy switching on the data block in the distributed file system if the data block meets the switching condition.
Optionally, as an embodiment of the present invention, the condition setting unit includes:
a threshold setting module configured to set a switching time threshold;
and the rule setting module is configured to reach the storage strategy switching condition if the existence time of the data block reaches the switching time threshold.
Optionally, as an embodiment of the present invention, the switching execution unit includes:
the information acquisition module is configured to acquire information on the data blocks whose storage strategy is to be switched;
the position selection module is configured to select data node positions from the node list where the data block is located;
the data reading module is configured to read the data of the data block at the selected data node positions, place the read data into a buffer, open an interface to the HDFS cluster verification program DataBlockScanner, and generate a verification ack;
the data writing module is configured to encode the data in the buffer and write the encoded data to the selected data nodes;
and the storage deleting module is configured to delete the redundant blocks of the copy storage.
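The way the three units of system 200 cooperate can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the actual data movement is omitted, and a block's strategy is represented by an enum value rather than by where its data is stored.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Policy(Enum):
    COPY = "copy"
    ERASURE_CODE = "erasure_code"


@dataclass
class SwitchingSystem:
    initial_policy: Policy                # chosen by the policy setting unit
    condition: Callable[[int], bool]      # supplied by the condition setting unit
    policies: Dict[int, Policy] = field(default_factory=dict)

    def add_block(self, block_id: int) -> None:
        """New blocks are stored under the initial storage strategy."""
        self.policies[block_id] = self.initial_policy

    def run_once(self) -> List[int]:
        """Switching execution unit: switch every block that meets the condition."""
        target = (Policy.ERASURE_CODE if self.initial_policy is Policy.COPY
                  else Policy.COPY)
        switched = []
        for block_id, policy in self.policies.items():
            if policy is self.initial_policy and self.condition(block_id):
                self.policies[block_id] = target
                switched.append(block_id)
        return switched


# Hypothetical block ages in hours; blocks aged 2 h or more are switched.
ages = {1: 3.0, 2: 0.5}
system = SwitchingSystem(Policy.COPY, lambda block_id: ages[block_id] >= 2.0)
for block_id in ages:
    system.add_block(block_id)
switched = system.run_once()
print(switched, system.policies[1].value, system.policies[2].value)
```

Passing the condition in as a callable mirrors the separation between the condition setting unit and the switching execution unit: the execution loop does not need to know which criterion is in use.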
Fig. 3 is a schematic structural diagram of a terminal system 300 according to an embodiment of the present invention, where the terminal system 300 may be used to execute the method for switching the storage policy of the distributed file system according to the embodiment of the present invention.
The terminal system 300 may include a processor 310, a memory 320 and a communication unit 330. These components communicate via one or more buses. Those skilled in the art will appreciate that the server architecture shown in the figure is not limiting: it may be a bus architecture or a star architecture, and it may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 320 may be used for storing instructions executed by the processor 310. The memory 320 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk. The executable instructions in the memory 320, when executed by the processor 310, enable the terminal 300 to perform some or all of the steps in the method embodiments described above.
The processor 310 is a control center of the storage terminal, connects various parts of the entire electronic terminal using various interfaces and lines, and performs various functions of the electronic terminal and/or processes data by operating or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory. The processor may be composed of an Integrated Circuit (IC), for example, a single packaged IC, or a plurality of packaged ICs connected with the same or different functions. For example, the processor 310 may include only a Central Processing Unit (CPU). In the embodiment of the present invention, the CPU may be a single operation core, or may include multiple operation cores.
The communication unit 330 is configured to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to other terminals.
The present invention also provides a computer storage medium. The storage medium may store a program which, when executed, may perform some or all of the steps in the embodiments provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM).
Therefore, the invention sets two storage strategies, namely a copy storage strategy and an erasure code storage strategy, in the distributed file system, and selects one storage strategy as an initial storage strategy according to requirements, namely the storage strategy used when the data blocks are stored. And then setting a storage strategy switching condition, and switching the storage strategy of the data block which meets the switching condition. The invention can realize automatic switching of two storage strategies, thereby adapting to dynamic scenes of cold and hot data changes and further improving cluster storage efficiency.
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention may be embodied in the form of a software product. The computer software product is stored in a storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and includes instructions for enabling a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal or the like) to perform all or part of the steps of the method in the embodiments of the present invention.
The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the terminal embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the description in the method embodiment.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, systems or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
Although the present invention has been described in detail with reference to the drawings and in connection with the preferred embodiments, the present invention is not limited thereto. Those skilled in the art may make various equivalent modifications or substitutions to the embodiments of the present invention without departing from its spirit and scope, and any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention falls within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A distributed file system storage strategy switching method, characterized by comprising the following steps:
setting the storage strategies of a distributed file system to be a copy storage strategy and an erasure code storage strategy, and setting one of the copy storage strategy and the erasure code storage strategy as an initial storage strategy;
setting a storage strategy switching condition;
and if a data block in the distributed file system meets the switching condition, executing storage strategy switching on the data block.
2. The distributed file system storage strategy switching method according to claim 1, wherein setting the storage strategy switching condition comprises:
setting a switching time threshold;
and if the existence time of a data block reaches the switching time threshold, determining that the storage strategy switching condition is met.
3. The distributed file system storage strategy switching method according to claim 1, wherein executing storage strategy switching on a data block in the distributed file system if the data block meets the switching condition comprises:
acquiring information on the data block whose storage strategy is to be switched;
selecting a data node position from the node list where the data block is located;
reading the data of the data block corresponding to the data node position, putting the read data into a buffer area, opening an interface for the HDFS (Hadoop Distributed File System) cluster verification program DataBlockScanner, and generating a verification acknowledgment (ack);
encoding the data in the buffer area, and writing the encoded data into the selected data node;
and deleting the redundant blocks kept under copy storage.
4. A distributed file system storage strategy switching system, comprising:
a strategy setting unit configured to set the storage strategies of a distributed file system to be a copy storage strategy and an erasure code storage strategy, and to set one of the copy storage strategy and the erasure code storage strategy as an initial storage strategy;
a condition setting unit configured to set a storage strategy switching condition;
and a switching execution unit configured to execute storage strategy switching on a data block in the distributed file system if the data block meets the switching condition.
5. The distributed file system storage strategy switching system according to claim 4, wherein the condition setting unit comprises:
a threshold setting module configured to set a switching time threshold;
and a rule setting module configured to determine that the storage strategy switching condition is met if the existence time of a data block reaches the switching time threshold.
6. The distributed file system storage strategy switching system according to claim 4, wherein the switching execution unit comprises:
an information acquisition module configured to acquire information on the data block whose storage strategy is to be switched;
a position selection module configured to select a data node position from the node list where the data block is located;
a data reading module configured to read the data of the data block corresponding to the data node position, put the read data into a buffer area, open an interface for the HDFS (Hadoop Distributed File System) cluster verification program DataBlockScanner, and generate a verification acknowledgment (ack);
a data writing module configured to encode the data in the buffer area and write the encoded data into the selected data node;
and a storage deleting module configured to delete the redundant blocks kept under copy storage.
7. A terminal, comprising:
a processor;
a memory for storing instructions for execution by the processor;
wherein the processor is configured to perform the method of any one of claims 1-3.
8. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, carries out the method according to any one of claims 1-3.
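
The method of claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `Policy`, `Block`, `meets_switch_condition`, `xor_encode` and `switch_to_erasure_code` are hypothetical, data nodes are modelled as plain strings, a single XOR parity fragment stands in for a real erasure code, and the DataBlockScanner verification step is omitted. Spreading the fragments over several nodes taken from the block's node list is likewise an assumption of the sketch.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from functools import reduce


class Policy(Enum):
    REPLICA = "replica"        # the "copy storage strategy" of claim 1
    ERASURE_CODE = "erasure"   # the "erasure code storage strategy" of claim 1


@dataclass
class Block:
    block_id: str
    data: bytes
    created_at: float          # creation time, seconds since the epoch
    nodes: list                # hypothetical data node names holding the block
    policy: Policy = Policy.REPLICA               # initial storage strategy
    layout: dict = field(default_factory=dict)    # node -> fragment after switching


def meets_switch_condition(block, threshold_s, now=None):
    """Claim 2: a replicated block qualifies once its age reaches the threshold."""
    now = time.time() if now is None else now
    return block.policy is Policy.REPLICA and now - block.created_at >= threshold_s


def xor_encode(data, k):
    """Toy stand-in for an erasure code: k data fragments plus one XOR parity."""
    size = -(-len(data) // k)                     # ceiling division
    padded = data.ljust(size * k, b"\0")          # pad so fragments are equal length
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*frags))
    return frags + [parity]


def switch_to_erasure_code(block, k=2):
    """Claim 3 in outline: select nodes, buffer the data, encode, write, delete copies."""
    targets = block.nodes[:k + 1]                 # select node positions from the node list
    if len(targets) < k + 1:
        raise ValueError("need at least k + 1 data nodes for k fragments plus parity")
    buffer = bytes(block.data)                    # read the block's data into a buffer
    fragments = xor_encode(buffer, k)             # encode the buffered data
    block.layout = dict(zip(targets, fragments))  # write one fragment per selected node
    block.nodes = targets                         # full copies elsewhere are dropped
    block.policy = Policy.ERASURE_CODE
    return block
```

For example, a block with three copies on nodes `dn1`, `dn2` and `dn3` that has passed the switching time threshold would end up, with `k=2`, as two data fragments and one parity fragment, one per node, so that any single lost fragment could be rebuilt by XOR-ing the other two.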
CN201911048544.4A 2019-10-31 2019-10-31 Method, system, terminal and storage medium for switching storage strategies of distributed file system Pending CN111008181A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911048544.4A CN111008181A (en) 2019-10-31 2019-10-31 Method, system, terminal and storage medium for switching storage strategies of distributed file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911048544.4A CN111008181A (en) 2019-10-31 2019-10-31 Method, system, terminal and storage medium for switching storage strategies of distributed file system

Publications (1)

Publication Number Publication Date
CN111008181A true CN111008181A (en) 2020-04-14

Family

ID=70111783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911048544.4A Pending CN111008181A (en) 2019-10-31 2019-10-31 Method, system, terminal and storage medium for switching storage strategies of distributed file system

Country Status (1)

Country Link
CN (1) CN111008181A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930555A (en) * 2020-09-02 2020-11-13 平安国际智慧城市科技股份有限公司 Erasure code based file processing method and device and computer equipment
CN113568580A (en) * 2021-07-29 2021-10-29 广州市品高软件股份有限公司 Method, device and medium for realizing distributed storage system and storage system
CN113886115A (en) * 2021-09-09 2022-01-04 上海智能网联汽车技术中心有限公司 Block chain Byzantine fault-tolerant method and system based on vehicle-road cooperation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103118133A (en) * 2013-02-28 2013-05-22 浙江大学 Mixed cloud storage method based on file access frequency
CN105635252A (en) * 2015-12-23 2016-06-01 浪潮集团有限公司 Erasure code redundant backup strategy of Hadoop distributed file system (HDFS)
CN105677742A (en) * 2015-12-30 2016-06-15 深圳市瑞驰信息技术有限公司 Method and apparatus for storing files
CN105791353A (en) * 2014-12-23 2016-07-20 深圳市腾讯计算机系统有限公司 Distributed data storage method and system based on erasure code
CN106708653A (en) * 2016-12-29 2017-05-24 广州中国科学院软件应用技术研究所 Mixed tax administration data security protecting method based on erasure code and multi-copy

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930555A (en) * 2020-09-02 2020-11-13 平安国际智慧城市科技股份有限公司 Erasure code based file processing method and device and computer equipment
CN113568580A (en) * 2021-07-29 2021-10-29 广州市品高软件股份有限公司 Method, device and medium for realizing distributed storage system and storage system
CN113886115A (en) * 2021-09-09 2022-01-04 上海智能网联汽车技术中心有限公司 Block chain Byzantine fault-tolerant method and system based on vehicle-road cooperation
CN113886115B (en) * 2021-09-09 2024-02-20 上海智能网联汽车技术中心有限公司 Block chain Bayesian fault tolerance method and system based on vehicle-road cooperation

Similar Documents

Publication Publication Date Title
CN107943421B (en) Partition division method and device based on distributed storage system
CN111008181A (en) Method, system, terminal and storage medium for switching storage strategies of distributed file system
CN109783016A (en) A kind of elastic various dimensions redundancy approach in distributed memory system
CN106708653B (en) Mixed tax big data security protection method based on erasure code and multiple copies
US11620087B2 (en) Implicit leader election in a distributed storage network
CN111857592A (en) Data storage method and device based on object storage system and electronic equipment
CN106776795B (en) Data writing method and device based on Hbase database
CN111930305A (en) Data storage method and device, storage medium and electronic device
CN110162344A (en) A kind of method, apparatus, computer equipment and readable storage medium storing program for executing that current limliting is isolated
CN114237971A (en) Erasure code coding layout method and system based on distributed storage system
CN109344012B (en) Data reconstruction control method, device and equipment
CN111857574A (en) Write request data compression method, system, terminal and storage medium
CN116700606A (en) Data storage method, device, equipment and storage medium
CN105488047B (en) Metadata reading/writing method and device
CN112181563A (en) Browser view loading method, device and system based on cloud platform and server
CN115373609A (en) Task processing method and related equipment
US10091298B2 (en) Enhancing performance of data storage in a dispersed storage network
CN106293530B (en) A kind of method for writing data and device
WO2020238653A1 (en) Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses
CN114328032A (en) Disaster recovery switching processing method and device based on big data double activities and computer equipment
CN115033551A (en) Database migration method and device, electronic equipment and storage medium
CN112445653A (en) Multi-time-window hybrid fault-tolerant cloud storage method, device and medium
CN103092730B (en) A kind of information storage and read method
GB2565932B (en) Storing data in dispersed storage network with consistency
CN115599315B (en) Data processing method, device, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200414