CN113672161A - Storage system and establishing method thereof - Google Patents

Storage system and establishing method thereof

Info

Publication number
CN113672161A
Authority
CN
China
Prior art keywords
storage
servers
data
rule
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010402026.4A
Other languages
Chinese (zh)
Inventor
徐佳宏
陈华兵
黄金龙
曾珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ipanel TV Inc
Original Assignee
Shenzhen Ipanel TV Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ipanel TV Inc
Priority to CN202010402026.4A
Publication of CN113672161A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a storage system and an establishment method thereof, wherein the storage system is composed of a plurality of servers, each server comprises a plurality of storage units, all the storage units of all the servers are divided into a plurality of groups of storage combinations according to preset rules and data backup rules, and the preset rules comprise: rule 1: the storage units in the storage module are distributed on all the servers; rule 2: the storage units in the same storage module are preferably selected from the servers with the most remaining disks. Therefore, the storage units in the storage modules in each group of storage combination are distributed in all the servers as evenly as possible, and the storage modules in each group of storage combination at least comprise one backup, so that the influence on the accessibility of the storage system when one or more servers are damaged or cannot be used is greatly reduced, and the high availability of the access system is ensured. In addition, the storage system is provided with a plurality of storage units in each server, which is beneficial to ensuring high storage capacity of the storage system.

Description

Storage system and establishing method thereof
Technical Field
The present application relates to the field of computer application technologies, and in particular, to a storage system and a method for establishing the same.
Background
With the advent of the big data age, data storage and reading became the basis for big data applications and analytics.
At present, a high-capacity storage system is mostly realized by building a distributed storage server from a plurality of servers. However, developing a distributed storage server is costly and time-consuming; it generally takes about 40 person-months (i.e. 40 months for one developer, 20 months for two developers, and so on).
In the application of a distributed storage server, those skilled in the art also focus on how to ensure high availability of access to the data stored in the system when a server in the system is powered off, is restarting, or is unreachable over the network, or when a program crashes or restarts because of a bug, so that no server responds abnormally because its access capacity is exceeded by an "access storm".
Disclosure of Invention
In order to solve the above technical problem, the present application provides a storage system and an establishment method thereof, so as to achieve the purpose of ensuring high storage capacity of the storage system and high availability of data access.
In order to achieve the technical purpose, the embodiment of the application provides the following technical scheme:
a storage system, comprising:
a plurality of servers, each of the servers including a plurality of storage units;
all storage units of all the servers are divided into a plurality of groups of storage combinations according to preset rules and data backup rules; each storage combination comprises M storage modules, 1 check module and N spare storage units; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit for storing data, and the check module is a storage unit for storing check data;
the preset rules include:
rule 1: the storage units in the storage module are distributed on all the servers;
rule 2: the storage units in the same storage module are preferentially selected from the servers with the most residual disks;
the data backup rules include X backup, wherein X is a positive integer greater than or equal to 1.
Optionally, the storage unit includes a single solid state disk or a disk group composed of a plurality of solid state disks.
Optionally, when there are a plurality of grouping schemes satisfying rule 1 and rule 2, the preset rule further includes:
rule 3: when half of the servers in the plurality of servers are broken, the data access hit rate is highest among all the grouping schemes.
Optionally, the check module is configured to store check data, where the check data is used to check preset data;
the preset data includes: data stored in servers other than the server where the check module is located.
Optionally, the storage system further includes: an access request positioning module;
the access request positioning module is used for acquiring an access request, determining a storage combination to be accessed according to the access request, and accessing target data corresponding to the access request in the storage combination to be accessed; and for synchronizing the target data into the storage system when any of the servers in the storage combination to be accessed does not include the target data.
Optionally, when determining the storage combination to be accessed according to the access request, the access request positioning module is specifically used for performing a hash calculation on the uniform resource locator included in the access request to obtain a hash value, and taking the hash value modulo the number of storage combinations to obtain the identifier of the storage combination to be accessed.
A method for establishing a storage system comprises the following steps:
providing a plurality of servers, each of the servers comprising a plurality of storage units;
grouping all storage units of all the servers according to a preset rule and a data backup rule to obtain a plurality of groups of storage combinations; each storage combination comprises M storage modules, 1 check module and N spare storage units; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit for storing data, and the check module is a storage unit for storing check data;
the preset rules include:
rule 1: the storage units in the storage module are distributed on all the servers;
rule 2: the storage units in the same storage module are preferentially selected from the servers with the most residual disks;
the data backup rules include X backup, wherein X is a positive integer greater than or equal to 1.
Optionally, the storage unit includes a single solid state disk or a disk group composed of a plurality of solid state disks.
Optionally, when there are a plurality of grouping schemes satisfying rule 1 and rule 2, the preset rule further includes:
rule 3: when half of the servers in the plurality of servers are broken, the data access hit rate is highest among all the grouping schemes.
Optionally, the method further includes:
storing check data in the check module, wherein the check data is used for checking preset data;
the preset data includes: data stored in servers other than the server where the check module is located.
It can be seen from the foregoing technical solutions that, an embodiment of the present application provides a storage system and an establishment method thereof, where the storage system is composed of a plurality of servers, each server includes a plurality of storage units, all the storage units of all the servers are divided into a plurality of groups of storage combinations according to preset rules and data backup rules, and the preset rules include: rule 1: the storage units in the storage module are distributed on all the servers; rule 2: the storage units in the same storage module are preferably selected from the servers with the most remaining disks. Therefore, the storage units in the storage modules in each group of storage combination are distributed in all the servers as evenly as possible, and the storage modules in each group of storage combination at least comprise one backup, so that the influence on the accessibility of the storage system when one or more servers are damaged or cannot be used is greatly reduced, and the high availability of the access system is ensured.
In addition, the storage system is provided with a plurality of storage units in each server, which is beneficial to ensuring the high storage capacity of the storage system.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a storage system according to an embodiment of the present application;
FIG. 2 is a block diagram of a RAID5 architecture according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a storage system according to another embodiment of the present application;
FIG. 4 is a flowchart illustrating a method for establishing a storage system according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
An embodiment of the present application provides a storage system, as shown in fig. 1, including:
a plurality of servers 20, each of the servers 20 including a plurality of storage units 10;
all the storage units 10 of all the servers 20 are divided into a plurality of groups of storage combinations according to preset rules and data backup rules; each storage combination comprises M storage modules, 1 check module and N spare storage units 10; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit 10 for storing data, and the check module is a storage unit 10 for storing check data;
the preset rules include:
rule 1: the storage units 10 in the storage module are distributed on all the servers 20;
rule 2: the storage units 10 in the same storage module are preferably selected from the servers 20 with the most residual disks;
the data backup rules include X backup, wherein X is a positive integer greater than or equal to 1.
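As an illustration only, rule 1 and rule 2 can be read as a greedy allocation. The sketch below is not taken from the application: the function name `allocate_groups`, the order in which the two rules are combined, and the tie-break on the lowest server index are all assumptions. For each disk of a group it prefers the server that holds the fewest disks of that group so far (rule 1) and, among those, the server with the most remaining disks (rule 2).

```python
from collections import Counter


def allocate_groups(disks_per_server, group_size):
    """Greedily assign every disk to a storage combination (hypothetical helper).

    disks_per_server: number of disks in each server.
    group_size: number of disks per storage combination.
    Returns one list per server holding the group ID of each of its disks.
    """
    remaining = list(disks_per_server)
    layout = [[] for _ in disks_per_server]
    total = sum(disks_per_server)
    if total % group_size != 0:
        raise ValueError("disks do not divide evenly into groups")

    for group_id in range(1, total // group_size + 1):
        used = Counter()  # disks of this group already placed, per server
        for _ in range(group_size):
            candidates = [s for s, left in enumerate(remaining) if left > 0]
            # Rule 1: prefer servers holding the fewest disks of this group.
            # Rule 2: among those, prefer the server with the most remaining disks.
            # Assumed tie-break: lowest server index.
            server = min(candidates, key=lambda s: (used[s], -remaining[s], s))
            layout[server].append(group_id)
            remaining[server] -= 1
            used[server] += 1
    return layout


layout = allocate_groups([4, 4, 4], group_size=4)
for name, groups in zip(["cdnvss1", "cdnvss2", "cdnvss3"], layout):
    print(name, groups)
```

For 3 servers with 4 disks each and 4 disks per group, this sketch yields [1, 1, 2, 3], [1, 2, 2, 3] and [1, 2, 3, 3], which matches the adjusted allocation of the 3-server example given later in this description. It does not take rule 3 into account.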
In this embodiment, as shown in fig. 2, the storage modules and the check module in each storage combination may form a RAID5 (independent disks with distributed parity) array. For example, when a storage combination includes 3 storage units 10, two of the storage units 10 serve as storage modules and the remaining storage unit 10 serves as the check module, so the number of spare storage units 10 in the storage combination is 0. As another example, when a storage combination includes 8 storage units 10, six storage units 10 may serve as storage modules, one storage unit 10 as the check module, and the last storage unit 10 as a spare storage unit 10. When a storage unit 10 is damaged or fails, the spare storage unit 10 can recover the data from the check data stored in the check module and the data stored in the other storage units 10 of the storage combination, and thus replace the damaged storage unit 10, which maintains high reliability of the storage combination. RAID5 can also improve the read and write performance of data. In fig. 2, Disk 0, Disk 1, Disk 2 and Disk 3 represent different storage units 10; A1, B1 and C1 represent the data stored in Disk 0; A2, B2 and D1 represent the data stored in Disk 1; A3, C2 and D2 represent the data stored in Disk 2; B3, C3 and D3 represent the data stored in Disk 3; Dp represents the check data in Disk 0, Cp the check data in Disk 1, Bp the check data in Disk 2, and Ap the check data in Disk 3.
Still referring to fig. 2, in each storage combination, the check module is configured to store check data, and the check data is used for checking preset data;
the preset data includes: data stored in a server 20 other than the server 20 where the check module is located.
The check data may include parity information, and when a storage unit 10 in a storage combination is damaged, the storage unit 10 may be replaced with a spare storage unit 10, and when the damaged storage unit 10 is replaced, the data on the replaced storage unit 10 may be reconstructed by using parity information remaining in other storage units 10 in the same storage combination, so as to maintain high reliability of the storage combination.
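The reconstruction described above can be illustrated with XOR parity, which is the parity scheme conventionally used by RAID5. The following is a minimal sketch under that assumption; the block contents and the helper name `xor_blocks` are hypothetical and are not part of the application.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of one stripe spread over three storage modules.
data_blocks = [b"unit-A", b"unit-B", b"unit-C"]
parity = xor_blocks(data_blocks)  # check data kept by the check module

# Simulate losing the second block and rebuilding it from the survivors.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the missing block, which is what a spare storage unit 10 would be filled with.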
Generally, the number of storage units 10 that can be mounted in each server 20 is limited. For example, when each storage unit 10 is a single solid state disk, each server 20 can mount 24 solid state disks.
Optionally, to improve read and write speed, the storage unit 10 may include a single solid state disk or a disk group composed of a plurality of solid state disks. The read speed of current solid state disks can reach 500 MB/s, which greatly improves the response speed of the system during startup, loading and transmission; the write speed can reach 450 MB/s, so writing does not become a performance bottleneck when media assets are downloaded from a source station and written into the storage system, and the write speed is better matched to the network download speed.
Assume that the storage capacity of one solid state disk is 960 GB and that 24 solid state disks are inserted into the disk slots of a single server 20. When the solid state disks are not grouped into RAID5 arrays, the space of each solid state disk is its own storage capacity, i.e. 960 GB. When a RAID5 array is used and M is 6, the available disk space, i.e. the capacity of the storage combination, is 6 × 960 GB = 5760 GB.
In addition, when a plurality of storage combinations are obtained by grouping, the storage combinations may be assigned IDs starting from 1. For example, when the number of storage combinations is 6, the IDs of the 6 storage combinations may be 1, 2, 3, 4, 5, 6, respectively.
On the basis of the above embodiment, in an embodiment of the present application, when multiple grouping schemes satisfy rule 1 and rule 2, the preset rule further includes:
rule 3: when half of the servers 20 of the plurality of servers 20 are damaged, the data access hit rate is highest among all the grouping schemes.
The data access hit rate is the ratio of the number of storage combinations that can still be hit to the total number of storage combinations; it is generally expressed as a percentage with one decimal place, the value being rounded when the division does not terminate.
When the total number of servers 20 is even, the maximum number of damaged servers 20 considered does not exceed half of the total number of servers 20, i.e. 50%; when the total number of servers 20 is odd, the maximum number of damaged servers 20 does not exceed half of the total number of servers 20 (rounded down) plus 1.
That is, the maximum number of damaged servers 20 is calculated as:
(total number of servers 20) / 2 + (((total number of servers 20) % 2) == 0 ? 0 : 1);
wherein / denotes integer division and % denotes the remainder (modulo) operation; the expression ((total number of servers 20) % 2) == 0 ? 0 : 1 tests whether the remainder of dividing the total number of servers 20 by 2 equals 0; if so, the expression evaluates to 0, otherwise it evaluates to 1.
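The formula above can be written as a small function. This is only a sketch; the function name `max_failed_servers` is a hypothetical choice, and integer division is assumed for the division by 2.

```python
def max_failed_servers(total_servers):
    """Largest number of damaged servers considered by rule 3."""
    return total_servers // 2 + (0 if total_servers % 2 == 0 else 1)


for total in (3, 4, 5):
    print(total, max_failed_servers(total))
```

With 3 servers the result is 2, with 4 servers it is 2, and with 5 servers it is 3.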
The following example illustrates a specific grouping rule:
If there are 3 servers 20, each with 4 solid state disks, the backup rule is 3 backups (that is, the data stored in the storage modules of each storage combination has 3 backups), and 4 solid state disks form one group, then there are 3 servers 20 and 12 solid state disks in total, the number of groups is 3, and the number of spare storage units 10 is 0.
One possible grouping is then as follows:
Server 20 name    Disk groups
[cdnvss1]         [1][1][2][2]
[cdnvss2]         [1][2][3][3]
[cdnvss3]         [1][2][3][3]
Here, cdnvss1, cdnvss2 and cdnvss3 are the names of the 3 servers 20, and each [X] in the disk group column indicates that one solid state disk of that server 20 is assigned to the storage combination with ID X, where X is the group number and starts from 1; for example, [1] indicates that the solid state disk is assigned to the storage combination with ID 1.
In this grouping scheme, when one server 20 is damaged, the data access hit rate (hit rate for short) is 100.0%, and when two servers 20 are damaged, the hit rate is 66.7%.
In this grouping scheme, every disk has a backup on another server 20, but group [3] has no disk on server [cdnvss1], i.e. rule 2 is not satisfied. The allocation should also satisfy the requirement of preferentially selecting disks from the servers 20 with the most remaining disks. If the last disk of group [2] had been selected preferentially from either [cdnvss1] or [cdnvss2], the allocation would have had no such problem.
According to the adjustment, the final dynamic allocation result is as follows:
Server 20 name    Disk groups
[cdnvss1]         [1][1][2][3]
[cdnvss2]         [1][2][2][3]
[cdnvss3]         [1][2][3][3]
With this allocation, when 1 server 20 is damaged, the hit rate is 100.0%.
When 2 servers 20 are damaged, the hit rate is still 100.0%.
This is the best case: even when half of the servers 20 (rounded up, namely 2 of the 3 servers 20) are damaged, the hit rate is still 100%.
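The hit rates quoted for the two 3-server allocations can be checked with a short script. This is a sketch under one assumption that the text does not state explicitly: the quoted hit rate is taken as the worst case over every possible choice of damaged servers. The function name `worst_case_hit_rate` is hypothetical.

```python
from itertools import combinations


def worst_case_hit_rate(layout, failed_count):
    """Lowest hit rate over every choice of `failed_count` damaged servers.

    layout maps a server name to the group IDs of its disks.
    """
    all_groups = {g for groups in layout.values() for g in groups}
    worst = 1.0
    for failed in combinations(layout, failed_count):
        reachable = {g for name, groups in layout.items()
                     if name not in failed for g in groups}
        worst = min(worst, len(reachable) / len(all_groups))
    return worst


first = {"cdnvss1": [1, 1, 2, 2], "cdnvss2": [1, 2, 3, 3], "cdnvss3": [1, 2, 3, 3]}
adjusted = {"cdnvss1": [1, 1, 2, 3], "cdnvss2": [1, 2, 2, 3], "cdnvss3": [1, 2, 3, 3]}

for label, layout in (("first", first), ("adjusted", adjusted)):
    rates = [round(100 * worst_case_hit_rate(layout, k), 1) for k in (1, 2)]
    print(label, rates)
# first [100.0, 66.7]
# adjusted [100.0, 100.0]
```

In the first allocation the worst case arises when cdnvss2 and cdnvss3 are both damaged, because cdnvss1 holds no disk of group 3.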
As another example, if there are 4 servers 20, each with 3 solid state disks, the backup rule is 1 backup (that is, the data stored in the storage modules of each storage combination has 1 backup), and 2 solid state disks form one group, then there are 4 servers 20 and 12 solid state disks in total, the number of groups is 6, and the number of spare storage units 10 is 0.
The grouping schemes that are possible at this time include:
scheme (1):
Server 20 name    Disk groups
[cdnvss1]         [1][4][5]
[cdnvss2]         [1][2][5]
[cdnvss3]         [2][3][6]
[cdnvss4]         [3][4][6]
In this scheme, when one server 20 is broken, the hit rate is 100.0%;
when two servers 20 are broken, the hit rate is 66.7%.
Scheme (1) satisfies both rule 1 and rule 2, but other grouping schemes also exist, so rule 3 needs to be considered.
Scheme (2):
Server 20 name    Disk groups
[cdnvss1]         [1][2][3]
[cdnvss2]         [1][4][5]
[cdnvss3]         [2][4][6]
[cdnvss4]         [3][5][6]
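In this scheme, when one server 20 is broken, the hit rate is 100.0%;
when two servers 20 are broken, the hit rate is 83.3% (any two servers 20 share exactly one storage combination, so 5 of the 6 storage combinations can still be hit).
Scheme (2) is therefore better according to rule 3 and can be taken as the final grouping scheme.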
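Rule 3 can be sketched as a comparison of candidate schemes by their hit rate when the maximum number of servers is damaged. The sketch below is illustrative only: it assumes the hit rate is evaluated in the worst case over all choices of damaged servers, and the names `worst_case_hit_rate` and `pick_by_rule_3` are hypothetical.

```python
from itertools import combinations


def worst_case_hit_rate(layout, failed_count):
    """Lowest fraction of groups still reachable with `failed_count` servers down."""
    all_groups = {g for groups in layout.values() for g in groups}
    rates = []
    for failed in combinations(layout, failed_count):
        reachable = {g for name, groups in layout.items()
                     if name not in failed for g in groups}
        rates.append(len(reachable) / len(all_groups))
    return min(rates)


def pick_by_rule_3(schemes):
    """Return the name of the scheme with the best worst-case hit rate."""
    server_count = len(next(iter(schemes.values())))
    # Maximum number of damaged servers: half of the servers, rounded up.
    failed = server_count // 2 + server_count % 2
    return max(schemes, key=lambda name: worst_case_hit_rate(schemes[name], failed))


schemes = {
    "scheme (1)": {"cdnvss1": [1, 4, 5], "cdnvss2": [1, 2, 5],
                   "cdnvss3": [2, 3, 6], "cdnvss4": [3, 4, 6]},
    "scheme (2)": {"cdnvss1": [1, 2, 3], "cdnvss2": [1, 4, 5],
                   "cdnvss3": [2, 4, 6], "cdnvss4": [3, 5, 6]},
}
print(pick_by_rule_3(schemes))  # scheme (2)
```

Scheme (2) wins because no two of its servers hold the same pair of storage combinations, whereas in scheme (1) cdnvss1 and cdnvss2 both hold groups 1 and 5.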
On the basis of the above-mentioned embodiment, in another embodiment of the present application, as shown in fig. 3, the storage system further includes:
an access request location module 30;
the access request positioning module 30 is configured to obtain an access request, determine a storage combination to be accessed according to the access request, and access target data corresponding to the access request in the storage combination to be accessed; and for synchronizing the target data to the storage system when any of the servers 20 in the storage combination to be accessed does not include the target data.
Specifically, when determining the storage combination to be accessed according to the access request, the access request positioning module 30 is specifically used for performing a hash calculation on the uniform resource locator included in the access request to obtain a hash value, and taking the hash value modulo the number of storage combinations to obtain the identifier of the storage combination to be accessed.
This embodiment defines a specific way of locating access requests. Still taking the RAID5 array as an example, the 24 solid state disks of a server are divided into 3 storage combinations of 8 solid state disks each; each storage combination uses a 6+1 RAID5 array, and the remaining 3 solid state disks (one per storage combination) serve as spare storage units. For data caching, the three storage combinations are given the same storage space, and a RAID5 array is built on the selected 6+1 solid state disks. At 960 GB per solid state disk, the capacity of one storage combination is 6 × 960 GB = 5760 GB (about 5.76 TB), so the total capacity is 3 × 5.76 TB = 17.28 TB. For brevity, a group of solid state disks that has been built into an array is hereinafter simply referred to as one solid state disk.
When four servers are built into one storage system (cluster) and each RAID5 (6+1) array is treated as one large-capacity disk group, a single server provides 3 disk groups and the 4 servers provide 12 disk groups in total. With dual backup, these can be divided into 6 storage combinations, in each of which the two disk groups hold the same stored data, so as to achieve better access balancing and high availability.
The VSS service program and the CDNVSS sink service program are deployed together on each server. To reduce internal traffic between servers, traffic that enters through the CDNVSS sink service program of server A is not allowed to be output to the client by the VSS program of server B or C. Instead, the VSS program and the CDNVSS sink service program exchange data over the internal network card of the same server and output it there; that is, the service programs are numbered VSS-X and CDNVSS-X, and programs with the same value of X are deployed on the same server.
When a client initiates a request and a server of the CDN cluster receives the request URL, a hash calculation is performed on the URL to obtain a hash value, and a remainder is then obtained by a modulo operation. With 4 servers, each having 3 solid state disks, the number of groups under dual backup is 3 × 4 / 2 = 6, i.e. GROUP_ID takes values in the range [1 … 6].
Calculation formula:
HASH(URL) % N = GROUP_ID (where N equals the largest storage combination ID, i.e. 6)
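The calculation can be sketched as follows. The description does not name the hash function, so CRC32 from Python's standard `zlib` module is used here purely as a stand-in, and the function names and the example URL are hypothetical. The sketch returns the raw remainder; how a remainder of 0 would map onto the ID range [1 … 6] is not specified in the description and is left open here.

```python
import zlib


def group_id_from_hash(hash_value, group_count):
    """Apply HASH(URL) % N to an already computed hash value."""
    return hash_value % group_count


def group_id_for_url(url, group_count):
    # Assumption: CRC32 stands in for the unspecified hash function.
    return group_id_from_hash(zlib.crc32(url.encode("utf-8")), group_count)


# Hash value quoted in the description: 9275293 % 6 gives 1.
print(group_id_from_hash(9275293, 6))
print(group_id_for_url("http://example.com/media/asset.ts", 6))
```

Because the same URL always yields the same hash value, every server in the cluster arrives at the same storage combination for a given request without any shared lookup table.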
When a program is requested from the VSS-1 program on server A and a database lookup shows that the target data does not exist, the CDNVSS-1 sink service program needs to synchronize the target data to the storage system.
In this embodiment, it is only necessary to know that, when the target data is not in the cluster, it must be fetched back from the source station and synchronized to the storage system.
A request is initiated to the VSS-1 program. When the media resource does not exist, the request is sent to the CDNVSS-1 sink service program according to the one-to-one correspondence. The hash value of the media resource URL is then calculated, giving HASH(URL) = 9275293, and the group is located with the calculation formula: 9275293 MOD 6 = 1. Through the hash calculation and modulo operation, the VSS-1 program is thus located to server A (the CDNVSS-1 sink service program). The server can accept the request according to the identifier GROUP_ID = 1 of the storage combination to be accessed, and the corresponding target data is read and output according to the root path of the disk plus the access path.
When a program is requested from the VSS-4 program on server D, a database lookup shows that the media resource does not exist. When the request is sent to the CDNVSS-4 sink service program on the current server, HASH(URL) = 9275293 is obtained according to the calculation formula, and the group is located as 9275293 MOD 6 = 1, i.e. the storage combination with GROUP_ID = 1 must be found. According to the grouping information above, the current server does not hold a storage combination with this ID. A query shows that the storage combination with GROUP_ID = 1 exists on both server A and server B, so the simplest polling policy or a random policy is used to choose between them. Here server B is selected, and the information of the VSS-2 service program is returned through a 302 response. The VSS-4 service on server D then makes a request to the VSS-2 service on server B. The specific root path of the disk is then located with the same calculation formula, and the corresponding target data is read and output through the access path.
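The two cases above (the local server holds the storage combination, or the request must be redirected) can be sketched as follows. Only the fact that GROUP_ID = 1 exists on servers A and B comes from the text; the rest of the `GROUP_HOSTS` table, the function name, and the mapping of remainder 0 to N are assumptions made for illustration. A simple polling (round-robin) policy is used, which is one of the two policies the text mentions.

```python
from itertools import count
from typing import Dict, List, Tuple

# Hypothetical placement table. Only "group 1 is on A and B" is from the text.
GROUP_HOSTS: Dict[int, List[str]] = {
    1: ["A", "B"], 2: ["C", "D"], 3: ["A", "C"],
    4: ["B", "D"], 5: ["A", "D"], 6: ["B", "C"],
}

_poll = count()  # shared counter for the polling (round-robin) policy


def route(hash_value: int, local_server: str, n: int = 6) -> Tuple[str, str]:
    """Decide whether to serve locally or redirect (e.g. via a 302 response)."""
    gid = hash_value % n or n  # remainder 0 mapped to n (assumed convention)
    hosts = GROUP_HOSTS[gid]
    if local_server in hosts:
        return ("serve", local_server)
    # The local server does not hold this storage combination: pick a host by polling.
    target = hosts[next(_poll) % len(hosts)]
    return ("redirect", target)
```

For the hash value 9275293, a request arriving at server A is served locally, while a request arriving at server D is redirected to server A or server B in turn.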
The following describes a method for establishing a storage system provided in an embodiment of the present application, and the method for establishing a storage system described below may be referred to in correspondence with the storage system described above.
Correspondingly, an embodiment of the present application provides a method for establishing a storage system. With reference to FIG. 4, the method includes:
S101: providing a plurality of servers, each of the servers comprising a plurality of storage units;
S102: grouping all storage units of all the servers according to a preset rule and a data backup rule to obtain a plurality of groups of storage combinations; each storage combination comprises M storage modules, 1 verification module and N spare storage units; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit for storing data, and the verification module is a storage unit for storing verification data;
the preset rules include:
rule 1: the storage units in the storage module are distributed on all the servers;
rule 2: the storage units in the same storage module are preferentially selected from the servers with the most remaining disks;
the data backup rules include X backups, wherein X is a positive integer greater than or equal to 1.
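One possible reading of rule 1 and rule 2 is a greedy grouping in which each group takes one storage unit from each of the servers that currently have the most unassigned units. The sketch below follows that reading only; the function name, the tie-breaking by server name, and the interpretation of "X backups" as the number of copies per group are assumptions, not the method defined by the source.

```python
from typing import Dict, List


def build_groups(units_per_server: Dict[str, int], copies: int) -> List[List[str]]:
    """Greedy grouping sketch.

    Each group takes one unit from each of the `copies` servers that have
    the most remaining units (rule 2), always on distinct servers.
    """
    remaining = dict(units_per_server)
    groups: List[List[str]] = []
    while sum(remaining.values()) >= copies:
        # Servers with the most remaining units first; the name breaks ties.
        candidates = sorted(
            (s for s, r in remaining.items() if r > 0),
            key=lambda s: (-remaining[s], s),
        )
        if len(candidates) < copies:
            break  # not enough distinct servers left for another group
        chosen = candidates[:copies]
        for server in chosen:
            remaining[server] -= 1
        groups.append(chosen)
    return groups
```

With four servers of three units each and two copies per group, this yields six groups and uses every unit on every server exactly once.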
Optionally, the storage unit includes a single solid state disk or a disk group composed of a plurality of solid state disks.
Optionally, when there are a plurality of grouping schemes satisfying rule 1 and rule 2, the preset rule further includes:
rule 3: among all the grouping schemes, the scheme with the highest data access hit rate when half of the plurality of servers have failed is selected.
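Rule 3 can be illustrated by comparing two hypothetical grouping schemes for four servers with dual backup. The source does not define how the hit rate is aggregated over the possible failures, so this sketch assumes the worst case over every way half of the servers can fail, and counts a group as a hit if at least one of its copies is on a surviving server. Both schemes and the function name are illustrative.

```python
from itertools import combinations
from typing import Sequence


def worst_case_hit_rate(groups: Sequence[Sequence[str]], servers: Sequence[str]) -> float:
    """Lowest fraction of groups still reachable when half of the servers fail."""
    half = len(servers) // 2
    worst = 1.0
    for failed in combinations(servers, half):
        # A group is still reachable if any of its copies is on a live server.
        alive = sum(1 for g in groups if any(s not in failed for s in g))
        worst = min(worst, alive / len(groups))
    return worst


servers = ["A", "B", "C", "D"]
# Scheme 1: the same two server pairs are reused for every group.
paired = [["A", "B"], ["C", "D"]] * 3
# Scheme 2: every pair of servers holds exactly one group.
spread = [list(pair) for pair in combinations(servers, 2)]

print(worst_case_hit_rate(paired, servers), worst_case_hit_rate(spread, servers))
```

Under these assumptions the second scheme would be preferred, because any two failed servers take out only the single group that lives on exactly that pair.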
Optionally, the method further includes:
storing verification data in the verification module, wherein the verification data is used for verifying preset data;
the preset data includes: data stored on servers other than the server where the verification module is located.
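The source does not specify how the verification data is computed. Since RAID5 is used as the running example, a byte-wise XOR parity is assumed in the sketch below: the verification module holds the XOR of data blocks that live on other servers, so one lost block can be rebuilt from the survivors. The function names are hypothetical and the blocks are assumed to have equal length.

```python
from functools import reduce
from typing import Sequence


def xor_parity(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks (RAID5-style parity, assumed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(surviving_blocks: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Keeping the parity on a different server from the data it covers means that the loss of one server removes either a data block or the parity, but not both.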
To sum up, the embodiments of the present application provide a storage system and a method for establishing it. The storage system is composed of a plurality of servers, each server includes a plurality of storage units, and all the storage units of the servers are divided into a plurality of groups of storage combinations according to preset rules and data backup rules. The preset rules include: rule 1: the storage units in the storage module are distributed on all the servers; rule 2: the storage units in the same storage module are preferentially selected from the servers with the most remaining disks. As a result, the storage units in the storage modules of each storage combination are distributed across all the servers as evenly as possible, and the storage modules of each storage combination have at least one backup. This greatly reduces the impact on the accessibility of the storage system when one or more servers are damaged or unavailable, and ensures the high availability of the storage system.
In addition, the storage system is provided with a plurality of storage units in each server, which is beneficial to ensuring the high storage capacity of the storage system.
Features described in the embodiments in the present specification may be replaced with or combined with each other, each embodiment is described with a focus on differences from other embodiments, and the same and similar portions among the embodiments may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A storage system, comprising:
a plurality of servers, each of the servers including a plurality of storage units;
all storage units of all the servers are divided into a plurality of groups of storage combinations according to preset rules and data backup rules; each storage combination comprises M storage modules, 1 verification module and N spare storage units; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit for storing data, and the verification module is a storage unit for storing verification data;
the preset rules include:
rule 1: the storage units in the storage module are distributed on all the servers;
rule 2: the storage units in the same storage module are preferentially selected from the servers with the most remaining disks;
the data backup rules include X backups, wherein X is a positive integer greater than or equal to 1.
2. The storage system of claim 1, wherein the storage unit comprises a single solid state hard disk or a disk group consisting of a plurality of solid state hard disks.
3. The storage system according to claim 1, wherein when there are a plurality of grouping schemes satisfying rule 1 and rule 2, the preset rule further includes:
rule 3: among all the grouping schemes, the scheme with the highest data access hit rate when half of the plurality of servers have failed is selected.
4. The storage system according to claim 1, wherein the verification module is configured to store verification data, and the verification data is used for verifying preset data;
the preset data includes: data stored on servers other than the server where the verification module is located.
5. The storage system of claim 1, further comprising: an access request positioning module;
the access request positioning module is used for acquiring an access request, determining a storage combination to be accessed according to the access request, and accessing target data corresponding to the access request in the storage combination to be accessed; and for synchronizing the target data into the storage system when none of the servers in the storage combination to be accessed includes the target data.
6. The storage system according to claim 5, wherein, in determining the storage combination to be accessed according to the access request, the access request positioning module is specifically configured to perform hash calculation on a uniform resource locator included in the access request to obtain a hash value, and to perform a modulo operation on the hash value with respect to the number of storage combinations to obtain the identifier of the storage combination to be accessed.
7. A method for establishing a storage system, comprising:
providing a plurality of servers, each of the servers comprising a plurality of storage units;
grouping all storage units of all the servers according to a preset rule and a data backup rule to obtain a plurality of groups of storage combinations; each storage combination comprises M storage modules, 1 verification module and N spare storage units; wherein M is a positive integer greater than or equal to 2, and N is an integer greater than or equal to 0; the storage module is a storage unit for storing data, and the verification module is a storage unit for storing verification data;
the preset rules include:
rule 1: the storage units in the storage module are distributed on all the servers;
rule 2: the storage units in the same storage module are preferentially selected from the servers with the most remaining disks;
the data backup rules include X backups, wherein X is a positive integer greater than or equal to 1.
8. The method of claim 7, wherein the storage unit comprises a single solid state hard disk or a disk group consisting of a plurality of solid state hard disks.
9. The method according to claim 7, wherein when there are a plurality of grouping schemes satisfying rule 1 and rule 2, the preset rule further comprises:
rule 3: among all the grouping schemes, the scheme with the highest data access hit rate when half of the plurality of servers have failed is selected.
10. The method of claim 7, further comprising:
storing verification data in the verification module, wherein the verification data is used for verifying preset data;
the preset data includes: data stored on servers other than the server where the verification module is located.
CN202010402026.4A 2020-05-13 2020-05-13 Storage system and establishing method thereof Pending CN113672161A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010402026.4A CN113672161A (en) 2020-05-13 2020-05-13 Storage system and establishing method thereof

Publications (1)

Publication Number Publication Date
CN113672161A true CN113672161A (en) 2021-11-19

Family

ID=78536883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010402026.4A Pending CN113672161A (en) 2020-05-13 2020-05-13 Storage system and establishing method thereof

Country Status (1)

Country Link
CN (1) CN113672161A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117093150A (en) * 2023-08-24 2023-11-21 合芯科技(苏州)有限公司 Storage system, and configuration method and device of storage system
CN117093150B (en) * 2023-08-24 2024-02-09 合芯科技(苏州)有限公司 Storage system, and configuration method and device of storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination