CN117539383A - Distributed storage architecture for high-load backup system and deployment method thereof - Google Patents
- Publication number
- CN117539383A (application number CN202311333462.0A)
- Authority
- CN
- China
- Prior art keywords
- haproxy
- nodes
- ceph
- rgw
- storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0607—Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1095—Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a distributed storage architecture for a high-load backup system and a deployment method thereof. The distributed storage architecture comprises a Ceph-RGW storage system which is interactively connected with a plurality of Haproxy nodes, the Haproxy nodes are interactively connected with an application end, and the application end is connected with a DNS server. The deployment method comprises the following steps: deploying Ceph-RGW, wherein Ceph provides an object storage service for Internet cloud service providers through RGW; deploying Haproxy to realize load balancing; and deploying Keepalived on the Haproxy nodes, then deploying the DNS service to complete the construction of the distributed storage architecture. Compared with the prior art, Keepalived is used to achieve high availability of Haproxy, Haproxy is used to achieve high availability and load balancing of the Ceph object storage gateway service, and a distributed network storage system is adopted to expand the system structure, so that a highly available object storage service can be operated and the high-load backup and archiving scenario is optimized.
Description
Technical Field
The invention relates to the technical field of distributed storage, in particular to a distributed storage architecture for a high-load backup system and a deployment method thereof.
Background
A traditional file system accesses the hardware medium that stores the data directly, and the medium neither knows nor needs to know how the data is organized and structured. The organization it adopts is therefore simple: all data is divided into blocks of fixed size, and each block is assigned a number for addressing. Taking a mechanical hard disk as an example, a block is a sector: 512 bytes on older disks and 4 KB on newer ones. Older hard disks are addressed by a Cylinder-Head-Sector (CHS) number, while modern hard disks use a Logical Block Address (LBA). A hard disk is therefore often called a Block Device; other block devices besides hard disks include floppy disks of different specifications, optical discs of various specifications, magnetic tapes, and the like.
For ease of management, a block device such as a hard disk is usually divided into several logical block devices, i.e. hard disk partitions (Partition). Conversely, because the capacity and performance of a single medium are limited, multiple physical block devices can also be combined into one logical block device by technical means such as RAID at various levels, JBOD, or the volume management systems (Volume Manager) of some operating systems, e.g. dynamic disks on Windows or LVM on Linux.
In network storage, a server exposes a local logical block device, which may be part of a physical block device, a combination of multiple physical block devices, part of such a combination, or even a file on the local file system, as a block device through a certain protocol, and a remote client uses the same protocol to treat that logical block device as a local storage medium: partitioning it, formatting its own file system on it, and so on, thereby achieving block storage. In this way, data can be protected by means of RAID, LVM and the like; several inexpensive hard disks can be combined into a high-capacity logical disk that provides service externally, increasing capacity; when writing data, a logical disk composed of multiple physical disks can write to several hard disks in parallel, improving read/write efficiency; and block storage often uses a SAN architecture for networking, whose transmission speed and encapsulation protocols further improve transfer speed and read/write efficiency.
In actual operation, networking with a SAN architecture requires purchasing additional Fibre Channel cards for the hosts as well as fibre switches, so the cost is high. Data cannot be shared between hosts: when the servers are not clustered, a raw block volume is mapped to a single host, and after being formatted it is equivalent to a local disk of that host, so the "local disk" of host A cannot be used by host B at all and data cannot be shared. This is also unfavorable for sharing data between hosts running different operating systems, because different operating systems use different file systems and data cannot be shared across them once formatted; for example, Windows 7 uses FAT32/NTFS while Linux uses EXT4, and these file systems do not recognize each other by default.
The performance of traditional storage is easily limited by network bandwidth, while a backup system is characterized by high bandwidth and frequent read/write operations, which puts traditional storage under considerable pressure. In addition, because devices from different vendors are managed and used in different ways, resources cannot be managed uniformly or scheduled elastically, which also leads to low storage utilization.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a distributed storage architecture for a high-load backup system and a deployment method thereof, which can operate a highly available object storage service and thereby optimize the high-load backup and archiving scenario.
The aim of the invention can be achieved by the following technical scheme: the distributed storage architecture for the high-load backup system comprises a Ceph-RGW storage system, wherein the Ceph-RGW storage system is respectively and interactively connected with a plurality of Haproxy nodes, the Haproxy nodes are interactively connected with an application end, the application end is connected with a DNS server, the Ceph-RGW storage system is used for converting and then performing distributed storage on a request sent by the application end, the Haproxy nodes are used for realizing load balancing, and the DNS server is used for performing domain name resolution.
Further, the Ceph-RGW storage system comprises a Ceph storage cluster and a plurality of RGW nodes, wherein the RGW nodes are respectively and interactively connected with a plurality of Haproxy nodes, the RGW nodes are respectively and interactively connected with the Ceph storage cluster to analyze preset protocol data from a request sent by an application end, and then the corresponding request is sent to the Ceph storage cluster.
Further, the Ceph storage cluster is provided with a plurality of MON nodes and OSD nodes, the MON nodes are used for monitoring health states of the self and other components in the cluster, and the OSD nodes are used for realizing object storage processes.
A distributed storage deployment method for a high load backup system, comprising the steps of:
s1, deploying Ceph-RGW, wherein the Ceph provides object storage service for an Internet cloud service provider through the RGW;
s2, deploying Haproxy for realizing load balancing;
and S3, deploying Keepalived on the Haproxy nodes, and then deploying the DNS service to complete the construction of the distributed storage architecture.
Further, the specific process of step S1 is as follows:
s11, deploying a plurality of RGW nodes, and setting a Ceph storage cluster;
s12, respectively and interactively connecting the RGW nodes with the Ceph storage cluster.
Further, the RGW node is provided with a RestAPI for accessing the Ceph storage cluster, so that a request conforming to a preset protocol of an upper layer application is converted into a request of rados, and then data is stored in the rados cluster.
Further, the step S2 is specifically to deploy a plurality of Haproxy nodes, where each Haproxy node is interactively connected with a plurality of RGW nodes.
Further, the process of deploying the Haproxy node in the step S2 is as follows:
logging in nodes needing to be deployed with Haproxy to execute the installation command respectively;
performing a software configuration operation: closing the firewall on the Haproxy node;
modifying the Haproxy configuration file /etc/haproxy/haproxy.cfg;
configuring the Haproxy log service and creating /etc/rsyslog.d/haproxy.conf;
restarting the Haproxy and rsyslog services;
and checking and confirming the Haproxy running state.
Further, the specific process of deploying Keepalived on the Haproxy nodes in step S3 is as follows:
installing the Keepalived software package on each Haproxy node;
creating a script directory on each Haproxy node;
placing status_haproxy.sh and haproxy_master.sh under /var/lib/keepalived/scripts/ and modifying their permissions to be executable;
modifying /etc/keepalived/keepalived.conf on one of the Haproxy nodes;
modifying /etc/keepalived/keepalived.conf on the remaining Haproxy nodes;
performing a validation check after the Keepalived service is started on all Haproxy nodes.
Further, the specific process of deploying DNS service in step S3 is:
(1) Installing software;
(2) Configuring a firewall;
(3) Modifying the configuration file:
modifying the bind configuration file /etc/named.conf on the master and slave DNS servers;
creating the zone record file /var/named/cephs3.com.zone on the master server and the slave server, respectively;
creating the PTR zone file /var/named/12.0.10.in-addr.arpa.zone on the master server and the slave server, respectively;
adding the zone definitions to the bind configuration file /etc/named.rfc1912.zones of the primary DNS server;
adding the zone definitions to the bind configuration file /etc/named.rfc1912.zones of the slave DNS server;
restarting the bind service on the master and slave DNS servers;
configuring the two DNS servers on each client that needs to access the S3 service using obs.cephs3.com.
Compared with the prior art, the invention has the following advantages:
1. according to the invention, a Ceph-RGW storage system is arranged, the Ceph-RGW storage system is respectively and interactively connected with a plurality of Haproxy nodes, the Haproxy nodes are respectively and interactively connected with an application end, the application end is connected with a DNS server, and a request sent by the application end is converted and then distributed and stored by using the Ceph-RGW storage system; load balancing is achieved by utilizing Haproxy nodes; domain name resolution is performed using DNS servers. Therefore, the distributed storage scheme suitable for the high-load backup archiving scene can realize high-availability object storage service, not only improves the reliability, availability and access efficiency of storage, but also is easy for subsequent system expansion.
2. When the designed distributed storage architecture is deployed, Ceph-RGW is deployed first, with Ceph providing the object storage service for Internet cloud service providers through RGW; Haproxy is then deployed to realize load balancing; Keepalived is deployed on the Haproxy nodes; and finally the DNS service is deployed to complete the construction of the distributed storage architecture. In this way, Keepalived provides high availability for Haproxy, while Haproxy provides high availability and load balancing for the Ceph object storage gateway service, effectively improving storage efficiency and stability.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of an application process of an embodiment;
FIG. 3 is a schematic diagram illustrating the operation of Ceph-RGW according to an embodiment;
fig. 4 is a schematic diagram of a distributed storage architecture constructed in an embodiment.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples.
Examples
The distributed storage architecture for the high-load backup system comprises a Ceph-RGW storage system, wherein the Ceph-RGW storage system is respectively and interactively connected with a plurality of Haproxy nodes, the Haproxy nodes are respectively and interactively connected with an application end, the application end is connected with a DNS server, the Ceph-RGW storage system is used for converting and then performing distributed storage on a request sent by the application end, the Haproxy nodes are used for realizing load balancing, and the DNS server is used for performing domain name resolution.
The Ceph-RGW storage system comprises a Ceph storage cluster and a plurality of RGW nodes, wherein the RGW nodes are respectively and interactively connected with a plurality of Haproxy nodes, the RGW nodes are respectively and interactively connected with the Ceph storage cluster to analyze preset protocol data from a request sent by an application end, and then the corresponding request is sent to the Ceph storage cluster.
The Ceph storage cluster is provided with a plurality of MON nodes and OSD nodes, the MON nodes are used for monitoring the health states of the self and other components in the cluster, and the OSD nodes are used for realizing the object storage process.
In order to realize the above distributed storage architecture, the present invention further provides a distributed storage deployment method for a high-load backup system, as shown in fig. 1, including the following steps:
s1, deploying Ceph-RGW, wherein the Ceph provides object storage service for an Internet cloud service provider through the RGW;
s2, deploying Haproxy for realizing load balancing;
and S3, deploying Keepalived on the Haproxy nodes, and then deploying the DNS service to complete the construction of the distributed storage architecture.
By applying the technical scheme, as shown in fig. 2, the main contents are as follows:
1. Ceph-RGW deployment
RGW is short for RADOS Gateway; Ceph provides object storage services to Internet cloud service providers through RGW. RGW offers applications a REST API on top of librados for accessing the Ceph cluster, supporting both the Amazon S3 and OpenStack Swift interfaces. RGW is a protocol conversion layer: it converts S3- or Swift-compliant requests from upper-layer applications into RADOS requests and stores the data in the RADOS cluster.
As shown in fig. 3, the flow by which an application accesses the cluster through RGW is as follows:
(1) An S3 or Swift application sends a request to RGW via the HTTP protocol.
(2) RGW parses the S3 or Swift protocol data from the HTTP request, calls the librados interface, and sends the corresponding request to the RADOS cluster.
On the RGW nodes, the RGW service parameter rgw_dns_name in /etc/ceph/ceph.conf needs to be configured to support S3 path-style bucket operations, and the optimized RGW configuration parameters are filled into the global section of /etc/ceph/ceph.conf on the RGW service nodes, as sketched below:
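A minimal illustrative sketch of such /etc/ceph/ceph.conf entries follows; the section name client.rgw.rgw1, the reuse of the domain obs.cephs3.com from the DNS section below, and every tuning value are assumptions for illustration, not the patent's actual parameters:

```ini
# Hypothetical /etc/ceph/ceph.conf fragment (values are assumptions)
[client.rgw.rgw1]
rgw_frontends = "beast port=7480"   # frontend that the Haproxy backends point at
rgw_dns_name = obs.cephs3.com       # DNS name used for S3-style bucket addressing

[global]
rgw_thread_pool_size = 512          # more worker threads for high-concurrency backup traffic
rgw_cache_enabled = true            # enable the RGW metadata cache
rgw_cache_lru_size = 100000         # number of cached metadata entries
objecter_inflight_ops = 8192        # allow more in-flight RADOS operations per client
```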
2. haproxy deployment
Haproxy, as load balancing software, can run on most mainstream Linux operating systems. Haproxy provides two load balancing capabilities of L4 (TCP) and L7 (HTTP), and has rich functions.
Core function of Haproxy
(1) Load balancing: l4 and L7 modes support rich load balancing algorithms such as RR/static RR/LC/IP Hash/URI Hash/URL_PARAM Hash/HTTP_HEADER Hash and the like;
(2) Health checks: both TCP and HTTP health check modes are supported;
(3) Session persistence: for application clusters that do not implement session sharing, session persistence can be achieved through Insert Cookie / Rewrite Cookie / Prefix Cookie and the Hash modes listed above;
(4) SSL: Haproxy can terminate the HTTPS protocol, decrypting requests to HTTP before forwarding them to the back end;
(5) HTTP request overwriting and redirection;
(6) Monitoring and statistics: haproxy provides a Web-based statistics page that presents health status and traffic data. Based on this function, the user can develop a monitoring program to monitor the state of Haproxy;
key properties of Haproxy:
Haproxy adopts a single-threaded, event-driven, non-blocking model, which reduces the cost of context switching; it can process hundreds of requests within 1 ms while each session occupies only a few KB of memory. Numerous mature performance optimizations, such as an O(1)-complexity event checker, delayed update techniques, single buffering and zero-copy forwarding, allow Haproxy to consume very little CPU under moderate load. Haproxy also makes heavy use of operating-system features, so it delivers very high performance when processing requests: in general only about 15% of the processing time is spent in Haproxy itself, with the remaining 85% of the work completed in the system kernel. The following deployment operations need to be performed on every Haproxy node.
(1) Log in to each node where Haproxy is to be deployed and execute the installation command; an example is included in the sketch after step (2)A below.
(2) Software configuration
A. Closing firewall on haproxy node
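For steps (1) and (2)A, a minimal sketch is given below; it assumes CentOS/RHEL-style nodes with yum and firewalld, which are assumptions rather than requirements stated in the text:

```bash
yum install -y haproxy        # step (1): run on every node that will host Haproxy

systemctl stop firewalld      # step (2)A: close the local firewall on the Haproxy nodes
systemctl disable firewalld
```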
B. Modify the haproxy configuration file /etc/haproxy/haproxy.cfg, for example as sketched below:
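A minimal /etc/haproxy/haproxy.cfg sketch; the backend RGW addresses, ports, timeouts and the statistics port are illustrative assumptions:

```
global
    log         127.0.0.1 local2       # forward logs to the local rsyslog instance
    maxconn     20000
    daemon

defaults
    mode        http
    log         global
    option      httplog
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend rgw_front
    bind *:80                          # address the S3 clients connect to
    default_backend rgw_back

backend rgw_back
    balance roundrobin
    option httpchk GET /               # HTTP health check against each RGW instance
    server rgw1 10.0.12.11:7480 check  # Ceph object gateway instances (addresses assumed)
    server rgw2 10.0.12.12:7480 check

listen stats
    bind *:1080                        # Web statistics page mentioned in the text
    stats enable
    stats uri /haproxy-status
```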
C. Configure the haproxy log service by creating /etc/rsyslog.d/haproxy.conf with content such as the following:
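A hypothetical /etc/rsyslog.d/haproxy.conf; the local2 facility matches the log line in the haproxy.cfg sketch above and is an assumption:

```
$ModLoad imudp                       # accept syslog messages over UDP from haproxy
$UDPServerRun 514
local2.*    /var/log/haproxy.log     # write haproxy logs to a dedicated file
```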
D. restarting haproxy and rsyslog services
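Example commands for steps D and E, assuming systemd-managed services:

```bash
systemctl restart rsyslog haproxy      # apply the new configuration
systemctl status haproxy --no-pager    # confirm the haproxy running state
```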
E. In this embodiment, a browser is used to log in to the haproxy monitoring interface at http://{haproxy_ip} to confirm the running state.
3. Keepalived deployment
Keepalived was originally used to manage and monitor the state of each service node in an LVS cluster system, and VRRP functionality was later added to provide high availability. VRRP (Virtual Router Redundancy Protocol) is designed to solve the single point of failure of static routing and ensures that the whole network keeps running without interruption when individual nodes fail. Therefore, in addition to managing LVS software, Keepalived can also serve as a high-availability solution for other services (e.g. Nginx, Haproxy, MySQL, etc.).
Keepalived has three important functions, namely:
A. managing LVS load balancing software;
B. realizing health check of LVS cluster nodes;
C. providing high availability (failover) for system network services.
The distributed storage architecture constructed in this embodiment is shown in fig. 4, with two haproxy nodes set up:
(1) Install the Keepalived software package on both haproxy nodes.
(2) Create the script directory on both haproxy nodes.
(3) Place status_haproxy.sh and haproxy_master.sh under /var/lib/keepalived/scripts/ and modify their permissions to be executable; a hypothetical check script is sketched below.
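The text only names status_haproxy.sh; the body below is a hypothetical minimal health-check script that exits non-zero when haproxy is down so that Keepalived can release the VIP:

```bash
#!/bin/bash
# status_haproxy.sh (hypothetical): return 0 while haproxy is alive, 1 otherwise
if ! systemctl is-active --quiet haproxy; then
    exit 1    # non-zero exit makes keepalived's vrrp_script check fail and triggers failover
fi
exit 0
```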
(4) Modify /etc/keepalived/keepalived.conf on the first haproxy node.
(5) Modify /etc/keepalived/keepalived.conf on the second haproxy node; a hypothetical configuration for both nodes is sketched below.
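A hypothetical /etc/keepalived/keepalived.conf for the first haproxy node; the interface name, virtual_router_id, priorities and the VIP 10.0.12.100 are assumptions. On the second node only the state and priority change, as noted in the comments:

```
global_defs {
    router_id haproxy01
}

vrrp_script chk_haproxy {
    script "/var/lib/keepalived/scripts/status_haproxy.sh"
    interval 2
    weight -20                      # demote this node if the check fails
}

vrrp_instance VI_1 {
    state MASTER                    # on the second node: state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100                    # on the second node: a lower value, e.g. 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ceph_s3
    }
    virtual_ipaddress {
        10.0.12.100/24              # the VIP that obs.cephs3.com resolves to (assumed)
    }
    track_script {
        chk_haproxy
    }
    notify_master "/var/lib/keepalived/scripts/haproxy_master.sh"
}
```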
(6) Start the Keepalived service on all nodes.
(7) Find the node that currently holds the VIP and restart its Keepalived service to check the migration of the VIP.
(8) Stop the haproxy service on the node where the VIP is located to see whether the VIP switches over. Example commands for steps (6)-(8) are sketched below.
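Example commands for steps (6)-(8), assuming the interface and VIP from the keepalived sketch above:

```bash
systemctl enable --now keepalived       # (6) start keepalived on every haproxy node
ip addr show eth0 | grep 10.0.12.100    # (7) check which node currently holds the VIP
systemctl stop haproxy                  # (8) on the VIP holder: stop haproxy and watch the VIP move
```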
4. DNS service deployment
This scheme uses bind to provide the DNS resolution service. If a DNS service already exists in the data center, the domain name resolution is configured on that service instead.
(1) Software installation
(2) Configuration firewall
This scheme is illustrated using obs.cephs3.com as the DNS name of the S3 service. A master DNS server and a slave DNS server are built, and the firewall is closed on both the master and slave DNS nodes; for example:
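A sketch of the installation and firewall steps, again assuming CentOS/RHEL-style DNS nodes (an assumption):

```bash
yum install -y bind bind-utils    # (1) install the bind name server on the master and slave DNS nodes
systemctl stop firewalld          # (2) close the firewall on both DNS nodes
systemctl disable firewalld
```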
(3) Modifying configuration files
A. Modify the bind configuration file /etc/named.conf on both the master and slave DNS servers, for example:
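A hypothetical fragment of /etc/named.conf; the listen addresses and the allowed client network are assumptions:

```
options {
    listen-on port 53 { any; };          // answer queries on all interfaces
    allow-query     { 10.0.12.0/24; };   // clients allowed to resolve obs.cephs3.com (assumed network)
    recursion yes;
};
```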
B. Create the zone record file /var/named/cephs3.com.zone on the master and slave servers respectively, for example:
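A hypothetical /var/named/cephs3.com.zone; the serial number, name-server addresses and the VIP are assumptions:

```
$TTL 86400
@       IN SOA  ns1.cephs3.com. admin.cephs3.com. (
                2023101601 ; serial
                3600       ; refresh
                900        ; retry
                604800     ; expire
                86400 )    ; minimum
        IN NS   ns1.cephs3.com.
        IN NS   ns2.cephs3.com.
ns1     IN A    10.0.12.21
ns2     IN A    10.0.12.22
obs     IN A    10.0.12.100     ; the keepalived VIP in front of haproxy
*.obs   IN A    10.0.12.100     ; wildcard for S3 virtual-hosted bucket names
```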
C. Create the PTR (reverse) zone file /var/named/12.0.10.in-addr.arpa.zone on the master and slave servers respectively, for example:
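A hypothetical /var/named/12.0.10.in-addr.arpa.zone with reverse records for the assumed 10.0.12.0/24 network:

```
$TTL 86400
@       IN SOA  ns1.cephs3.com. admin.cephs3.com. (
                2023101601 3600 900 604800 86400 )
        IN NS   ns1.cephs3.com.
21      IN PTR  ns1.cephs3.com.
22      IN PTR  ns2.cephs3.com.
100     IN PTR  obs.cephs3.com.    ; VIP -> service name
```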
D. Add the corresponding zone definitions to the bind configuration file /etc/named.rfc1912.zones on the master DNS server.
E. Add the corresponding zone definitions to /etc/named.rfc1912.zones on the slave DNS server. A hypothetical example covering both is sketched below:
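A hypothetical pair of additions to /etc/named.rfc1912.zones; the master and slave addresses are assumptions:

```
// On the master DNS server:
zone "cephs3.com" IN {
    type master;
    file "cephs3.com.zone";
    allow-transfer { 10.0.12.22; };     // slave DNS address (assumed)
};
zone "12.0.10.in-addr.arpa" IN {
    type master;
    file "12.0.10.in-addr.arpa.zone";
    allow-transfer { 10.0.12.22; };
};

// On the slave DNS server:
zone "cephs3.com" IN {
    type slave;
    file "slaves/cephs3.com.zone";
    masters { 10.0.12.21; };            // master DNS address (assumed)
};
zone "12.0.10.in-addr.arpa" IN {
    type slave;
    file "slaves/12.0.10.in-addr.arpa.zone";
    masters { 10.0.12.21; };
};
```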
F. Restart the bind (named) service on both the master and slave DNS servers.
On each client that needs to access the S3 service via obs.cephs3.com, configure the two DNS servers as its name servers, for example:
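A sketch of the client-side configuration; the name-server addresses are the assumed DNS nodes from the zone files above:

```bash
cat >> /etc/resolv.conf <<'EOF'
nameserver 10.0.12.21
nameserver 10.0.12.22
EOF
dig obs.cephs3.com +short    # should return the keepalived VIP fronting the S3 service
```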
5. Result verification
In order to verify the validity of the distributed storage architecture built by the scheme, the embodiment performs a corresponding verification process:
(1) Use the S3 Browser tool, configured with the S3 domain name address under test, to perform bucket creation and object upload/deletion operations, and check whether they succeed.
(2) While uploading files with S3 Browser, stop one of the Ceph object gateway (RGW) services and check whether the upload task continues normally.
(3) While uploading files with S3 Browser, stop one of the haproxy services and check whether the upload task continues normally.
(4) With all services started, use the COSBench tool to test object storage performance and run a 72-hour long-term stability test to verify service stability.
In summary, this scheme uses Keepalived to achieve high availability of Haproxy and uses Haproxy to achieve high availability and load balancing of the Ceph object storage gateway service. It adopts a distributed network storage architecture in which multiple storage servers share the storage load and location servers locate the stored information, which not only improves the reliability, availability and access efficiency of the system but also makes it easy to expand. Built on the Ceph object storage gateway together with open-source software such as Keepalived, bind and Haproxy, the scheme provides a highly available object storage service and can effectively optimize high-load backup and archiving scenarios.
Claims (10)
1. The distributed storage architecture for the high-load backup system is characterized by comprising a Ceph-RGW storage system, wherein the Ceph-RGW storage system is respectively and interactively connected with a plurality of Haproxy nodes, the Haproxy nodes are respectively and interactively connected with an application end, the application end is connected with a DNS server, the Ceph-RGW storage system is used for converting and then performing distributed storage on a request sent by the application end, the Haproxy nodes are used for realizing load balancing, and the DNS server is used for performing domain name resolution.
2. The distributed storage architecture for a high-load backup system according to claim 1, wherein the Ceph-RGW storage system comprises a Ceph storage cluster and a plurality of RGW nodes, the RGW nodes are respectively and interactively connected with a plurality of Haproxy nodes, the plurality of RGW nodes are respectively and interactively connected with the Ceph storage cluster, so as to parse preset protocol data from a request sent by an application end, and then send the corresponding request to the Ceph storage cluster.
3. The distributed storage architecture for a high load backup system of claim 2, wherein the Ceph storage cluster is provided with a plurality of MON nodes for monitoring health of itself and other components in the cluster and OSD nodes for implementing object storage processes.
4. A distributed storage deployment method for a high load backup system, comprising the steps of:
s1, deploying Ceph-RGW, wherein the Ceph provides object storage service for an Internet cloud service provider through the RGW;
s2, deploying Haproxy for realizing load balancing;
and S3, deploying Keepalived on the Haproxy node, and then deploying DNS service to complete the construction of the distributed storage architecture.
5. The distributed storage deployment method for a high-load backup system according to claim 4, wherein the specific process of step S1 is:
s11, deploying a plurality of RGW nodes, and setting a Ceph storage cluster;
s12, respectively and interactively connecting the RGW nodes with the Ceph storage cluster.
6. The distributed storage deployment method for a high-load backup system according to claim 5, wherein the RGW node is provided with a RestAPI for accessing a Ceph storage cluster to convert a request of an upper layer application conforming to a preset protocol into a request of rados, and then store the data in the rados cluster.
7. The method according to claim 5, wherein the step S2 is specific to deploying a plurality of Haproxy nodes, each Haproxy node being interactively connected to a plurality of RGW nodes.
8. The distributed storage deployment method for a high-load backup system according to claim 7, wherein the process of deploying a Haproxy node in step S2 is:
logging in nodes needing to be deployed with Haproxy to execute the installation command respectively;
performing a software configuration operation: closing the firewall on the Haproxy node;
modifying the Haproxy configuration file /etc/haproxy/haproxy.cfg;
configuring the Haproxy log service and creating /etc/rsyslog.d/haproxy.conf;
restarting the Haproxy and rsyslog services;
and checking and confirming the Haproxy running state.
9. The distributed storage deployment method for the high-load backup system according to claim 8, wherein the specific process of deploying Keepalived on the Haproxy nodes in step S3 is as follows:
installing the Keepalived software package on each Haproxy node;
creating a script directory on each Haproxy node;
placing status_haproxy.sh and haproxy_master.sh under /var/lib/keepalived/scripts/ and modifying their permissions to be executable;
modifying /etc/keepalived/keepalived.conf on one of the Haproxy nodes;
modifying /etc/keepalived/keepalived.conf on the remaining Haproxy nodes;
performing a validation check after the Keepalived service is started on all Haproxy nodes.
10. The distributed storage deployment method for a high-load backup system according to claim 9, wherein the specific procedure of deploying DNS service in step S3 is as follows:
(1) Installing software;
(2) Configuring a firewall;
(3) Modifying the configuration file:
modifying the bind configuration file /etc/named.conf on the master and slave DNS servers;
creating the zone record file /var/named/cephs3.com.zone on the master server and the slave server, respectively;
creating the PTR zone file /var/named/12.0.10.in-addr.arpa.zone on the master server and the slave server, respectively;
adding the zone definitions to the bind configuration file /etc/named.rfc1912.zones of the primary DNS server;
adding the zone definitions to the bind configuration file /etc/named.rfc1912.zones of the slave DNS server;
restarting the bind service on the master and slave DNS servers;
configuring the two DNS servers on each client that needs to access the S3 service using obs.cephs3.com.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311333462.0A CN117539383A (en) | 2023-10-16 | 2023-10-16 | Distributed storage architecture for high-load backup system and deployment method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311333462.0A CN117539383A (en) | 2023-10-16 | 2023-10-16 | Distributed storage architecture for high-load backup system and deployment method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117539383A true CN117539383A (en) | 2024-02-09 |
Family
ID=89794684
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311333462.0A Pending CN117539383A (en) | 2023-10-16 | 2023-10-16 | Distributed storage architecture for high-load backup system and deployment method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117539383A (en) |
- 2023-10-16: application CN202311333462.0A filed in China; published as CN117539383A (status: pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8819383B1 (en) | Non-disruptive realignment of virtual data | |
JP6219420B2 (en) | Configuring an object storage system for input / output operations | |
JP6208207B2 (en) | A computer system that accesses an object storage system | |
US9645764B2 (en) | Techniques for migrating active I/O connections with migrating servers and clients | |
US9058119B1 (en) | Efficient data migration | |
RU2302034C9 (en) | Multi-protocol data storage device realizing integrated support of file access and block access protocols | |
US9128626B2 (en) | Distributed virtual storage cloud architecture and a method thereof | |
US20050210074A1 (en) | Inter-server dynamic transfer method for virtual file servers | |
IES20080508A2 (en) | Network distributed file system | |
WO2005106716A1 (en) | Systems and methods for providing a proxy for a shared file system | |
US8086840B2 (en) | Apparatus, system, and method for improving user boot via a storage area network | |
KR102376152B1 (en) | Apparatus and method for providing storage for providing cloud services | |
US9674312B2 (en) | Dynamic protocol selection | |
US20210334235A1 (en) | Systems and methods for configuring, creating, and modifying parallel file systems | |
US8838768B2 (en) | Computer system and disk sharing method used thereby | |
Koutoupis | The lustre distributed filesystem | |
US20230367677A1 (en) | Using ephemeral storage as backing storage for journaling by a virtual storage system | |
US7756832B1 (en) | Apparatus and method for providing upgrade compatibility | |
CN105022779A (en) | Method for realizing HDFS file access by utilizing Filesystem API | |
US11966370B1 (en) | Pseudo-local multi-service enabled file systems using a locally-addressable secure compute layer | |
CN117539383A (en) | Distributed storage architecture for high-load backup system and deployment method thereof | |
US20190037013A1 (en) | Methods for managing workload throughput in a storage system and devices thereof | |
US10209923B2 (en) | Coalescing configuration engine, coalescing configuration tool and file system for storage system | |
US20190332293A1 (en) | Methods for managing group objects with different service level objectives for an application and devices thereof | |
WO2023246241A1 (en) | Data processing system, data processing method and apparatus, and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||