CN113220519B - High-availability host system
- Publication number
- CN113220519B (application CN202110591338.9A)
- Authority
- CN
- China
- Prior art keywords
- host
- coupler
- physical
- physical host
- host module
- Prior art date
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/26—Functional testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/04—Generating or distributing clock signals or signals derived directly therefrom
- G06F1/06—Clock generators producing several clock signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4027—Coupling between buses using bus bridges
- G06F13/4031—Coupling between buses using bus bridges with arbitration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
Abstract
The invention discloses a high-availability host system comprising a first host module and a second host module arranged in different buildings. The first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers arranged in a pre-established physical corridor and in a vertical shaft, and the first and second host modules together form one parallel coupling body. If either host module fails, the host module that has not failed takes over the service operation of the failed module. With this scheme, when either host module fails there is no need to build multiple parallel coupling environments in different buildings to carry the business workload; the surviving host module simply takes over the failed module's work, which reduces operation and maintenance costs and meets high-availability operation requirements in failure scenarios.
Description
Technical Field
The present invention relates to the field of International Business Machines (IBM) Z mainframe deployment technology, and more particularly to a high-availability host system.
Background
In recent years, with the rapid development of Internet technology, high-availability techniques have been applied ever more widely, high-availability clusters have grown in scale, and the number of hosts per cluster has risen from a handful to tens or even hundreds.
To achieve service continuity, conventional hosts are generally deployed for high availability using virtualization, parallel computing, load balancing, and similar technologies. In a typical high-availability deployment, the hardware of all host platforms is placed in a single host module, or in host modules located adjacent to each other in the same building. Such a deployment, however, cannot cope with failure scenarios that cannot be repaired in the short term, such as a building-wide power outage or the collapse of the building, so the high-availability operation requirement cannot be met. Although service continuity could be supported by building multiple parallel coupling environments across several buildings, the additional purchases of host-platform hardware and software drive operation and maintenance costs up.
Existing high-availability host deployment schemes therefore either fail to meet high-availability operation requirements in such failure scenarios or incur high operation and maintenance costs.
Disclosure of Invention
The embodiment of the invention discloses a high-availability host system that reduces operation and maintenance costs while meeting high-availability operation requirements in failure scenarios.
To achieve the above purpose, the disclosed technical scheme is as follows:
the first aspect of the present invention discloses a high availability host system comprising a first host module and a second host module disposed in different buildings;
the first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers arranged in a pre-established physical corridor and in a vertical shaft; the first host module and the second host module form a parallel coupling body, and the two modules are deployed in two buildings separated by a distance X, where X is at most 150 meters;
if any one of the first host module and the second host module fails, the host module which does not fail takes over the service operation work of the failed host module.
Preferably, the first host module adopts a preset deployment mode and is data-interconnected with the second host module through a pre-established physical corridor and a physical optical fiber arranged in a vertical shaft, as follows:
the first host module adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module through physical optical fibers arranged in a pre-established physical corridor;
and the first host module adopts a vertical shaft deployment mode and is in data interconnection with the second host module through a physical optical fiber arranged in a pre-established vertical shaft.
Preferably, the first host module comprises a first coupler, a first physical host, a third physical host, a K2 disk, a master clock server and a disk storage production master;
the master clock server is arranged on the first coupler;
the first coupler is respectively connected with the first physical host and the third physical host;
the K2 disk is connected with the first physical host;
the disk storage production master is respectively connected with the first physical host and the third physical host.
Preferably, the second host module comprises a second coupler, a second physical host, a fourth physical host, a K1 disk, a standby clock server, an arbitration clock server and a disk storage production slave disk;
the standby clock server is arranged on the second coupler;
the arbitration clock server is arranged on the fourth physical host;
the second coupler is respectively connected with the second physical host and the fourth physical host;
the K1 magnetic disk is connected with the second physical host;
the disk storage production slave disk is connected to the second physical host and the fourth physical host, respectively.
Preferably, the first coupler is connected with the second coupler, and if the first coupler fails, the second coupler takes over the service operation work of the first coupler;
the first coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the second coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production master is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production slave disk is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the master clock server and the standby clock server are subjected to data interconnection, and if the master clock server fails, the arbitration clock server sends a failure reminding instruction to the standby clock server so that the standby clock server takes over the service operation work of the master clock server;
and the disk storage production master disk is connected with the disk storage production slave disk through a disk mirror image, and if the disk storage production master disk fails, the disk storage production slave disk takes over the service operation work of the disk storage production master disk.
Preferably, the type structures of the first coupler and the second coupler are LOCK type structures, a database global LOCK DB2 LOCK1 and a database component DB2 SCA are arranged in the first coupler, and a global LOCK GRS is arranged in the second coupler.
Preferably, the type structures of the first coupler and the second coupler are cache type structures, and the first coupler and the second coupler set up a plurality of database global buffer pools in an asynchronous duplex mode.
Preferably, the type structures of the first coupler and the second coupler are list type structures, the first coupler sets list type structures IXCSTR1, IXCSTR3 and IXCSTR5 in a symmetrical deployment mode, and the second coupler sets list type structures IXCSTR2, IXCSTR4 and IXCSTR6 in a symmetrical deployment mode.
Preferably, a disk mirror hot-switching control system GDPS K2 is disposed in the first physical host.
Preferably, a disk mirror hot-switching control system GDPS K1 is disposed in the second physical host.
According to the technical scheme, the system comprises a first host module and a second host module arranged in different buildings. The first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers arranged in a pre-established physical corridor and in a vertical shaft, and the two host modules form one parallel coupling body. If either host module fails, the host module that has not failed takes over the service operation of the failed module. With this scheme, there is no need to build multiple parallel coupling environments in different buildings to carry the business workload; the surviving host module simply takes over the failed module's work, which reduces operation and maintenance costs and meets high-availability operation requirements in failure scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a high availability host system according to an embodiment of the present invention;
FIG. 2 is a schematic deployment diagram of a first host module and a second host module between buildings according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another high availability host system according to an embodiment of the present invention;
FIG. 4 is a schematic deployment diagram of a first coupler and a second coupler according to the type structure of the couplers disclosed in the embodiment of the invention;
FIG. 5 is a schematic diagram of another embodiment of a high availability host system.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
As noted in the Background, existing high-availability host deployment schemes cannot meet high-availability operation requirements in failure scenarios that cannot be repaired in the short term, such as a building-wide power outage or a building collapse, and their operation and maintenance costs are high.
To this end, the embodiment of the invention discloses a high-availability host system: when either the first or the second host module fails, there is no need to build multiple sets of parallel coupling environments in different buildings to carry the business workload; the host module that has not failed simply takes over the service operation of the failed module, reducing operation and maintenance costs while meeting high-availability operation requirements in failure scenarios. The following embodiments illustrate the specific implementation.
As shown in fig. 1, which is a schematic structural diagram of a high-availability host system according to an embodiment of the present invention, the system includes a first host module 1 and a second host module 2 arranged in different buildings. The two modules are deployed in two buildings separated by a distance X of at most 150 meters, with the first host module 1 in a first building 3 and the second host module 2 in a second building 4.
The connection relationship between the first host module 1 and the second host module 2 and the data interaction process between them are as follows:
the first host module 1 adopts a preset deployment mode, and is in data interconnection with the second host module 2 through a physical corridor and physical optical fibers arranged in a vertical shaft, and the first host module 1 and the second host module 2 form a Parallel coupling body (parallelSYPLEX).
The preset deployment modes are a corridor top deployment mode, a corridor bottom deployment mode and a vertical shaft deployment mode.
If any one of the first host module 1 and the second host module 2 fails, the host module that does not fail takes over the service operation work of the failed host module.
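To make the takeover behavior concrete, the following is a minimal Python sketch under assumed names (HostModule, failover, and the sample workloads are illustrative, not part of the patent): each host module carries a set of business workloads, and when one module is detected as failed, the surviving module absorbs its workloads.

```python
# Minimal sketch (assumed names): two host modules form one parallel
# coupling body; if one fails, the survivor takes over its workloads.

class HostModule:
    def __init__(self, name, services):
        self.name = name
        self.services = list(services)  # business workloads this module runs
        self.alive = True

    def take_over(self, failed):
        # The non-failed module absorbs the failed module's workloads.
        self.services.extend(failed.services)
        failed.services.clear()
        print(f"{self.name} took over the services of {failed.name}")

def failover(module_a, module_b):
    # Heartbeat-style check: a down module hands its work to its peer.
    for down, peer in ((module_a, module_b), (module_b, module_a)):
        if not down.alive and peer.alive and down.services:
            peer.take_over(down)

m1 = HostModule("first host module", ["account query"])
m2 = HostModule("second host module", ["account update"])
m1.alive = False   # simulate a building-level failure in building 1
failover(m1, m2)   # the second host module now runs both workloads
```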
The first host module 1 includes a first coupler 11, a first physical host 12, a third physical host 13, a K2 disk, a master clock server 14 (PTS), and a disk storage production master 15.
The connection relationship of the first coupler 11, the first physical host 12, the third physical host 13, the K2 disk, the master clock server 14, and the disk storage production master 15 is as follows:
in the first host module 1, the first coupler 11 is data-connected to the first physical host 12 and the third physical host 13, respectively.
The first physical host 12 is connected with the K2 disk. A disk mirror hot-switching control system GDPS K2 is disposed in the first physical host 12 and is data-connected to the K2 disk.
The z/OS operating system, DB2 database, and middleware CICS need to be installed on the logical partition of the first physical host 12.
The z/OS operating system, DB2 database and middleware CICS need to be installed on the logical partition of the third physical host 13.
The master clock server 14 is provided in the first coupler 11.
The disk storage production master 15 is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The second host module 2 includes a second coupler 21, a second physical host 22, a fourth physical host 23, a K1 disk, a standby clock server 24 (BTS), an arbitration clock server 25, and a disk storage production slave disk 26.
To increase the availability of the clock service required by the parallel coupling body, the master clock server 14 is arranged on the first coupler 11 in the first building 3, the standby clock server 24 is arranged on the second coupler 21 in the second building 4, and the arbitration clock server 25 is arranged on the fourth physical host 23. Communication between the inter-building clock servers is realized over the same physical connections as those between the inter-building host modules.
The second physical host 22 is connected with the K1 disk. A disk mirror hot-switching control system GDPS K1 is disposed in the second physical host 22 and is data-connected to the K1 disk.
The logical partition of the second physical host 22 needs to have installed thereon the z/OS operating system, the DB2 database, and the middleware CICS.
The standby clock server 24 is provided in the second coupler 21.
The arbitration clock server 25 is provided in the fourth physical host 23.
The connection relationship of the second coupler 21, the second physical host 22, the fourth physical host 23, the standby clock server 24, the arbitration clock server 25, and the disk storage production slave disk 26 is as follows:
in the second host module 2, the second coupler 21 is data-connected to the second physical host 22 and the fourth physical host 23, respectively.
The z/OS operating system, DB2 database, and middleware CICS need to be installed on the logical partition of the fourth physical host 23.
The disk storage production slave disk 26 is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The mutually backing-up first coupler 11 and second coupler 21 are arranged in the host modules of the two different buildings. The structures (STRUCTURE) used for cross-logical-partition data interaction within the parallel coupling body are then distributed evenly across the two cross-building couplers according to load and function, solving the problem that the physical hardware of the parallel coupling body becomes unavailable in a building-level failure scenario.
The disk storage production master 15 and the disk storage production slave 26 are deployed in two different buildings, preventing all disk storage devices from becoming unavailable when a building-level failure occurs. I/O and data access services are normally provided externally by the disk storage production master 15, which is kept in a live mirrored state with the disk storage production slave 26. When the disk storage production master 15 fails, or the building housing it fails, the disk mirror hot-switching system automatically switches from the master 15 to the slave 26, and the disk storage production slave 26 continues to provide data access services externally.
The connection relationship and the data interaction process between each device in the first host module 1 and each device in the second host module 2 are as follows:
the first coupler 11 is connected to the second coupler 21, and if the first coupler 11 fails, the second coupler 21 takes over the service operation of the first coupler 11.
The first coupler 11 is connected to a first physical host 12, a second physical host 22, a third physical host 13, and a fourth physical host 23, respectively.
The second coupler 21 is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
When the type structures of the first coupler 11 and the second coupler 21 are both LOCK type structures, the database global LOCK DB2 LOCK1 and the database component DB2 SCA are set in the first coupler 11, and the operating system global LOCK (global resource serialization, GRS) is set in the second coupler 21.
When the type structures of the first coupler 11 and the second coupler 21 are both cache type structures, the first coupler 11 and the second coupler 21 set up several database global buffer pools (DB2 Global Buffer Pool, DB2 GBP), such as DB2 GBP1, DB2 GBP2, and DB2 GBP3, in asynchronous duplex (Asynchronous Duplex) mode.
When the type structures of the first coupler 11 and the second coupler 21 are both list type structures, the first coupler 11 sets list type structures IXCSTR1, IXCSTR3, and IXCSTR5 in a symmetrical deployment manner, and the second coupler 21 sets list type structures IXCSTR2, IXCSTR4, and IXCSTR6 in a symmetrical deployment manner.
The disk storage production master 15 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The disk storage production slave disk 26 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The standby clock server 24 is connected to the arbitration clock server 25.
The master clock server 14 and the standby clock server 24 are in data interconnection, if the master clock server 14 fails, the arbitration clock server 25 sends a failure reminding instruction to the standby clock server 24, so that the standby clock server 24 takes over the service operation work of the master clock server 14.
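The clock takeover flow can be illustrated with a simplified sketch (assumed names; the actual negotiation between the clock servers is more involved). The point shown is that the arbitration server participates in the decision, so the standby is only promoted when the master is genuinely down.

```python
# Sketch (assumed names): if the master clock (PTS) fails, the
# arbitration server instructs the standby clock (BTS) to take over
# timing for the parallel coupling body.

class ClockServer:
    def __init__(self, role):
        self.role = role      # "PTS" (master), "BTS" (standby), "ARB"
        self.healthy = True

def arbitrate(pts, bts, arb):
    # Standby and arbitration server decide jointly, so a single lost
    # heartbeat between PTS and BTS alone does not trigger a takeover.
    if not pts.healthy and arb.healthy and bts.healthy:
        bts.role = "PTS"      # the standby is promoted to master clock
        print("BTS promoted: now providing the sysplex clock service")

pts, bts, arb = ClockServer("PTS"), ClockServer("BTS"), ClockServer("ARB")
pts.healthy = False           # simulate a failure of building 1
arbitrate(pts, bts, arb)
```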
The disk storage production master 15 is connected with the disk storage production slave 26 through a disk mirror image, and if the disk storage production master 15 fails, the disk storage production slave 26 takes over the service operation of the disk storage production master 15.
This scheme solves the availability problem of host modules in failure scenarios without increasing the purchase quantity of infrastructure hardware and software beyond that of the existing host modules, further improving system availability and raising the operation and maintenance service level of the host-platform IT infrastructure at minimal cost.
In the embodiment of the invention, when either the first or the second host module fails, there is no need to build multiple parallel coupling environments in different buildings to carry the business workload; the host module that has not failed simply takes over the service operation of the failed module, reducing operation and maintenance costs and meeting high-availability operation requirements in failure scenarios.
To facilitate understanding of how the first host module 1, using the preset deployment modes, is data-interconnected with the second host module 2 through the pre-established physical corridor and the physical optical fiber arranged in the vertical shaft, the process is described with reference to fig. 2:
the first host module 1 arranged in the first building 3 adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module 2 arranged in the second building 4 through physical optical fibers arranged in a pre-established physical corridor.
The first host module 1 adopts a vertical shaft deployment mode, and performs data interconnection with the second host module 2 through physical optical fibers arranged in a pre-established vertical shaft.
The vertical shaft deployment mode can be an underground deployment mode for arranging physical optical fibers.
In the embodiment of the invention, to interconnect the hardware and software of host modules in different buildings, a physical connection corridor and a vertical shaft are built between the buildings, and physical optical fibers are laid in both to interconnect the cross-building devices. To ensure efficient and reliable communication between the buildings, the fiber in the connection corridor uses a redundant deployment scheme: fibers are routed separately along the ceiling and the floor of the corridor, and a third fiber path is laid in the vertical shaft between the buildings. Even if the physical connection corridor fails, communication between the buildings' physical devices is thus preserved, ensuring reliable communication between the host modules of different buildings.
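The resulting path redundancy can be sketched as follows (the route names are assumptions for illustration): three independent fiber routes connect the buildings, and inter-building traffic can use any route that remains up, so a corridor failure alone does not break connectivity.

```python
# Sketch (assumed route names): ceiling and floor fibers in the
# corridor plus a third path in the shaft give three independent
# inter-building routes.

routes = {"corridor_ceiling": True, "corridor_floor": True, "shaft": True}

def usable_routes(route_table):
    # Any route still up can carry the inter-building traffic.
    return [name for name, up in route_table.items() if up]

routes["corridor_ceiling"] = False   # e.g. the corridor is damaged
routes["corridor_floor"] = False
print(usable_routes(routes))         # ['shaft'] -- connectivity survives
```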
As shown in fig. 3, the first host module 1 in the first building 3 and the second host module 2 in the second building 4 are installed with the hardware devices required to run the parallel coupling body. The procedure for running these hardware devices and the required logical partitions is as follows:
in fig. 3, each physical host (first physical host 12, second physical host 22, third physical host 13, and fourth physical host 23) is Logically Partitioned (LPAR), each logical partition becomes a minimum working unit that can be independently operated, and each logical partition installs different software according to the provided functions. For example, in order to implement business operation functions such as account inquiry and updating, which are common to banks, a z/OS operating system, a DB2 database, a middleware CICS, and the like need to be installed on each logical partition of each physical host.
The first physical host 12 is provided with a disk image hot-switching control system GDPS K2.
The second physical host 22 is provided with a disk image hot-swap control system GDPS K1.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the first physical host 12.
The logical partition of the second physical host 22 has installed thereon a z/OS operating system, a DB2 database, and a middleware CICS.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the third physical host 13.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the fourth physical host 23.
The key operating roles of the parallel coupling body and the operating-system configuration parameters are then adjusted, including the clock server arrangement, the operating-system logical partition arrangement, the distribution of structures within the couplers, the installation location of the disk mirror hot-switching control system, and the master/slave role assignment of the disk storage devices.
The clock server roles are set as follows: the master clock server 14 (PTS) is arranged in the first building 3, and the standby clock server 24 (BTS) and the arbitration clock server 25 are arranged in the second building 4. With this arrangement, when the first building 3 fails, the standby clock server 24 and the arbitration clock server 25 in the second building 4 negotiate jointly, after which the standby clock server 24 automatically takes over the functions of the master clock server 14, ensuring the clock service of the parallel coupling body.
The master clock server 14 is disposed on the first coupler 11 of the first building 3, the standby clock server 24 is disposed on the second coupler 21 of the second building 4, and the arbitration clock server 25 is disposed on the fourth physical host 23 of the second building 4.
The hardware devices in the two buildings are deployed and installed in a peer-to-peer, mutual-backup manner: the number and types of physical devices deployed in the host modules of the two buildings, and the functions they support, are kept consistent, and device models are kept identical as far as possible. This avoids the situation where slight functional differences between device models degrade mutual-backup compatibility and prevent high-availability switching.
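Such a peer-consistency check can be sketched as follows (the inventory records and model names are hypothetical): the two buildings' device inventories are compared by type and by model before cross-building takeover is relied upon.

```python
# Sketch (hypothetical inventories): verify the two buildings' host
# modules are peers -- same device counts and types, ideally same models.

building_1 = [{"type": "coupler", "model": "M1"},
              {"type": "host", "model": "M1"},
              {"type": "host", "model": "M1"}]
building_2 = [{"type": "coupler", "model": "M1"},
              {"type": "host", "model": "M1"},
              {"type": "host", "model": "M2"}]   # model drift

def peer_check(a, b):
    types_match = sorted(d["type"] for d in a) == sorted(d["type"] for d in b)
    models_match = sorted(d["model"] for d in a) == sorted(d["model"] for d in b)
    return types_match, models_match

types_ok, models_ok = peer_check(building_1, building_2)
print(f"device types match: {types_ok}, device models match: {models_ok}")
```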
If any one of the first host module 1 and the second host module 2 fails, the host module that does not fail takes over the service operation work of the failed host module.
In the embodiment of the invention, the host modules required by the high-availability host system are expanded from a single host module in one building to multiple host modules in buildings at different physical locations. When one building or its host module fails, the host module in the other building can keep the system operating normally, solving the problem that a single-building deployment cannot meet service continuity requirements in a building-level failure scenario.
To facilitate understanding of the structure types described above for the first coupler 11 and the second coupler 21, fig. 4 shows the deployment distribution pattern within the two couplers.
In fig. 4, when the type structures of the first coupler 11 and the second coupler 21 are both LOCK type structures, the database global LOCK DB2 LOCK1 and the database component DB2 SCA are provided in the first coupler 11, and the global LOCK GRS is provided in the second coupler 21.
For the LOCK type structure used by the DB2 database, the global LOCK1 and the SCA component should be deployed in the same coupler to meet DB2's strong data-consistency requirements.
To address the performance problems of LOCK1 and SCA in a cross-building access scenario, the asynchronous duplex deployment mode is enabled, realizing the high-availability requirement. The specific process is as follows:
when the type structures of the first coupler 11 and the second coupler 21 are both cache type structures, the first coupler 11 and the second coupler 21 set a plurality of database global buffer pools such as a database global buffer pool DB2 GBP1, a database global buffer pool DB2 GBP2, a database global buffer pool DB2 GBP3, and the like through an asynchronous duplex mode.
The cache type structures of the first coupler 11 and the second coupler 21 are thus deployed in duplex mode across the buildings of the parallel coupling body, improving access efficiency and availability.
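The essence of asynchronous duplexing can be sketched as follows (assumed names; the real DB2 group buffer pool duplexing protocol is far more elaborate): a write completes as soon as the primary structure acknowledges it, while the copy to the secondary coupler is queued and applied off the commit path.

```python
# Sketch of asynchronous duplexing (assumed names): the primary write
# is synchronous; the cross-building copy is applied in the background.

import queue
import threading

primary, secondary = {}, {}
replication_q = queue.Queue()

def write_page(key, page):
    primary[key] = page             # synchronous write to the primary GBP
    replication_q.put((key, page))  # queue the copy to the other building
    return "ack"                    # the caller does not wait for the copy

def replicator():
    while True:
        key, page = replication_q.get()
        secondary[key] = page       # applied later, off the commit path
        replication_q.task_done()

threading.Thread(target=replicator, daemon=True).start()
write_page("GBP1:page42", b"data")
replication_q.join()                # wait here only for demonstration
print(secondary)                    # {'GBP1:page42': b'data'}
```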
When the type structures of the first coupler 11 and the second coupler 21 are both list type structures, the first coupler 11 sets list type structures IXCSTR1, IXCSTR3, and IXCSTR5 in a symmetrical deployment manner, and the second coupler 21 sets list type structures IXCSTR2, IXCSTR4, and IXCSTR6 in a symmetrical deployment manner.
The list type structures are deployed symmetrically across the two couplers in the different buildings, with an equal number of list structures in each coupler.
In the embodiment of the invention, the deployment distribution within the first coupler 11 and the second coupler 21 is set according to the couplers' different structure types, so as to improve access efficiency and high availability.
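The placement rules described above can be condensed into a sketch. The rules are distilled from the text, and place_structures is a hypothetical helper: LOCK1 and SCA share one coupler with GRS on the other, cache structures are duplexed into both couplers, and list structures alternate so the two couplers carry equal numbers.

```python
# Sketch (hypothetical helper): distribute sysplex structures across
# the two cross-building couplers according to their type.

def place_structures(structures):
    placement = {"coupler1": [], "coupler2": []}
    for i, (name, kind) in enumerate(structures):
        if kind == "lock":
            # DB2 LOCK1 and SCA must share a coupler; GRS goes opposite.
            target = "coupler2" if name == "GRS" else "coupler1"
            placement[target].append(name)
        elif kind == "cache":
            placement["coupler1"].append(name)        # primary copy
            placement["coupler2"].append(name + "*")  # duplexed copy
        elif kind == "list":
            # Alternate list structures so both couplers hold equal numbers.
            placement["coupler1" if i % 2 == 0 else "coupler2"].append(name)
    return placement

structures = [("DB2 LOCK1", "lock"), ("DB2 SCA", "lock"), ("GRS", "lock"),
              ("DB2 GBP1", "cache"),
              ("IXCSTR1", "list"), ("IXCSTR2", "list")]
print(place_structures(structures))
```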
To facilitate understanding of how the disk storage production master 15 and the disk storage production slave 26 are deployed in two different buildings, the embodiment is described with reference to fig. 5, building on fig. 1.
In fig. 5, the disk storage production master 15 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The disk storage production slave disk 26 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The first physical host 12 is provided with the GDPS K2 system, which serves as the GDPS standby control system (alternate master controlling system).
The K2 disk is data-connected to the GDPS K2 system.
The second physical host 22 is provided with the GDPS K1 system, which serves as the GDPS main control system (master controlling system).
The K1 disk is data-connected to the GDPS K1 system.
The operating-system disk volumes of the GDPS K1 and GDPS K2 systems use independent disk storage, separated from the disk bodies of the disk storage production master 15 and the disk storage production slave 26. Decoupling the GDPS K1 and GDPS K2 systems from the production disks in this way improves their availability when the production master or production slave disk fails.
Likewise, the optical fiber connections between the partitions hosting the GDPS K1 and GDPS K2 systems and the disk storage are separated from those of the production system logical partitions, reaching the disk cabinet over different physical connection channels, so that the separation is also realized at the level of the physical optical-fiber connections.
The disk storage production master 15 is connected with the disk storage production slave 26 through a disk mirror image, and if the disk storage production master 15 fails, the disk storage production slave 26 takes over the service operation of the disk storage production master.
In the embodiment of the invention, the disk storage production master disk 15 and the disk storage production slave disk 26 are respectively deployed in two different buildings, so that the unavailability of all disk storage devices is prevented when building level faults (such as building collapse and the like) occur. When the disk storage production master 15 fails or the building where the disk storage production master 15 is located fails, the disk mirror image hot switching system completes automatic switching between the disk storage production master 15 and the disk storage production slave 26, and the disk storage production slave 26 continues to provide data access service to the outside.
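A condensed sketch of the hot-switch behavior follows (assumed names; a real GDPS-controlled switch involves considerably more, such as coordinated I/O redirection): the master serves writes and mirrors them to the slave, and once the master fails, writes are served by the slave, which already holds a mirrored copy.

```python
# Sketch (assumed names): the production master serves I/O and mirrors
# each write to the slave; after a master failure the slave takes over.

class DiskStore:
    def __init__(self, name):
        self.name, self.data, self.healthy = name, {}, True

master = DiskStore("disk storage production master")
slave = DiskStore("disk storage production slave")

def write(key, value):
    active = master if master.healthy else slave  # hot switch on failure
    active.data[key] = value
    if active is master and slave.healthy:
        slave.data[key] = value                   # live mirror copy

write("acct:1001", 500)
master.healthy = False        # simulate a building-1 disk failure
write("acct:1001", 750)       # now served by the promoted slave
print(slave.data)             # {'acct:1001': 750}
```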
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.
Claims (9)
1. A high availability host system, the system comprising a first host module and a second host module disposed in different buildings;
the first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers arranged in a pre-established physical corridor and in a vertical shaft; the first host module and the second host module form a parallel coupling body, and the two modules are deployed in two buildings separated by a distance X, where X is at most 150 meters;
if any one of the first host module and the second host module fails, the host module which does not fail takes over the service operation work of the failed host module;
wherein the first host module adopting a preset deployment mode and performing data interconnection with the second host module through a pre-established physical corridor and a physical optical fiber arranged in a vertical shaft comprises:
the first host module adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module through physical optical fibers arranged in a pre-established physical corridor;
and the first host module adopts a vertical shaft deployment mode and is in data interconnection with the second host module through a physical optical fiber arranged in a pre-established vertical shaft.
2. The system of claim 1, wherein the first host module comprises a first coupler, a first physical host, a third physical host, a K2 disk, a master clock server, and a disk storage production master;
the master clock server is arranged on the first coupler;
the first coupler is respectively connected with the first physical host and the third physical host;
the K2 disk is connected with the first physical host;
the disk storage production master is respectively connected with the first physical host and the third physical host.
3. The system of claim 2, wherein the second host module comprises a second coupler, a second physical host, a fourth physical host, a K1 disk, a standby clock server, an arbitrated clock server, and a disk storage production slave;
the standby clock server is arranged on the second coupler;
the arbitration clock server is arranged on the fourth physical host;
the second coupler is respectively connected with the second physical host and the fourth physical host;
the K1 magnetic disk is connected with the second physical host;
the disk storage production slave disk is connected to the second physical host and the fourth physical host, respectively.
4. A system according to claim 3, wherein the first coupler is connected to the second coupler, and the second coupler takes over service operation of the first coupler if the first coupler fails;
the first coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the second coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production master is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production slave disk is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the master clock server and the standby clock server are subjected to data interconnection, and if the master clock server fails, the arbitration clock server sends a failure reminding instruction to the standby clock server so that the standby clock server takes over the service operation work of the master clock server;
and the disk storage production master disk is connected with the disk storage production slave disk through a disk mirror image, and if the disk storage production master disk fails, the disk storage production slave disk takes over the service operation work of the disk storage production master disk.
5. The system of claim 4, wherein the type structures of the first coupler and the second coupler are LOCK type structures, wherein a database global LOCK DB2 LOCK1 and a database component DB2 SCA are provided in the first coupler, and wherein a global LOCK GRS is provided in the second coupler.
6. The system of claim 4, wherein the type structures of the first coupler and the second coupler are cache type structures, and the first coupler and the second coupler set up a plurality of global buffer pools of the database in an asynchronous duplex mode.
7. The system of claim 4, wherein the type structures of the first coupler and the second coupler are list type structures, wherein the first coupler is configured to set list type structure IXCSTR1, list type structure IXCSTR3, and list type structure IXCSTR5 in a symmetrical deployment manner, and wherein the second coupler is configured to set list type structure IXCSTR2, list type structure IXCSTR4, and list type structure IXCSTR6 in a symmetrical deployment manner.
8. The system of claim 4, wherein a disk mirror hot-swap control system GDPS K2 is provided in the first physical host.
9. The system of claim 4, wherein a disk mirror hot-swap control system GDPS K1 is provided in the second physical host.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110591338.9A (CN113220519B) | 2021-05-28 | 2021-05-28 | High-availability host system
Publications (2)
Publication Number | Publication Date
---|---
CN113220519A | 2021-08-06
CN113220519B | 2024-04-16
Family
ID=77099068
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110591338.9A (CN113220519B, active) | High-availability host system | 2021-05-28 | 2021-05-28
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113220519B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115378464B (en) * | 2022-08-12 | 2023-08-15 | 江苏德是和通信科技有限公司 | Main and standby machine synthesis switching system of transmitter |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0869586A (en) * | 1994-08-30 | 1996-03-12 | Nippon Steel Corp | Building automation system |
JPH10190716A (en) * | 1996-12-24 | 1998-07-21 | Matsushita Electric Works Ltd | Distributed integrated wiring system |
KR20000056861A (en) * | 1999-02-26 | 2000-09-15 | 문성주 | Building automation system |
EP2077500A1 (en) * | 2007-12-28 | 2009-07-08 | Bull S.A.S. | High-availability computer system |
CN102325192A (en) * | 2011-09-30 | 2012-01-18 | 上海宝信软件股份有限公司 | Cloud computing implementation method and system |
CN107592159A (en) * | 2017-09-29 | 2018-01-16 | 深圳达实智能股份有限公司 | A kind of intelligent building optical network system and optical network apparatus |
CN108351823A (en) * | 2015-10-22 | 2018-07-31 | Netapp股份有限公司 | It realizes and automatically switches |
CN108416957A (en) * | 2018-03-20 | 2018-08-17 | 安徽理工大学 | A building security monitoring system |
CN111478450A (en) * | 2020-06-24 | 2020-07-31 | 南京长江都市建筑设计股份有限公司 | Building electrical comprehensive monitoring protection device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10505756B2 (en) * | 2017-02-10 | 2019-12-10 | Johnson Controls Technology Company | Building management system with space graphs |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | GR01 | Patent grant | 