CN113220519B - High-availability host system - Google Patents


Info

Publication number
CN113220519B
Authority
CN
China
Prior art keywords
host
coupler
physical
physical host
host module
Prior art date
Legal status
Active
Application number
CN202110591338.9A
Other languages
Chinese (zh)
Other versions
CN113220519A (en)
Inventor
曹杰瑞
滕腾
Current Assignee
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202110591338.9A priority Critical patent/CN113220519B/en
Publication of CN113220519A publication Critical patent/CN113220519A/en
Application granted granted Critical
Publication of CN113220519B publication Critical patent/CN113220519B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26Functional testing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/06Clock generators producing several clock signals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4004Coupling between buses
    • G06F13/4027Coupling between buses using bus bridges
    • G06F13/4031Coupling between buses using bus bridges with arbitration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/20Administration of product repair or maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses a high-availability host system comprising a first host module and a second host module deployed in different buildings. The first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers laid in a pre-established physical corridor and in a vertical shaft; together, the two host modules form a single parallel coupling body. If either host module fails, the surviving host module takes over the business operations of the failed one. With this scheme, a failure of either host module does not require building multiple parallel coupling environments across different buildings: the surviving module simply takes over the failed module's workload, which reduces operation and maintenance costs while meeting high-availability operation requirements in failure scenarios.

Description

High-availability host system
Technical Field
The present invention relates to the field of International Business Machines (IBM) Z mainframe deployment technology, and more particularly to a high-availability host system.
Background
In recent years, with the rapid development of Internet technology, high-availability techniques have been applied ever more widely, high-availability clusters have grown in size, and the number of hosts per cluster has risen from a handful to tens or even hundreds.
To achieve service continuity, conventional hosts are typically deployed for high availability using virtualization, parallel computing, load balancing, and similar technologies. In a typical deployment, the hardware of all host platforms is placed in a single host module, or in adjacent host modules within the same building. Such a deployment, however, cannot cope with failures that are not repairable in the short term, such as a building-wide power outage or building collapse, and therefore cannot meet high-availability operation requirements. Although service continuity can be supported by building multiple parallel coupling environments across several buildings, the additional purchases of host-platform hardware and software make operation and maintenance costly.
Existing high-availability host deployment schemes therefore either fail to meet high-availability operation requirements under such fault scenarios or incur high operation and maintenance costs.
Disclosure of Invention
The embodiment of the invention discloses a high-availability host system, which realizes the purposes of reducing operation and maintenance cost and meeting high-availability operation requirements in a fault scene.
To achieve the above purpose, the technical scheme disclosed herein is as follows:
the first aspect of the present invention discloses a high availability host system comprising a first host module and a second host module disposed in different buildings;
the first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers laid in a physical corridor and in a vertical shaft; the first host module and the second host module form a parallel coupling body, and the two modules are deployed in two buildings separated by a distance X, where X is at most 150 meters;
if any one of the first host module and the second host module fails, the host module which does not fail takes over the service operation work of the failed host module.
Preferably, the first host module adopts a preset deployment mode, and performs data interconnection with the second host module through a physical corridor established in advance and a physical optical fiber arranged in a vertical shaft, and the method includes:
the first host module adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module through physical optical fibers arranged in a pre-established physical corridor;
and the first host module adopts a vertical shaft deployment mode and is in data interconnection with the second host module through a physical optical fiber arranged in a pre-established vertical shaft.
Preferably, the first host module comprises a first coupler, a first physical host, a third physical host, a K2 disk, a master clock server and a disk storage production master;
the master clock server is arranged on the first coupler;
the first coupler is respectively connected with the first physical host and the third physical host;
the K2 disk is connected with the first physical host;
the disk storage production master is respectively connected with the first physical host and the third physical host.
Preferably, the second host module comprises a second coupler, a second physical host, a fourth physical host, a K1 disk, a standby clock server, an arbitration clock server and a disk storage production slave disk;
the standby clock server is arranged on the second coupler;
the arbitration clock server is arranged on the fourth physical host;
the second coupler is respectively connected with the second physical host and the fourth physical host;
the K1 magnetic disk is connected with the second physical host;
the disk storage production slave disk is connected to the second physical host and the fourth physical host, respectively.
Preferably, the first coupler is connected with the second coupler, and if the first coupler fails, the second coupler takes over the service operation work of the first coupler;
the first coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the second coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production master is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production slave disk is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the master clock server and the standby clock server are subjected to data interconnection, and if the master clock server fails, the arbitration clock server sends a failure reminding instruction to the standby clock server so that the standby clock server takes over the service operation work of the master clock server;
and the disk storage production master disk is connected with the disk storage production slave disk through a disk mirror image, and if the disk storage production master disk fails, the disk storage production slave disk takes over the service operation work of the disk storage production master disk.
Preferably, the first coupler and the second coupler both use LOCK-type structures: a database global lock DB2 LOCK1 and a database component DB2 SCA are placed in the first coupler, and a global lock GRS is placed in the second coupler.
Preferably, the first coupler and the second coupler both use cache-type structures, and they host a plurality of database global buffer pools in asynchronous duplex mode.
Preferably, the first coupler and the second coupler both use list-type structures: in a symmetrical deployment, the first coupler hosts list structures IXCSTR1, IXCSTR3 and IXCSTR5, and the second coupler hosts list structures IXCSTR2, IXCSTR4 and IXCSTR6.
Preferably, a disk mirror hot-switching control system GDPS K2 is disposed in the first physical host.
Preferably, a disk mirror hot-switching control system GDPS K1 is disposed in the second physical host.
According to the technical scheme, the system comprises a first host module and a second host module deployed in different buildings. The first host module adopts a preset deployment mode and is data-interconnected with the second host module through physical optical fibers laid in a pre-established physical corridor and in a vertical shaft, the two modules forming a single parallel coupling body; if either host module fails, the surviving module takes over the business operations of the failed one. With this scheme, a failure of either host module does not require building multiple parallel coupling environments across different buildings: the surviving module takes over the failed module's workload, reducing operation and maintenance costs and meeting high-availability operation requirements in failure scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a high availability host system according to an embodiment of the present invention;
FIG. 2 is a schematic deployment diagram of a first host module and a second host module between buildings according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another high availability host system according to an embodiment of the present invention;
FIG. 4 is a schematic deployment diagram of a first coupler and a second coupler according to the type structure of the couplers disclosed in the embodiment of the invention;
FIG. 5 is a schematic diagram of another embodiment of a high availability host system.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As noted in the Background, existing high-availability host deployment schemes cannot meet high-availability operation requirements under fault scenarios that are not repairable in the short term, such as a building-wide power outage or building collapse, and incur high operation and maintenance costs.
In order to achieve the above-mentioned purpose, the embodiment of the present invention discloses a high-availability host system, when any one of a first host module and a second host module fails, it is unnecessary to build multiple sets of parallel coupling environments in different buildings to execute service operation work, and only the host module that does not fail needs to take over the service operation work of the failed host module, so as to achieve the purpose of reducing operation and maintenance costs and meeting high-availability operation requirements in a failure scenario. The specific implementation is illustrated by the following examples.
As shown in fig. 1, which is a schematic structural diagram of a high-availability host system according to an embodiment of the present invention, the system includes a first host module 1 and a second host module 2 deployed in different buildings. The two modules are deployed in two buildings separated by a distance X of at most 150 meters: the first host module 1 is located in a first building 3, and the second host module 2 in a second building 4.
The connection relationship between the first host module 1 and the second host module 2 and the data interaction process between them are as follows:
the first host module 1 adopts a preset deployment mode, and is in data interconnection with the second host module 2 through a physical corridor and physical optical fibers arranged in a vertical shaft, and the first host module 1 and the second host module 2 form a Parallel coupling body (parallelSYPLEX).
The preset deployment modes are a corridor top deployment mode, a corridor bottom deployment mode and a vertical shaft deployment mode.
If any one of the first host module 1 and the second host module 2 fails, the host module that does not fail takes over the service operation work of the failed host module.
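The takeover behavior described above can be sketched in a few lines of Python. This is a minimal illustrative model only: the class and function names (`HostModule`, `take_over`) are assumptions for the sketch and do not appear in the patent, which describes mainframe hardware rather than software of this kind.

```python
# Illustrative model of module-level takeover: when one host module fails,
# the surviving peer absorbs its business workloads.

class HostModule:
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.workloads = []

    def fail(self):
        self.healthy = False

def take_over(modules):
    """Move workloads from any failed module onto a surviving peer."""
    survivors = [m for m in modules if m.healthy]
    if not survivors:
        raise RuntimeError("no surviving host module")
    for m in modules:
        if not m.healthy and m.workloads:
            survivors[0].workloads.extend(m.workloads)
            m.workloads.clear()
    return survivors[0]

module_a = HostModule("building-1")
module_b = HostModule("building-2")
module_a.workloads = ["account-query", "account-update"]

module_a.fail()                       # building-level failure of module A
survivor = take_over([module_a, module_b])
print(survivor.name, survivor.workloads)
```

The point of the sketch is only that no third environment is needed: the surviving module alone continues all business operations.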
The first host module 1 includes a first coupler 11, a first physical host 12, a third physical host 13, a K2 disk, a master clock server 14 (denoted PTS), and a disk storage production master 15.
The connection relationship of the first coupler 11, the first physical host 12, the third physical host 13, the K2 disk, the master clock server 14, and the disk storage production master 15 is as follows:
in the first host module 1, the first coupler 11 is data-connected to the first physical host 12 and the third physical host 13, respectively.
The first physical host 12 is connected to the K2 disk; a disk mirror hot-switching control system GDPS K2 is deployed in the first physical host 12 and is data-connected to the K2 disk.
The z/OS operating system, DB2 database, and middleware CICS need to be installed on the logical partition of the first physical host 12.
The z/OS operating system, DB2 database and middleware CICS need to be installed on the logical partition of the third physical host 13.
The master clock server 14 is provided in the first coupler 11.
The disk storage production master 15 is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The second host module 2 includes a second coupler 21, a second physical host 22, a fourth physical host 23, a K1 disk, a standby clock server 24 (denoted BTS), an arbitration clock server 25, and a disk storage production slave disk 26.
Here, to increase the availability of the clock servers required by the parallel coupling body, the master clock server 14 is placed on the first coupler 11 in the first building 3, the standby clock server 24 on the second coupler 21 in the second building 4, and the arbitration clock server 25 on the fourth physical host 23. Communication between the inter-building clock servers uses the same mechanism as the physical connections between the inter-building host modules.
The second physical host 22 is connected to the K1 disk; a disk mirror hot-switching control system GDPS K1 is deployed in the second physical host 22 and is data-connected to the K1 disk.
The logical partition of the second physical host 22 needs to have installed thereon the z/OS operating system, the DB2 database, and the middleware CICS.
The standby clock server 24 is provided in the second coupler 21.
The arbitration clock server 25 is provided in the fourth physical host 23.
The connection relationship of the second coupler 21, the second physical host 22, the fourth physical host 23, the standby clock server 24, the arbitration clock server 25, and the disk storage production slave disk 26 is as follows:
in the second host module 2, the second coupler 21 is data-connected to the second physical host 22 and the fourth physical host 23, respectively.
The z/OS operating system, DB2 database, and middleware CICS need to be installed on the logical partition of the fourth physical host 23.
The disk storage production slave disk 26 is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The mutually backed-up first coupler 11 and second coupler 21 are placed in the host modules of the two different buildings; the STRUCTUREs used for cross-logical-partition data interaction in the parallel coupling body are then distributed evenly between the two cross-building couplers according to load and function, which prevents the parallel coupling body's physical hardware from becoming unavailable in a building-level failure scenario.
The disk storage production master 15 and the disk storage production slave 26 are deployed in two different buildings, preventing all disk storage devices from becoming unavailable when a building-level failure occurs. Normally the disk storage production master 15 provides I/O and data-access services externally and is kept in a live mirrored state with the disk storage production slave 26. When the master 15 fails, or the building housing it fails, the disk mirror hot-switching system automatically switches roles between the master 15 and the slave 26, and the slave 26 continues to provide data-access services externally.
The connection relationship and the data interaction process between each device in the first host module 1 and each device in the second host module 2 are as follows:
the first coupler 11 is connected to the second coupler 21, and if the first coupler 11 fails, the second coupler 21 takes over the service operation of the first coupler 11.
The first coupler 11 is connected to a first physical host 12, a second physical host 22, a third physical host 13, and a fourth physical host 23, respectively.
The second coupler is connected to the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
When the first coupler 11 and the second coupler 21 both use LOCK-type structures, the database global lock DB2 LOCK1 and the database component DB2 SCA are placed in the first coupler 11, and the operating-system global lock (Global Resource Serialization, GRS) is placed in the second coupler 21.
When the first coupler 11 and the second coupler 21 both use cache-type structures, the two couplers host a plurality of database global buffer pools (DB2 Global Buffer Pools, DB2 GBPs), such as DB2 GBP1, DB2 GBP2 and DB2 GBP3, in asynchronous duplex (Asynchronous Duplex) mode.
When the first coupler 11 and the second coupler 21 both use list-type structures, in a symmetrical deployment the first coupler 11 hosts list structures IXCSTR1, IXCSTR3 and IXCSTR5, and the second coupler 21 hosts list structures IXCSTR2, IXCSTR4 and IXCSTR6.
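The lock/cache/list placement across the two couplers can be written down as a simple table. The dictionary layout and the `is_symmetric` check below are assumptions made purely for illustration; the structure names follow the text of the patent.

```python
# Illustrative cross-building distribution of parallel-coupling-body
# structures, mirroring the placement described in the text.

COUPLER_1 = {                                     # in the first building
    "lock":  ["DB2_LOCK1", "DB2_SCA"],            # DB2 lock + SCA together
    "cache": ["DB2_GBP1", "DB2_GBP2", "DB2_GBP3"],  # duplexed primaries
    "list":  ["IXCSTR1", "IXCSTR3", "IXCSTR5"],   # odd-numbered list structures
}
COUPLER_2 = {                                     # in the second building
    "lock":  ["GRS"],                              # OS-level global lock
    "cache": ["DB2_GBP1", "DB2_GBP2", "DB2_GBP3"],  # async-duplex secondaries
    "list":  ["IXCSTR2", "IXCSTR4", "IXCSTR6"],   # even-numbered list structures
}

def is_symmetric(cf1, cf2):
    """The list structures should be split evenly between the couplers."""
    return len(cf1["list"]) == len(cf2["list"])

print(is_symmetric(COUPLER_1, COUPLER_2))
```

Note that LOCK1 and SCA are kept in the same coupler, matching the strong-consistency requirement discussed later in the text.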
The disk storage production master 15 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The disk storage production slave disk 26 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The standby clock server 24 is connected to the arbitration clock server 25.
The master clock server 14 and the standby clock server 24 are data-interconnected. If the master clock server 14 fails, the arbitration clock server 25 sends a failure notification to the standby clock server 24, so that the standby clock server 24 takes over the service operation of the master clock server 14.
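The master/standby/arbiter clock interaction just described can be sketched as follows. This is a hedged toy model: the class `ClockServer` and function `arbitrate` are illustrative names, and real mainframe time protocols involve considerably more negotiation than shown here.

```python
# Toy model of clock-server takeover: the arbiter detects that the master
# (PTS) is down and notifies the standby (BTS) to take over.

class ClockServer:
    def __init__(self, role):
        self.role = role              # "PTS", "BTS", or "ARBITER"
        self.alive = True
        self.serving = (role == "PTS")

def arbitrate(pts, bts, arbiter):
    """If the arbiter sees the master down, it promotes the standby."""
    if arbiter.alive and not pts.alive:
        pts.serving = False
        bts.serving = True
        bts.role = "PTS"              # standby assumes the master role
    return bts

pts = ClockServer("PTS")              # master clock server, building 1
bts = ClockServer("BTS")              # standby clock server, building 2
arb = ClockServer("ARBITER")          # arbitration clock server, building 2

pts.alive = False                     # master clock server fails
arbitrate(pts, bts, arb)
print(bts.serving, bts.role)
```

Placing both the standby and the arbiter in the second building is what lets this promotion succeed even when the entire first building is lost.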
The disk storage production master 15 is connected with the disk storage production slave 26 through a disk mirror image, and if the disk storage production master 15 fails, the disk storage production slave 26 takes over the service operation of the disk storage production master 15.
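The master/slave disk behavior can likewise be illustrated with a minimal sketch. The names (`DiskVolume`, `mirror_write`, `hot_switch`) are assumptions for the example; the patent's GDPS-based switching operates at the storage-subsystem level, not in application code.

```python
# Minimal sketch of disk mirroring with hot switch: writes go to both
# volumes; on master failure the mirror pair swaps roles so the slave
# serves I/O from the surviving building.

class DiskVolume:
    def __init__(self, site):
        self.site = site
        self.data = {}
        self.primary = False

def mirror_write(master, slave, key, value):
    """Live mirroring: every write lands on both volumes."""
    master.data[key] = value
    slave.data[key] = value

def hot_switch(master, slave):
    """Swap roles so the slave becomes the serving volume."""
    master.primary, slave.primary = False, True
    return slave

master = DiskVolume("building-1"); master.primary = True
slave = DiskVolume("building-2")
mirror_write(master, slave, "acct:42", 100)

serving = hot_switch(master, slave)   # master's building fails
print(serving.site, serving.data["acct:42"])
```

Because every write was mirrored before the failure, the slave can answer reads with no data loss after the switch.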
This scheme solves the availability problem of host modules in failure scenarios without increasing the purchase of infrastructure hardware or software beyond the existing host modules, further improving system availability and raising the operation and maintenance service level of the host-platform IT infrastructure at minimal cost.
In the embodiment of the invention, when any one of the first host module and the second host module fails, a plurality of parallel coupling body environments are not required to be built in different buildings to execute service operation, and only the host module which does not fail is required to take over the service operation of the host module which fails, so that the purposes of reducing operation and maintenance cost and meeting high-availability operation requirements in a failure scene are realized.
To aid understanding of how the first host module 1, using the preset deployment modes, is data-interconnected with the second host module 2 through the pre-established physical corridor and the physical optical fibers laid in the vertical shaft, the process is described with reference to fig. 2:
the first host module 1 arranged in the first building 3 adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module 2 arranged in the second building 4 through physical optical fibers arranged in a pre-established physical corridor.
The first host module 1 adopts a vertical shaft deployment mode, and performs data interconnection with the second host module 2 through physical optical fibers arranged in a pre-established vertical shaft.
The vertical-shaft deployment mode may also be an underground deployment mode for routing the physical optical fibers.
In the embodiment of the invention, to interconnect the hardware and software of the host modules in different buildings, a physical corridor and a vertical shaft are provided between the buildings, and physical optical fibers are laid in both to interconnect the cross-building devices. To ensure efficient and reliable cross-building communication, the fibers in the corridor use a redundant deployment: fibers are laid along both the ceiling and the floor of the corridor, while a third fiber path is laid in the vertical shaft between the buildings. Thus, even if the physical corridor fails, communication between the buildings' physical devices is preserved, ensuring reliable communication between the host modules in different buildings.
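The three-path redundancy above can be demonstrated with a small path-selection sketch. The route names and the `pick_path` helper are illustrative assumptions; the patent describes physical fiber routes, not a software selector.

```python
# Illustrative selection over the three inter-building fiber routes:
# corridor ceiling, corridor floor, and the vertical shaft.

FIBER_PATHS = ["corridor-ceiling", "corridor-floor", "vertical-shaft"]

def pick_path(path_up):
    """Return the first available route, trying the corridor first."""
    for path in FIBER_PATHS:
        if path_up.get(path, False):
            return path
    raise ConnectionError("no inter-building fiber path available")

# Corridor destroyed: both corridor routes are down, the shaft survives.
status = {"corridor-ceiling": False, "corridor-floor": False,
          "vertical-shaft": True}
print(pick_path(status))
```

A single corridor failure therefore never severs the modules: the shaft route remains as a physically independent third path.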
As shown in fig. 3, in the first host module 1 of the first building 3 and the second host module 2 of the second building 4, which are installed with hardware devices required for running the parallel coupling, a procedure for specifically running the hardware devices and logical partitions required for running the parallel coupling is as follows:
in fig. 3, each physical host (first physical host 12, second physical host 22, third physical host 13, and fourth physical host 23) is Logically Partitioned (LPAR), each logical partition becomes a minimum working unit that can be independently operated, and each logical partition installs different software according to the provided functions. For example, in order to implement business operation functions such as account inquiry and updating, which are common to banks, a z/OS operating system, a DB2 database, a middleware CICS, and the like need to be installed on each logical partition of each physical host.
The first physical host 12 is provided with a disk image hot-switching control system GDPS K2.
The second physical host 22 is provided with a disk image hot-swap control system GDPS K1.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the first physical host 12.
The logical partition of the second physical host 22 has installed thereon a z/OS operating system, a DB2 database, and a middleware CICS.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the third physical host 13.
The z/OS operating system, DB2 database, and middleware CICS are installed on the logical partitions of the fourth physical host 23.
The key operational roles of the parallel coupling body and the operating-system configuration parameters are adjusted accordingly, including the clock-server placement, the operating-system logical-partition placement, the distribution of structures within the couplers, the installation location of the disk mirror hot-switching control system, and the master/slave role assignment of the disk storage devices.
The role of the master clock server 14 (PTS) is assigned so that the master clock server 14 resides in the first building 3 while the standby clock server 24 (BTS) and the arbitration clock server 25 reside in the second building 4. With this arrangement, when the first building 3 fails, the standby clock server 24 and the arbitration clock server 25 in the second building 4 negotiate jointly, after which the standby clock server 24 automatically takes over the functions of the master clock server 14, ensuring the clock service of the parallel coupling body.
The master clock server 14 is disposed on the first coupler 11 of the first building 3, the standby clock server 24 is disposed on the second coupler 21 of the second building 4, and the arbitration clock server 25 is disposed on the fourth physical host 23 of the second building 4.
The hardware devices in the two buildings are deployed and installed as peer mutual backups: the number, types, and supported functions of the physical devices deployed in the host modules of the two buildings are kept consistent, and the device models are kept identical where possible. This avoids the situation where slight functional differences between device models make mutual backup incompatible and prevent a high-availability switch.
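The peer mutual-backup rule above can be sketched as a simple inventory check. This is an illustrative Python sketch; the data layout and function name are assumptions, not part of any real configuration tool.

```python
# Hypothetical inventory check for the peer-deployment rule: both
# buildings should host matching device types and counts so that
# either side can back up the other after a building-level failure.

from collections import Counter

def mutual_backup_ok(building_a, building_b):
    """True when both buildings host the same device types in the same numbers."""
    return (Counter(d["type"] for d in building_a)
            == Counter(d["type"] for d in building_b))

b1 = [{"type": "coupler"}, {"type": "host"}, {"type": "host"}]
b2 = [{"type": "coupler"}, {"type": "host"}, {"type": "host"}]
assert mutual_backup_ok(b1, b2)

b2.pop()  # one physical host missing from building 2
assert not mutual_backup_ok(b1, b2)
```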
If any one of the first host module 1 and the second host module 2 fails, the host module that does not fail takes over the service operation work of the failed host module.
In the embodiment of the invention, the host modules required by the high-availability host system are expanded from a single host module in one building to multiple host modules spread across buildings in different physical locations. When one building or its host module fails, the host module in the other building keeps the system running normally, solving the problem that a single-building deployment cannot meet service-continuity requirements in a building-level failure scenario.
In order to facilitate understanding of the structure types used by the first coupler 11 and the second coupler 21, a deployment distribution pattern within the first coupler 11 and the second coupler 21 is provided, as shown in fig. 4.
In fig. 4, when the type structures of the first coupler 11 and the second coupler 21 are both LOCK type structures, the database global LOCK DB2 LOCK1 and the database component DB2 SCA are provided in the first coupler 11, and the global LOCK GRS is provided in the second coupler 21.
For the LOCK type structure used by the database DB2, the global lock LOCK1 and the SCA component should be deployed in the same coupler to meet DB2's strong data-consistency requirements.
In order to solve the performance problems of LOCK1 and SCA in cross-building access scenarios, an asynchronous duplex deployment mode is enabled to meet the high-availability requirement. The specific arrangement is as follows:
when the type structures of the first coupler 11 and the second coupler 21 are both cache type structures, the first coupler 11 and the second coupler 21 set up a plurality of database global buffer pools, such as the database global buffer pools DB2 GBP1, DB2 GBP2, and DB2 GBP3, through an asynchronous duplex mode.
The cache type structures are deployed in duplex mode in the couplers of the parallel coupling body located in different buildings, so as to improve access efficiency and availability.
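The benefit of the asynchronous duplex mode is that the cross-building round trip stays off the commit path. The sketch below is an illustrative model only, not the actual coupling-facility duplexing protocol; the class and field names are assumptions.

```python
# Sketch of asynchronous duplexing: a write to a global buffer pool
# completes on the primary (local) coupler immediately and is queued
# for the secondary (remote) coupler, which is updated in the
# background rather than synchronously.

from collections import deque

class DuplexedBufferPool:
    def __init__(self):
        self.primary = {}       # structure copy in the local coupler
        self.secondary = {}     # structure copy in the remote coupler
        self.pending = deque()  # updates not yet applied remotely

    def write(self, page, data):
        self.primary[page] = data      # completes without waiting
        self.pending.append((page, data))

    def drain(self):
        """Apply queued updates to the remote copy (background task)."""
        while self.pending:
            page, data = self.pending.popleft()
            self.secondary[page] = data

pool = DuplexedBufferPool()
pool.write("GBP1:page42", b"row-data")
assert pool.primary["GBP1:page42"] == b"row-data"
assert "GBP1:page42" not in pool.secondary   # not yet replicated
pool.drain()
assert pool.secondary["GBP1:page42"] == b"row-data"
```

The trade-off shown here is the standard one: the remote copy briefly lags the local copy, in exchange for local-latency writes across buildings.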
When the type structures of the first coupler 11 and the second coupler 21 are both list type structures, the first coupler 11 sets the list type structures IXCSTR1, IXCSTR3, and IXCSTR5 in a symmetrical deployment manner, and the second coupler 21 sets the list type structures IXCSTR2, IXCSTR4, and IXCSTR6 in a symmetrical deployment manner.
The list type structures are symmetrically deployed across the two couplers in different buildings, with the same number of structures on each side.
In the embodiment of the invention, the deployment distribution modes within the first coupler 11 and the second coupler 21 are set according to the different structure types of the two couplers, so as to improve access efficiency and high availability.
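The three placement rules above (LOCK1 and SCA co-located opposite GRS, cache structures duplexed, list structures split evenly) can be checked mechanically. The following Python sketch is illustrative; the structure names follow the text, but the validation function itself is an assumption, not part of any sysplex tooling.

```python
# Illustrative validation of the structure-placement rules described
# in the text: lock co-location, duplexed caches, symmetric lists.

def placement_ok(coupler1, coupler2):
    # DB2 LOCK1 and SCA must share one coupler; GRS sits in the other.
    lock_ok = ({"DB2 LOCK1", "DB2 SCA"} <= coupler1["lock"]
               and "GRS" in coupler2["lock"])
    # Cache structures are duplexed: both couplers hold the same pools.
    cache_ok = coupler1["cache"] == coupler2["cache"]
    # List structures are deployed symmetrically: equal counts per side.
    list_ok = len(coupler1["list"]) == len(coupler2["list"])
    return lock_ok and cache_ok and list_ok

cf1 = {"lock": {"DB2 LOCK1", "DB2 SCA"},
       "cache": {"GBP1", "GBP2", "GBP3"},
       "list": {"IXCSTR1", "IXCSTR3", "IXCSTR5"}}
cf2 = {"lock": {"GRS"},
       "cache": {"GBP1", "GBP2", "GBP3"},
       "list": {"IXCSTR2", "IXCSTR4", "IXCSTR6"}}
assert placement_ok(cf1, cf2)
```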
To facilitate an understanding of the process by which the disk storage production master 15 and the disk storage production slave 26 are deployed in two different buildings, the above embodiment is described with reference to fig. 5 on the basis of fig. 1.
In fig. 5, the disk storage production host 15 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The disk storage production slave disk 26 is data-interconnected with the first physical host 12, the second physical host 22, the third physical host 13, and the fourth physical host 23, respectively.
The first physical host 12 is provided with a GDPS K2 system, which serves as the GDPS alternate master controlling system.
The K2 disk is data-interconnected with the GDPS K2 system.
The second physical host 22 is provided with a GDPS K1 system, which serves as the GDPS master controlling system.
The K1 disk is data-interconnected with the GDPS K1 system.
The operating system disk volumes of the GDPS K1 system and the GDPS K2 system use independent disk storage, separate from the disk bodies of the disk storage production master 15 and the disk storage production slave 26. This decouples the GDPS K1 and GDPS K2 systems from the production disks and improves their availability when the production master disk or production slave disk fails.
Similarly, the optical connections between the logical partitions where the GDPS K1 and GDPS K2 systems reside and the disk storage are separated from those of the production system logical partitions: they are connected to the disk cabinets through different physical connection channels, so that the separation is also realized at the level of the physical optical fiber connections.
The disk storage production master 15 is connected with the disk storage production slave 26 through disk mirroring, and if the disk storage production master 15 fails, the disk storage production slave 26 takes over its service operation.
In the embodiment of the invention, the disk storage production master disk 15 and the disk storage production slave disk 26 are respectively deployed in two different buildings, so that the unavailability of all disk storage devices is prevented when building level faults (such as building collapse and the like) occur. When the disk storage production master 15 fails or the building where the disk storage production master 15 is located fails, the disk mirror image hot switching system completes automatic switching between the disk storage production master 15 and the disk storage production slave 26, and the disk storage production slave 26 continues to provide data access service to the outside.
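The mirrored takeover above can be sketched as follows. This is an illustrative model under stated assumptions: the real switch is performed by the GDPS disk mirror hot-switching system, and the class and volume names here are hypothetical.

```python
# Sketch of production master/slave disk takeover: writes are mirrored
# to the slave in the other building, and when the master (or its
# building) fails, reads are redirected to the slave so data access
# service continues.

class MirroredStorage:
    def __init__(self):
        self.master = {}
        self.slave = {}
        self.master_alive = True

    def write(self, volume, data):
        if self.master_alive:
            self.master[volume] = data
        self.slave[volume] = data      # mirror copy in the other building

    def read(self, volume):
        # After a master failure, the slave serves all requests.
        src = self.master if self.master_alive else self.slave
        return src[volume]

storage = MirroredStorage()
storage.write("PROD01", b"ledger")
storage.master_alive = False           # master disk or its building fails
assert storage.read("PROD01") == b"ledger"   # slave continues service
```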
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of the present invention.

Claims (9)

1. A high availability host system, the system comprising a first host module and a second host module disposed in different buildings;
the first host module adopts a preset deployment mode, and is in data interconnection with the second host module through a physical corridor and physical optical fibers arranged in a vertical shaft; the first host module and the second host module form a parallel coupling body; the first host module and the second host module are respectively deployed in two buildings separated by a distance X, and the value of X is less than or equal to 150 meters;
if any one of the first host module and the second host module fails, the host module which does not fail takes over the service operation work of the failed host module;
the first host module adopts a preset deployment mode, and performs data interconnection with the second host module through a physical corridor established in advance and a physical optical fiber arranged in a vertical shaft, and the method comprises the following steps:
the first host module adopts a corridor top end deployment mode and a corridor bottom end deployment mode, and performs data interconnection with the second host module through physical optical fibers arranged in a pre-established physical corridor;
and the first host module adopts a vertical shaft deployment mode and is in data interconnection with the second host module through a physical optical fiber arranged in a pre-established vertical shaft.
2. The system of claim 1, wherein the first host module comprises a first coupler, a first physical host, a third physical host, a K2 disk, a master clock server, and a disk storage production master;
the master clock server is arranged on the first coupler;
the first coupler is respectively connected with the first physical host and the third physical host;
the K2 disk is connected with the first physical host;
the disk storage production master is respectively connected with the first physical host and the third physical host.
3. The system of claim 2, wherein the second host module comprises a second coupler, a second physical host, a fourth physical host, a K1 disk, a standby clock server, an arbitrated clock server, and a disk storage production slave;
the standby clock server is arranged on the second coupler;
the arbitration clock server is arranged on the fourth physical host;
the second coupler is respectively connected with the second physical host and the fourth physical host;
the K1 disk is connected with the second physical host;
the disk storage production slave disk is connected to the second physical host and the fourth physical host, respectively.
4. A system according to claim 3, wherein the first coupler is connected to the second coupler, and the second coupler takes over service operation of the first coupler if the first coupler fails;
the first coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the second coupler is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production master is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the disk storage production slave disk is respectively connected with the first physical host, the second physical host, the third physical host and the fourth physical host;
the master clock server and the standby clock server are subjected to data interconnection, and if the master clock server fails, the arbitration clock server sends a failure reminding instruction to the standby clock server so that the standby clock server takes over the service operation work of the master clock server;
and the disk storage production master disk is connected with the disk storage production slave disk through a disk mirror image, and if the disk storage production master disk fails, the disk storage production slave disk takes over the service operation work of the disk storage production master disk.
5. The system of claim 4, wherein the type structures of the first coupler and the second coupler are LOCK type structures, wherein a database global LOCK DB2 LOCK1 and a database component DB2 SCA are provided in the first coupler, and wherein a global LOCK GRS is provided in the second coupler.
6. The system of claim 4, wherein the type structures of the first coupler and the second coupler are cache type structures, and the first coupler and the second coupler set up a plurality of global buffer pools of the database in an asynchronous duplex mode.
7. The system of claim 4, wherein the type structures of the first coupler and the second coupler are list type structures, wherein the first coupler is configured to set list type structure IXCSTR1, list type structure IXCSTR3, and list type structure IXCSTR5 in a symmetrical deployment manner, and wherein the second coupler is configured to set list type structure IXCSTR2, list type structure IXCSTR4, and list type structure IXCSTR6 in a symmetrical deployment manner.
8. The system of claim 4, wherein a disk mirror hot-swap control system GDPS K2 is provided in the first physical host.
9. The system of claim 4, wherein a disk mirror hot-swap control system GDPS K1 is provided in the second physical host.
CN202110591338.9A 2021-05-28 2021-05-28 High-availability host system Active CN113220519B (en)


Publications (2)

Publication Number Publication Date
CN113220519A CN113220519A (en) 2021-08-06
CN113220519B true CN113220519B (en) 2024-04-16



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant