CN112084135A - High-reliability computer based on domestic processor - Google Patents
High-reliability computer based on domestic processor
- Publication number
- CN112084135A
- Authority
- CN
- China
- Prior art keywords
- switching
- computing unit
- computer
- module
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention provides a high-reliability computer based on a domestic processor and belongs to the technical field of domestic high-reliability ruggedized computers. The computer contains at least two computing units and one switching unit; each computing unit is configured with high-availability cluster management software, and the switching unit switches the external interfaces. This arrangement can greatly improve the stability of the computer.
Description
Technical Field
The invention relates to the field of domestic high-reliability ruggedized computers, and in particular to a high-reliability computer based on a domestic processor.
Background
In certain specialized industries, the stability of a computer is of vital importance. In practice, faults in computer hardware and software are inevitable. Investigation shows that the faults frequently encountered in actual work include hardware faults (CPU, memory, disk, and so on), software faults (the operating system, client applications, and so on), and human operating errors. If a fault occurs in a critical situation, the consequences can be immeasurable.
At present, most production service systems are deployed on a single machine, and each system supports important day-to-day services for its users, so there is a risk of a single point of failure.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a high-reliability computer based on a domestic processor.
The technical scheme of the invention is as follows:
A high-reliability computer based on a domestic processor, in which
at least two computing units and one switching unit are arranged in the computer, and each computing unit is configured with high-availability cluster management software; the switching unit switches the external interfaces.
Further,
the computing units are deployed with high-availability cluster management software. Over two gigabit network heartbeat links between the computing units, application data are synchronized to the backup computing unit in real time using data-based real-time replication, providing users with a complete application protection solution.
When a fault is detected, the service system is automatically switched to the backup computing unit. At the same time, a driver running under the operating system controls the CPU GPIO to send a switching command to the switching unit, and the computing unit cooperates with the switching unit to complete the switching control of the interfaces.
The switching unit contains two circuits: a switching control circuit and an interface switching circuit.
Further,
the switching control circuit receives the heartbeat signal of each computing unit through GPIO to determine the state of that computing unit, and when a computing unit fails it controls the interface switching circuit to switch the external interfaces.
Further,
the interface switching circuit switches the external interfaces according to the instructions of the switching control circuit; when a switching request signal is received, it quickly completes the switchover between computing units, ensuring that the external interfaces always remain connected to the main computing unit.
Further,
the computing unit adopts an LS3A3000+7A1000 architecture and outputs network, USB, serial port, and video signals externally.
Further,
the high-availability software consists of four software parts: a driver module, an agent module, a service module, and a management module;
wherein:
the driver module is embedded in the system kernel, monitors the system in real time, filters out the useful data and passes it to the service module for backup, and notifies the switching unit to switch the external interfaces by controlling a CPU GPIO signal, so that the user side is always working on the main computing unit;
the agent module monitors the client application and passes information to the service module;
the service module is responsible for executing commands from users or software, transferring application and user data, keeping the source data consistent with the backup data, handling faults, and so on, and must run in cooperation with the driver module;
the management module provides a graphical-interface management platform on which the working state of the computing units, hardware information, and so on can be viewed in real time, and it records the backup log.
The beneficial effects of the invention are as follows:
Each computing unit is configured with high-availability cluster management software and works with the switching unit to provide redundant hot standby of two computing units, which can greatly improve the stability of the computer.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
On the basis of an existing computer, a corresponding physical computer is added and a dual-machine high-availability cluster solution is deployed, so as to ensure real-time backup of the core database data and continuous operation of the service.
High-reliability ruggedized computers employ redundant computing units so that the computer information system provides uninterrupted service and the system remains available.
The invention provides a high-reliability computer based on a domestic processor. The computer contains at least two computing units and one switching unit; each computing unit is configured with high-availability cluster management software and works with the switching unit to provide redundant hot standby of two computing units, which can greatly improve the stability of the computer.
The computing units are deployed with high-availability cluster management software. Over two gigabit network heartbeat links between the computing units, application data are synchronized to the backup computing unit in real time using data-based real-time replication, providing the user with a complete application protection solution.
When a fault in the hardware, software, OS, network, or the like is detected, the service system is automatically switched to the backup computing unit. At the same time, a driver running under the operating system controls the CPU GPIO to send a switching command to the switching unit, and the computing unit cooperates with the switching unit to complete the switching control of the interfaces.
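The failover decision described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the link names, the 3-second timeout, and the action names `promote_backup` and `send_gpio_switch_command` are assumptions made for illustration, since the source gives no timing values or command formats. The sketch treats the peer as failed only when both heartbeat links have gone silent.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat time seen on each of two redundant links."""

    timeout_s: float = 3.0  # assumed value; the source states no timeout
    last_seen: dict = field(
        default_factory=lambda: {"link_a": None, "link_b": None}
    )

    def record(self, link: str, now: float) -> None:
        """Store the arrival time of a heartbeat on the named link."""
        if link not in self.last_seen:
            raise KeyError(f"unknown heartbeat link: {link}")
        self.last_seen[link] = now

    def peer_alive(self, now: float) -> bool:
        """The peer counts as alive if either link was heard within the timeout."""
        return any(
            t is not None and now - t <= self.timeout_s
            for t in self.last_seen.values()
        )


def failover_actions(monitor: HeartbeatMonitor, now: float) -> list:
    """Return the (hypothetical) actions the backup unit would take at time `now`."""
    if monitor.peer_alive(now):
        return []
    # Both links are silent: take over the service and ask the switching
    # unit, via the CPU GPIO, to move the external interfaces.
    return ["promote_backup", "send_gpio_switch_command"]
```

Requiring silence on both links before acting is one way to keep a single cable or port failure from triggering an unnecessary switchover; a real deployment would also need to guard against both units believing they are the main unit.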
The switching unit switches the video interface, the USB interface, the serial port, and other external interfaces. The switching function board carries two circuits: a switching control circuit and an interface switching circuit;
wherein:
the main function of the switching control circuit is to receive the heartbeat signal of each computing unit through GPIO in order to determine the state of that computing unit, and to control the interface switching circuit to switch the external interfaces when a computing unit fails;
the main function of the interface switching circuit is to switch the external interfaces according to the instructions of the switching control circuit. When a switching request signal is received, it quickly completes the switchover between computing units, ensuring that the external interfaces always remain connected to the main computing unit.
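The control logic of the switching unit can be sketched as a small state machine. The sketch below is in Python for readability, although the source says the switching unit uses a single-chip microcomputer. It assumes the GPIO heartbeat is a level that toggles on every sampling tick while the computing unit is healthy, and that three unchanged samples mean failure; neither detail is given in the source, and the class and method names are hypothetical.

```python
class SwitchController:
    """Selects which computing unit the external interfaces are routed to."""

    def __init__(self, units=("A", "B"), max_missed=3):
        self.units = list(units)
        self.active = self.units[0]          # unit currently wired to the interfaces
        self.max_missed = max_missed         # assumed failure threshold
        self.missed = {u: 0 for u in self.units}
        self.last_level = {u: None for u in self.units}

    def healthy(self, unit):
        """A unit is healthy while its heartbeat level keeps changing."""
        return self.missed[unit] < self.max_missed

    def sample(self, levels):
        """Process one tick of GPIO readings, e.g. {"A": 1, "B": 0}."""
        for u in self.units:
            if levels[u] != self.last_level[u]:
                self.missed[u] = 0           # level toggled: heartbeat seen
            else:
                self.missed[u] += 1          # level stuck: one missed beat
            self.last_level[u] = levels[u]
        if not self.healthy(self.active):
            self._switch_to_healthy()
        return self.active

    def request_switch(self, target):
        """Handle an explicit switching command from a computing unit."""
        if target in self.units and self.healthy(target):
            self.active = target
        return self.active

    def _switch_to_healthy(self):
        """Route the interfaces to the first other unit that is still healthy."""
        for u in self.units:
            if u != self.active and self.healthy(u):
                self.active = u
                return
```

In this sketch an explicit switching command is honored only when the target unit is itself healthy, so a stale or erroneous command cannot route the external interfaces to a failed unit.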
The computing unit adopts an LS3A3000+7A1000 architecture and outputs network, USB, serial port, and video signals externally; the switching unit uses a single-chip microcomputer and a switching chip to perform the interface switching.
The high-availability software consists of four software parts: a driver module, an agent module, a service module, and a management module.
Wherein:
the driver module is embedded in the system kernel, monitors the system in real time, filters out the useful data and passes it to the service module for backup, and notifies the switching unit to switch the external interfaces by controlling a CPU GPIO signal, so that the user side is always working on the main computing unit;
the agent module monitors the client application and passes information to the service module;
the service module is responsible for executing commands from users or software, transferring application and user data, keeping the source data consistent with the backup data, handling faults, and so on, and must run in cooperation with the driver module;
the management module provides a graphical-interface management platform on which the working state of the computing units, hardware information, and so on can be viewed in real time, and it records the backup log.
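The data path between the driver module and the service module can be sketched as follows. This is only an illustration in user-space Python, whereas the source places the driver module in the kernel. The watched path prefixes, the in-memory dictionaries standing in for the source and backup stores, and the SHA-256 consistency check are all assumptions; the source does not say how useful data is selected or how consistency is verified.

```python
import hashlib


def digest(store):
    """Order-independent fingerprint of a {path: bytes} store."""
    h = hashlib.sha256()
    for path in sorted(store):
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(store[path])
        h.update(b"\0")
    return h.hexdigest()


class ServiceModule:
    """Copies protected data to the backup and records a backup log."""

    def __init__(self):
        self.source = {}   # protected data on the main computing unit
        self.backup = {}   # replica on the backup computing unit
        self.log = []      # entries a management module could display

    def replicate(self, path, data):
        self.source[path] = data
        self.backup[path] = data
        self.log.append(("replicated", path, len(data)))

    def consistent(self):
        """True when source and backup hold identical content."""
        return digest(self.source) == digest(self.backup)


class DriverModule:
    """Filters write events and forwards only the protected ones."""

    def __init__(self, service, watched_prefixes):
        self.service = service
        self.watched = tuple(watched_prefixes)  # hypothetical filter rule

    def on_write(self, path, data):
        """Forward the write if it falls under a watched prefix."""
        if path.startswith(self.watched):
            self.service.replicate(path, data)
            return True
        return False
```

Filtering in the driver keeps unprotected writes from ever reaching the replication path, which matches the stated role of the driver module as the component that selects the useful data.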
The above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (8)
1. A high-reliability computer based on a domestic processor, characterized in that
at least two computing units and one switching unit are arranged in the computer, and each computing unit is configured with high-availability cluster management software; the switching unit switches the external interfaces.
2. The computer of claim 1, wherein
the computing units are deployed with high-availability cluster management software, and over two gigabit network heartbeat links between the computing units, application data are synchronized to the backup computing unit in real time using data-based real-time replication, providing users with a complete application protection solution.
3. The computer of claim 2, wherein
when a fault is detected, the service system is automatically switched to the backup computing unit, and at the same time a driver running under the operating system controls the CPU GPIO to send a switching command to the switching unit, and the computing unit cooperates with the switching unit to complete the switching control of the interfaces.
4. The computer of claim 1, wherein
the switching unit contains two circuits: a switching control circuit and an interface switching circuit.
5. The computer of claim 4, wherein
the switching control circuit receives the heartbeat signal of each computing unit through GPIO to determine the state of that computing unit, and when a computing unit fails it controls the interface switching circuit to switch the external interfaces.
6. The computer of claim 4, wherein
the interface switching circuit switches the external interfaces according to the instructions of the switching control circuit; when a switching request signal is received, it quickly completes the switchover between computing units, ensuring that the external interfaces always remain connected to the main computing unit.
7. The computer of claim 2 or 3, wherein
the computing unit adopts an LS3A3000+7A1000 architecture and outputs network, USB, serial port, and video signals externally.
8. The computer of claim 1, wherein
the high-availability software consists of four software parts: a driver module, an agent module, a service module, and a management module;
wherein:
the driver module is embedded in the system kernel, monitors the system in real time, filters out the useful data and passes it to the service module for backup, and notifies the switching unit to switch the external interfaces by controlling a CPU GPIO signal, so that the user side is always working on the main computing unit;
the agent module monitors the client application and passes information to the service module;
the service module is responsible for executing commands from users or software, transferring application and user data, keeping the source data consistent with the backup data, handling faults, and so on, and must run in cooperation with the driver module;
the management module provides a graphical-interface management platform on which the working state of the computing units, hardware information, and so on can be viewed in real time, and it records the backup log.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010984124.3A CN112084135A (en) | 2020-09-18 | 2020-09-18 | High-reliability computer based on domestic processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010984124.3A CN112084135A (en) | 2020-09-18 | 2020-09-18 | High-reliability computer based on domestic processor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112084135A (en) | 2020-12-15 |
Family
ID=73736579
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202010984124.3A | Pending | CN112084135A (en) | 2020-09-18 | 2020-09-18 | High-reliability computer based on domestic processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112084135A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708028A (en) * | 2012-05-18 | 2012-10-03 | 中国人民解放军第二炮兵装备研究院第四研究所 | Trusted redundant fault-tolerant computer system |
CN103106126A (en) * | 2013-01-16 | 2013-05-15 | 浪潮电子信息产业股份有限公司 | High-availability computer system based on virtualization |
US20160283335A1 (en) * | 2015-03-24 | 2016-09-29 | Xinyu Xingbang Information Industry Co., Ltd. | Method and system for achieving a high availability and high performance database cluster |
CN106815093A (en) * | 2015-11-30 | 2017-06-09 | 北京宇航系统工程研究所 | A kind of computer glitch fault tolerance facility based on interconnection between domestic Loongson processor |
CN110784350A (en) * | 2019-10-25 | 2020-02-11 | 北京计算机技术及应用研究所 | Design method of real-time available cluster management system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||