Summary of the invention
The object of the present invention is to provide a trusted, redundant, fault-tolerant computer system. A TCM root-of-trust module is added to each subsystem to provide hardware and software protection for the computer and to ensure that it is secure and trusted; redundant backup provides system-level fault tolerance and improves the mission reliability of the whole machine.
To achieve this object, the present invention implants a TCM module in the hardware platform of each subsystem to build a root of trust, and uses a chain-of-trust mechanism to extend trust from the physical hardware layer up to the user application layer, giving the user a trusted execution environment. Before the system boots at power-on, a card-insertion startup mode enforces mandatory identity authentication, preventing unauthorized persons from using the system. The cryptographic services provided by the TCM add hardware-level protection to sensitive data that is processed and stored, preventing malicious users from destroying or stealing confidential data.
For reliability, two trusted computer subsystems are installed in one cabinet to form a hot-backup system that provides redundancy and fault tolerance. The two subsystems form machines A and B, with heartbeat detection and a data synchronization mechanism between them. Normally only one machine is online and runs the business workload; when that machine fails, the other machine takes over. The heartbeat server is responsible for detecting the fault and performing the failover. After failover, services and applications continue to run on the other machine, and applications can recover to their most recent running state from the checkpoint information saved in the database. The failover module switches the interface circuits.
Embodiment
Referring to Fig. 1, the system uses a dual-redundant, dual-Active high-availability cluster mode, is built on the CPCI architecture, and uses a 1+1 redundant power supply.
Referring to Fig. 2, in each trusted computer subsystem a medium-scale FPGA is added between the TCM and the mainboard BIOS and CPU. The FPGA performs bus interface conversion and bus switch control between the TCM, the BIOS and the processor system, as follows:
1) The TCM is connected to the FPGA over SPI. The FPGA converts SPI to LPC and connects to the BIOS and the CPU, which enables active measurement.
2) The TCM's custom PSRAM bus interface is converted to PCI by the FPGA. The CPU can access and invoke the trusted services of the TCM over the PCI bus.
3) The CPU reset signal is routed into the FPGA and is controlled by the TCM. At power-on the TCM starts first and holds the CPU in the reset state.
4) The BIOS is attached to the FPGA over LPC, and the FPGA has a bidirectional LPC connection to the CPU. An LPC bus switch inside the FPGA isolates the BIOS from the CPU.
5) After the TCM starts, it first measures the BIOS. Once the measurement passes, the LPC bus switch inside the FPGA closes, the CPU reads the startup configuration information from the BIOS, and the system starts normally.
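The power-on sequence in steps 3) to 5) can be illustrated with a small software model. This is only a sketch under stated assumptions: `FpgaBridge`, `measure` and `tcm_power_on` are hypothetical names, SHA-256 stands in for whatever hash the TCM actually uses (the text does not name one), and the behaviour on a failed measurement (CPU stays in reset, BIOS stays isolated) is an assumption, since the text only describes the passing case.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FpgaBridge:
    """Hypothetical model of the FPGA signals described in steps 3) to 5)."""
    cpu_in_reset: bool = True         # the TCM holds the CPU in reset at power-on
    lpc_switch_closed: bool = False   # the BIOS is isolated from the CPU by default


def measure(image: bytes) -> str:
    # SHA-256 is a stand-in; the source does not name the hash algorithm.
    return hashlib.sha256(image).hexdigest()


def tcm_power_on(bridge: FpgaBridge, bios_image: bytes, reference_digest: str) -> bool:
    """Measure the BIOS first; connect it to the CPU only if the measurement passes."""
    bridge.cpu_in_reset = True
    bridge.lpc_switch_closed = False
    if measure(bios_image) != reference_digest:
        # Assumed failure behaviour: keep the CPU in reset and the BIOS isolated.
        return False
    bridge.lpc_switch_closed = True   # close the LPC bus switch inside the FPGA
    bridge.cpu_in_reset = False       # assumed: release reset so the CPU can read the BIOS
    return True
```

With a reference digest taken from a known-good image, `tcm_power_on` returns `True` and closes the switch; any modified image leaves the bridge in its power-on state.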
Trusted computer subsystem A and trusted computer subsystem B are both powered on and working at the same time, but only one of them runs the business workload at any given moment. The subsystem that runs the business workload is called the primary subsystem; the other, which is powered on but does not run the business workload, is called the backup subsystem. Switching between the primary and backup subsystems can be triggered manually with the external changeover switch, or triggered automatically by the heartbeat mechanism once it determines that the primary subsystem is running abnormally.
Referring to Fig. 3, in manual triggering the operator presses the external changeover switch, as the actual situation requires, to perform a failover. When the external changeover switch is open (the default state), the Control terminal is at low level, the relay connects the internal control signal to GND, and the switching control signal output is low. When the external changeover switch is closed, the Control terminal is at high level, the relay connects the internal control signal to +5V, and the switching control signal output is high. The failover module switches the interface circuits according to the external switching control signal it receives.
Referring to Fig. 4, a heartbeat server is configured in each trusted computer subsystem. The heartbeat server runs in kernel space as a system-level process. It monitors in real time the running state (running or hung) of local applications and services, listens for the heartbeats sent by the peer system's heartbeat server, and at the same time sends its own heartbeat to the peer. Based on the results of local and peer heartbeat detection, the heartbeat server can wake up or suspend the relevant applications or services. To improve the reliability of the heartbeat path, the system uses two heartbeat detection paths: UDP over the network interface, plus a COM port.
Further, to increase the accuracy of heartbeat detection and to avoid heartbeat timeouts caused by network congestion or an overloaded system triggering a switchover when the system is in fact healthy, the heartbeat server combines PUSH and PULL detection. Under normal conditions the two sides detect each other's heartbeat in PUSH mode. When the heartbeat from the peer can no longer be heard, the server automatically changes to PULL mode and checks the peer further by actively querying it.
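The combined PUSH and PULL detection can be sketched as follows. This is a minimal illustration, not the implementation of the heartbeat server: the class `PeerMonitor`, the `timeout` parameter and the `probe` callback are hypothetical, and the transport (UDP or COM port) is abstracted away behind `on_heartbeat` and `probe`.

```python
from typing import Callable


class PeerMonitor:
    """Hypothetical PUSH-then-PULL liveness check for the peer subsystem."""

    def __init__(self, timeout: float, probe: Callable[[], bool]):
        self.timeout = timeout        # seconds of silence tolerated in PUSH mode
        self.probe = probe            # PULL: actively query the peer, True if it answers
        self.last_heartbeat = 0.0     # arrival time of the last pushed heartbeat

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat pushed by the peer over either path."""
        self.last_heartbeat = now

    def peer_alive(self, now: float) -> bool:
        if now - self.last_heartbeat <= self.timeout:
            return True               # PUSH mode: the heartbeat arrived in time
        # The PUSH heartbeat timed out: change to PULL and ask the peer directly
        # before declaring a fault, so that congestion alone does not cause a switch.
        if self.probe():
            self.last_heartbeat = now
            return True
        return False
```

Only when both the pushed heartbeat and the active query fail does `peer_alive` report a fault, which is the condition under which a switchover would be started.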
To ensure that the backup subsystem can take over control smoothly when the primary subsystem fails, a database mirroring engine lets the two subsystems know each other's working state. The primary subsystem periodically writes important hardware status data and software running data to its local database, and the database mirroring engine periodically synchronizes this data to the backup subsystem. The backup subsystem can therefore obtain the data from the primary subsystem at any time by reading its own local database, which allows a smooth failover when the primary subsystem fails.
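One mirroring cycle can be sketched with two SQLite databases standing in for the local databases of the primary and backup subsystems. The text does not name the database product or its schema, so the `state` table, the helper functions and the full-copy strategy below are assumptions made only for illustration.

```python
import sqlite3


def open_state_db() -> sqlite3.Connection:
    """Create an in-memory local database holding the latest status per key."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE state (key TEXT PRIMARY KEY, value TEXT, updated REAL)")
    return db


def write_state(db: sqlite3.Connection, key: str, value: str, now: float) -> None:
    """Primary side: record one piece of hardware or software status."""
    db.execute("INSERT OR REPLACE INTO state VALUES (?, ?, ?)", (key, value, now))
    db.commit()


def mirror(src: sqlite3.Connection, dst: sqlite3.Connection) -> int:
    """One mirroring cycle: copy every row from the primary's database to the backup's."""
    rows = src.execute("SELECT key, value, updated FROM state").fetchall()
    dst.executemany("INSERT OR REPLACE INTO state VALUES (?, ?, ?)", rows)
    dst.commit()
    return len(rows)


def read_state(db: sqlite3.Connection, key: str):
    """Backup side: read the mirrored status from its own local database."""
    row = db.execute("SELECT value FROM state WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

After `mirror(primary_db, backup_db)` has run, the backup can answer `read_state(backup_db, ...)` without contacting the primary, which is the property the takeover relies on.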
The primary and backup subsystems do not have equal control authority in the system. While the primary subsystem is running the business workload, the backup subsystem produces no control output, and the primary subsystem has full control of system operation. Only when a failover occurs does the backup subsystem quickly use the stored system state data to restore the running context and take over control of the system, so that the business workload continues to run stably.
Referring to Fig. 5, when the primary subsystem fails, the heartbeat server is responsible for recovering the services and applications that were running on the primary subsystem onto the backup subsystem. The recovery procedure is as follows:
1) Using the list of services that were running on the primary subsystem, as recorded before the fault, the backup subsystem starts or wakes up the corresponding services, so that the services recover smoothly.
2) Using the list of applications that were running on the primary subsystem, as recorded before the fault, the backup subsystem starts or wakes up the corresponding applications. Each application can recover to its most recent running state from the checkpoint information saved in the database before the fault.
3) The failover module connects the backup subsystem to the external interfaces.
4) The backup subsystem becomes the primary subsystem.
5) Once the fault has been repaired, the failed subsystem rejoins the system as the backup subsystem and performs the backup function.
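The five recovery steps can be modelled in a few lines. The sketch below is illustrative only: `Subsystem`, `take_over` and `rejoin` are hypothetical names, the role string "failed" is introduced for the example, and starting a service or application is reduced to appending its name to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Subsystem:
    name: str
    role: str                                   # "primary", "backup" or "failed"
    running: list = field(default_factory=list)
    owns_external_interface: bool = False


def take_over(backup: Subsystem, failed: Subsystem,
              services: list, apps: list, checkpoints: dict) -> dict:
    """Apply recovery steps 1) to 4); return the checkpoint each application resumes from."""
    for svc in services:                        # step 1: start or wake recorded services
        backup.running.append(svc)
    restored = {}
    for app in apps:                            # step 2: start applications from checkpoints
        backup.running.append(app)
        restored[app] = checkpoints.get(app)    # None means no checkpoint was saved
    failed.owns_external_interface = False      # step 3: external interfaces are switched
    backup.owns_external_interface = True
    backup.role = "primary"                     # step 4: the backup becomes the primary
    failed.role = "failed"
    return restored


def rejoin(repaired: Subsystem) -> None:
    """Step 5: a repaired subsystem comes back as the backup."""
    repaired.running.clear()
    repaired.role = "backup"
```

The `services`, `apps` and `checkpoints` arguments correspond to the lists and checkpoint information that the backup reads from its mirrored local database.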
Referring to Fig. 6, the failover module switches the external interfaces of trusted computer subsystem A and subsystem B. It consists mainly of a control module, interface signal switching circuit A and interface signal switching circuit B. The switching circuits are designed redundantly as two identical sets: the external interface of subsystem A is connected to switching circuit A, and the external interface of subsystem B is connected to switching circuit B. Switching circuit A is powered by the power supply of subsystem A, and switching circuit B is powered by the power supply of subsystem B. The control module receives switching commands from the heartbeat server or signals from the external changeover switch and performs the interface switch, as follows:
1) On receiving a switching command from the heartbeat server, the control module issues the enable control signals so that the switching circuit currently in the connected state goes to the floating (disconnected) state and the switching circuit currently in the floating state goes to the connected state.
2) On receiving a signal from the changeover switch, the control module issues the enable control signals so that the switching circuit currently in the connected state goes to the floating (disconnected) state and the switching circuit currently in the floating state goes to the connected state. At the same time, the control module sends a switch notification message to the heartbeat server, and the heartbeat server completes the switch of the upper-layer applications and services.
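The behaviour of the control module in the two cases above can be summarized in a short sketch. The class `FailoverControl` and its method names are hypothetical, the enable signals are reduced to a single `connected` attribute, and the assumption that circuit A is connected initially is made only for the example.

```python
from typing import Callable


class FailoverControl:
    """Hypothetical model of the control module and the two switching circuits."""

    def __init__(self, notify_heartbeat_server: Callable[[str], None]):
        self.connected = "A"                  # assumed initial state: circuit A connected
        self.notify = notify_heartbeat_server

    def _swap(self) -> str:
        # The connected circuit goes floating and the floating circuit goes connected.
        self.connected = "B" if self.connected == "A" else "A"
        return self.connected

    def on_heartbeat_command(self) -> str:
        """Case 1: the heartbeat server requested the switch, so no notification is sent."""
        return self._swap()

    def on_manual_switch(self) -> str:
        """Case 2: the external switch triggered it; tell the heartbeat server as well."""
        active = self._swap()
        self.notify(active)                   # upper-layer services follow the interfaces
        return active
```

The only difference between the two paths is the notification: a manual switch starts below the heartbeat server, so the server has to be told before it can move the applications and services.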