CN101542432A - Replacing system hardware - Google Patents

Info

Publication number
CN101542432A
Authority
CN
China
Prior art keywords
unit
failing
operating system
partition
local operating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2007800429496A
Other languages
Chinese (zh)
Inventor
A·J·瑞茨
S·S·约德
E·D·沃克尔
S·A·韦斯特
M·G·特里克尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN101542432A

Abstract

A method and apparatus for managing spare partition units in a partitionable computing device is disclosed. The method comprises detecting if a spare partition unit is required for addition or replacement in a local operating system and if a spare partition unit is required for addition, initiating an addition of a spare partition unit. If a spare partition unit is required for replacement, a replacement of a failing partition unit with a spare partition unit is initiated; part of the memory of the failing partition unit is passively migrated into the memory of the spare partition unit's partition; part of the memory of the failing partition unit is also actively migrated into the memory of the spare partition unit's partition; and the partitionable computing device is cleaned up. Partition units are replaced without requiring that computer-executable instructions be recompiled.

Description

Replacing system hardware
Background
A microprocessor is an electronic device capable of performing the processing and control functions of computing devices such as desktop computers, laptop computers, server computers, cell phones, and laser printers. Typically, a microprocessor comprises a small plastic or ceramic package that contains and protects a small piece of semiconductor material carrying a complex integrated circuit. Leads connected to the integrated circuit are attached to pins that protrude from the package, allowing the integrated circuit to be connected to other electronic devices and circuits. Microprocessors are usually plugged into or otherwise attached to a circuit board containing other electronic devices.
Although a microprocessor integrated circuit typically includes only one computing unit, i.e., one processor, it is possible to include multiple processors in a single microprocessor integrated circuit. These multiple processors, commonly called "cores", reside on the same piece of semiconductor material and are connected to the microprocessor package pins. Having multiple cores increases the computing power of the microprocessor. For example, a microprocessor with four cores can provide almost as much computing power as four single-core microprocessors.
The use of multiple microprocessors and multi-core microprocessors in traditional computing devices is increasing. A traditional computing device can run only one instance of an operating system. Even a traditional computing device that contains a multi-core microprocessor, multiple microprocessors, or multiple multi-core microprocessors can run only one instance of an operating system. Nevertheless, the increased computing power that multi-core microprocessors provide allows computing functions previously performed by multiple computing devices to be performed with fewer computing devices.
For example, a server is a computing device connected to a network that provides a service or set of services to other entities connected to the network. A server comprising 32 traditional computing devices, i.e., a 32-way server, may be built from eight microprocessors, each having four cores. Taking the concept one step further, if each individual core is eight times as capable as one of the 32 computing devices, the capability of the 32-way server can be provided by a single four-core microprocessor. A clear advantage of such a four-core server is that computing resource redundancy is more affordable than it is with traditional servers. In addition, reducing the number of microprocessors reduces the cost of the server, the amount of energy required to power the server, and the amount of maintenance the server requires.
"Partitions" make it possible to take better advantage of the computing power of multi-core microprocessors. A partition is an electrically isolatable set of electronic devices, e.g., processors, memory, etc., within a computing device that can run an independent instance of an operating system, i.e., a local operating system. A partitionable computing device is a computing device that can be divided into partitions and thus can run multiple local operating systems. A partitionable server is a server that is a partitionable computing device and thus can run multiple local operating systems. A partition of a partitionable server may also be called a "logical server". That is, to other entities on a network, a logical server appears to be a stand-alone server even though it is not. Multiple servers, logical or otherwise, may also be assembled into a "server cluster". A server cluster is a group of servers that behaves as a single unit providing a service or set of services.
The advantages of multi-core microprocessors are driving a trend toward "server consolidation". Server consolidation is the process of replacing multiple servers, e.g., the servers in a server cluster, with fewer servers, e.g., one server. A server that replaces multiple servers typically has computing power that equals or exceeds the combined capability of the servers it replaces. While it reduces cost, energy, and maintenance, server consolidation has the effect of putting all the eggs in one basket: it can increase the impact of a server failure. For example, if applications that used to run on multiple servers all run on the same server and that server fails, the failure is likely to affect all of the applications. In the worst case, this means application downtime. To guard against this impact, many high-end servers, i.e., servers with a large amount of computing power, devote a portion of their capability to reliability features.
One such reliability feature is "failover" capability. Failover is the ability of a first entity to pass the information it contains to a second, similar entity, preferably before the first entity completely fails. Techniques have been developed for traditional servers, i.e., servers based on traditional computing devices, to perform failover in a controlled and orderly fashion, ensuring that no data is lost and no ongoing processes are interrupted during the transition from the failing server to the replacement server.
To create multi-core microprocessor servers that are as robust and reliable as traditional servers, similar techniques that operate at the processor level are useful.
Summary
This summary is provided to introduce, in simplified form, a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
A method and apparatus are disclosed for managing spare partition units in a partitionable computing device, such as a server, that includes a global management entity and a plurality of local operating systems. The method comprises determining whether a spare partition unit is required for addition or for replacement in a local operating system. If an additional spare partition unit is required, a spare partition unit addition process is initiated. If a replacement spare partition unit is required because, for example, a partition unit is failing, a spare partition unit replacement process is initiated. The replacement process migrates the spare partition unit into the partition of the failing partition unit, first passively and then actively, and cleans up after the migration.
In one illustrative implementation, during the spare partition unit addition process: the global management entity selects the spare partition unit to be added from a global device pool; the global management entity initiates the addition of the selected spare partition unit; the local operating system initiates the addition of the selected spare partition unit into the partition of the local operating system, i.e., the local operating system partition; the global management entity brings the selected spare partition unit into the local operating system partition; and, when the local operating system discovers the selected spare partition unit in the local operating system partition, the local operating system adds the selected spare partition unit to that partition.
In one illustrative implementation, during the spare partition unit replacement process: when the local operating system detects a failing device, the global management entity maps the failing device to a physical hardware device; the global management entity selects a replacement device from the global device pool; the global management entity initiates the replacement of the failing device; the local operating system initiates the replacement of the failing device in the local operating system; the global management entity brings a spare partition unit into the partition of the local operating system, i.e., the local operating system partition; and, when the local operating system discovers the spare partition unit in the local operating system partition, the local operating system prepares to add the spare partition unit to that partition.
In one illustrative implementation, during the passive migration of the spare partition unit into the partition of the failing partition unit: the local operating system transfers the memory of the failing partition unit while using modified flags to track the portions of that memory that change; and the global management entity performs an atomic update of the memory controller routing table.
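The modified-flag tracking described above can be sketched as follows. This is a minimal illustration, not the patented implementation: memory is modeled as a Python list of pages, and the names passive_migrate, copy_modified, and the writes argument are hypothetical.

```python
def passive_migrate(source, target, writes):
    """Copy every page once while the system keeps running.

    `writes` (hypothetical) maps a copy step to a (page, value) write that
    lands on the source right after that step. Every written page gets its
    modified flag set so that it can be copied again later.
    """
    modified = set()
    for step in range(len(source)):
        target[step] = source[step]
        if step in writes:
            page, value = writes[step]
            source[page] = value
            modified.add(page)  # set the modified flag for this page
    return modified


def copy_modified(source, target, modified):
    """Re-copy only the pages whose modified flag is set, then clear the flags."""
    for page in sorted(modified):
        target[page] = source[page]
    modified.clear()
```

Only the flagged pages need a second copy, which is what keeps the later, quiesced phase short.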
In one illustrative implementation, during the active migration of the spare partition unit into the partition of the failing partition unit: the global management entity quiesces the partitionable server; the local operating system transfers the changed portions of the memory of the failing partition unit; the global management entity performs an atomic update of the memory controller routing table; the local operating system transfers the state of the processors of the failing partition unit to the processors of the spare partition unit; the local operating system changes the system interrupt state of the local operating system; and the local operating system stops the processors of the failing partition unit.
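The ordering of the active migration steps can be sketched as below. This is a toy model under stated assumptions: the Unit class, the active_migrate function, and the dictionary standing in for the memory controller routing table are all hypothetical, and "(G)" and "(L)" in the log mark the global management entity and the local operating system.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for a partition unit."""
    memory: list
    cpu_state: dict = field(default_factory=dict)
    running: bool = True


def active_migrate(failing, spare, modified, routing, partition, spare_id):
    """Run the active migration steps in order and return a log of them."""
    log = ["(G) quiesce the partitionable server"]
    for page in sorted(modified):
        spare.memory[page] = failing.memory[page]
    log.append("(L) transfer changed memory")
    routing[partition] = spare_id  # one assignment stands in for the atomic update
    log.append("(G) update memory controller routing table")
    spare.cpu_state = dict(failing.cpu_state)
    log.append("(L) transfer processor state")
    log.append("(L) change system interrupt state")  # not modeled in this sketch
    failing.running = False
    log.append("(L) stop failing processors")
    return log
```

Because the server is quiesced for the whole sequence, no new writes can arrive between the final memory copy and the routing table update.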
In one illustrative implementation, during the cleanup of the partitionable server: the local operating system unquiesces the partitionable server; the local operating system notifies the global management entity of the replacement; and the global management entity removes the processors of the failing partition unit both logically and physically.
The global management entity and the local operating systems allow spare partition units to be added or replaced without requiring that computer-executable instructions be recompiled for a particular platform. That is, implementations of the method and apparatus work with the executable code of an operating system on a plurality of hardware platforms, without the code of the implementation or the code of the operating system having to be modified and recompiled.
Description of the drawings
The foregoing aspects and many of the attendant advantages of the invention will be more readily appreciated and better understood by reference to the following detailed description, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of an exemplary computing device capable of supporting partition unit replacement;
Fig. 2 is a block diagram of an exemplary partition containing a plurality of partition units, one of which is unassociated;
Fig. 3 is a block diagram of the exemplary partition shown in Fig. 2, reconfigured to include the previously unassociated partition unit;
Fig. 4 is a block diagram of an exemplary partition unit;
Fig. 5 is a block diagram of an exemplary failing memory block and an exemplary replacement memory block;
Fig. 6 is a functional flow diagram of an exemplary method for dynamically replacing a failing partition unit;
Fig. 7 is a functional flow diagram of an exemplary method for initiating the addition of a partition unit;
Fig. 8 is a functional flow diagram of an exemplary method for passively migrating a replacement partition unit;
Fig. 9 is a functional flow diagram of an exemplary method for actively migrating a replacement partition unit;
Fig. 10 is a functional flow diagram of an exemplary method for cleaning up after replacing a failing partition unit;
Fig. 11 is a functional flow diagram of an exemplary method for quiescing a system; and
Fig. 12 is a functional flow diagram of an exemplary method for unquiescing a system.
Detailed description
A server is a computing device connected to a network that provides a service or set of services to other entities, e.g., computing devices, connected to the network. For example, a web page server provides a service that returns web pages in response to web page requests. Other exemplary servers are an e-mail server that returns e-mail messages for particular users, a video server that returns video clips from a video archive, and so on. An exemplary server contains a microprocessor, a memory controller, and memory blocks controlled by the memory controller. The memory controller and the memory blocks it controls are often referred to as a unit, i.e., a memory unit. A server may also contain additional microprocessors, memory controllers, memory blocks, and other electronic devices such as interrupt processors. Hence, a server containing only a microprocessor and a memory unit should be construed as exemplary and not limiting.
As with many types of computing devices, the operation of a server is controlled by a software program called an operating system. A traditional computing device can run only one instance of an operating system. Hence a traditional server, i.e., a server based on one or more traditional computing devices, executes the instructions contained in one copy of the operating system, i.e., one instance of the operating system. For example, a server comprising 32 traditional computing devices, i.e., a 32-way server, may be built from eight microprocessors, each having four cores, and yet run one operating system. Reducing the number of microprocessors reduces the cost of the server, the amount of energy required to power the server, and the amount of maintenance the server requires.
Partitions make it possible to take better advantage of the computing power of multi-core microprocessors. A partition is an electrically isolatable set of electronic devices, e.g., processors, memory, etc., within a computing device that can run an independent instance of an operating system, i.e., a local operating system. A partitionable computing device is a computing device that can be divided into partitions and thus can run multiple local operating systems. A partitionable server is a server that is a partitionable computing device and thus can run multiple local operating systems. A partition of a partitionable server may also be called a "logical server". Hence, one partitionable server may contain multiple logical servers. Multiple servers, logical or otherwise, may be assembled into a "server cluster" that behaves as a single unit providing a service or set of services.
Preferably, partitioning is dynamic. That is, partition units are assigned to, or removed from, partitions with little or no impact on the services the server provides. A server that can be partitioned is a partitionable server. A server system, i.e., a system, that comprises partitionable servers is a partitionable system. A partitionable system provides flexibility in the number and configuration of partition units and in the electronic devices assigned to partitions, and makes supporting "server consolidation" easier and more cost-effective.
Server consolidation is the process of replacing multiple servers with fewer servers, or perhaps with only one server. An exemplary server that results from server consolidation typically has computing power that equals or exceeds the capability of the servers it replaces. Server consolidation can increase the impact of a server failure. For example, imagine that applications that used to run on multiple servers all run on one server. If that server fails, the failure is likely to affect all of the applications and may even cause application downtime.
Traditional servers guard against this impact by devoting a portion of the server's computing power to reliability features such as "failover" capability. Techniques have been developed for traditional servers to perform failover in a controlled and orderly fashion, ensuring that no data is lost and no ongoing processes are interrupted during the transition from the failing server to the replacement server. Because traditional servers are connected to one another through a network and are therefore not tightly coupled, work is broken into small pieces and shared across the servers, i.e., packetized. This makes it easy to replace a failing server, because the work packets of the failing server can be re-routed during failover. Note that more than one traditional server must be available for failover to be possible. That is, a failing traditional server needs another, similar traditional server that is able to accept its data.
Because a partitionable server may contain multiple logical servers that can communicate more easily than traditional servers tied together by a network, a partitionable server has the potential to provide reliability more easily and more cost-effectively than a group of traditional servers. Processes for controlled and orderly failover that operate using the partitions in a partitionable server help realize the reliability that partitionable servers can provide.
It is impractical to make a partitionable server more reliable by notifying each high-level software application when a failover is required. Enabling high-level software applications to respond to such a notification would require that the computer code of each application be modified to accommodate failover. Even notifying applications would probably not be enough to provide failover without a mechanism for replacing a portion of a running server. Instead, it is more practical and advantageous to involve only the lowest-level software in the failover and allow the upper-level software, e.g., applications, to behave as though no hardware change has happened.
An implementation of orderly, low-level failover in a partitionable server involves a global management entity and one or more local operating systems. Examples of a global management entity are a service processor (SP) and a baseboard management controller (BMC). An SP is a specialized microprocessor or microcontroller that manages electronic devices, such as memory controllers and microprocessors, attached to a circuit board or motherboard. A BMC is also a specialized microcontroller embedded on a motherboard. In addition to managing electronic devices, a BMC monitors the input from sensors built into a computing system in order to report on and respond to parameters such as temperature, cooling fan speed, power mode, and operating system status. Other electronic devices may fulfill the role of a global management entity. Hence, the use of an SP or BMC as the global management entity should be construed as exemplary and not limiting.
A local operating system is an instance of an operating system that runs on one partition. Partition units are assigned to a specific partition to ensure that the devices in a partition unit cannot be shared with devices in other partitions, which in turn ensures that a failure is isolated to a single partition. A partition unit may indicate which physical addresses a given memory controller serves, and thereby map those physical memory addresses to the memory controller and to the physical partition unit containing the memory controller. More than one partition unit may be required to boot and run a partition. Unused or failing partition units may be electrically isolated. Electrically isolating a partition unit is similar to removing a server from a group of traditional servers, with the advantage that the partition unit can be dynamically reassigned to a different partition.
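The mapping from a physical address to the partition unit that contains the serving memory controller can be sketched as a range lookup. This is only an illustration: the AddressDirectory class, its method name, and the address ranges used with it are assumptions and do not come from the patent.

```python
import bisect


class AddressDirectory:
    """Hypothetical lookup from a physical address to its partition unit."""

    def __init__(self, ranges):
        # ranges: iterable of (start, end_exclusive, unit_name), non-overlapping
        self._ranges = sorted(ranges)
        self._starts = [start for start, _, _ in self._ranges]

    def unit_for(self, address):
        """Return the unit whose range contains `address`, or None."""
        index = bisect.bisect_right(self._starts, address) - 1
        if index >= 0:
            start, end, unit = self._ranges[index]
            if address < end:
                return unit
        return None
```

An address that falls outside every range belongs to no partition unit, which models an electrically isolated or unassigned region.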
In the foregoing discussion, unless otherwise noted, a partition unit comprises a single core and a single memory unit. However, a partition unit may comprise more than one core, memory unit, or interrupt processor, and/or other devices that provide computing services and/or support. Hence, the use of partition units comprising a core and a memory controller should be construed as exemplary and not limiting. Managing, e.g., adding or replacing, the partition units in a partitionable server allows failover to be performed in a controlled and orderly fashion, ensuring that the partitionable server is as robust and reliable as traditional servers.
An exemplary computing device 100 for implementing a partitionable server that supports partitions and the addition and/or replacement of partition units is illustrated in block diagram form in Fig. 1. The exemplary computing device 100 shown in Fig. 1 comprises a service processor (SP) 102 connected to a memory that stores SP firmware 104 and a routing table 106. The computing device 100 also comprises processor A 108 connected to memory block A 110, processor B 112 connected to memory block B 114, processor C 116 connected to memory block C 118, and processor D 120 connected to memory block D 122. Each of the processors 108, 112, 116, and 120 contains four cores designated 0, 1, 2, and 3. The SP 102, which is controlled by the SP firmware 104, uses the routing table 106 to manage the isolation boundaries between the processors 108, 112, 116, and 120 and the memory blocks 110, 114, 118, and 122. The computing device 100 also comprises input/output (I/O) circuitry 124, mass storage circuitry 126, communication circuitry 128, environmental circuitry 130, and a power supply 132. The computing device 100 uses the I/O circuitry 124 to communicate with I/O devices. The computing device 100 uses the mass storage circuitry 126 to interact with internally or externally connected mass storage devices. The computing device 100 uses the communication circuitry 128 to communicate with external devices, usually over a network. The computing device 100 uses the environmental circuitry 130 to control environmental devices such as cooling fans, heat sensors, and humidity sensors. The power supply 132 powers the computing device 100. If, for example, the SP 102 is replaced by a BMC, the BMC may communicate with and more precisely control the environmental circuitry 130 and the power supply 132.
A computing device such as the exemplary computing device 100 shown in Fig. 1 and described above supports the replacement of partition units. An exemplary partition unit may be formed from processor A 108 and memory block A 110 shown in Fig. 1. Such a partition unit is similar to the partition unit shown in Figs. 2 and 3 that comprises processor A 202 and the memory block 204 connected to processor A 202. The two block diagrams shown in Figs. 2 and 3 contain the same four partition units. Each partition unit comprises a processor and a memory block: processor A 202, connected to memory block 204; processor B 206, connected to memory block 208; processor C 210, connected to memory block 212; and processor D 214, connected to memory block 216.
The replacement of a partition unit can be understood by comparing the block diagram shown in Fig. 2 with the block diagram shown in Fig. 3. While the block diagrams in Figs. 2 and 3 show the same four partition units, partition 200a shown in Fig. 2 comprises a different set of partition units than partition 200b shown in Fig. 3. Partition 200a shown in Fig. 2 comprises processor A 202 and memory block 204; processor B 206 and memory block 208; and processor C 210 and memory block 212. In Fig. 2, the partition unit comprising processor D 214 and memory block 216 is not included in partition 200a. In contrast, partition 200b shown in Fig. 3 has been changed to comprise a different set of partition units, i.e., a different set of processors and memory blocks. Partition 200b shown in Fig. 3 comprises processor B 206 and memory block 208; processor C 210 and memory block 212; and processor D 214 and memory block 216. In Fig. 3, the partition unit comprising processor A 202 and memory block 204 is not included in partition 200b, whereas the partition unit comprising processor D 214 and memory block 216, which was not included in partition 200a shown in Fig. 2, is included in partition 200b. In effect, the partition unit comprising processor D 214 and memory block 216 replaces the partition unit comprising processor A 202 and memory block 204. The SP 102 changes the routing table 106 to make the replacement. Such a replacement is desirable when, for example, processor A 202 and/or memory block 204 is failing, or when a processor with more memory is needed.
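The change from partition 200a to partition 200b can be sketched as a single routing table update. The dictionary layout and the replace_unit function are hypothetical; the unit names A through D stand for the four processor and memory block pairs of Figs. 2 and 3.

```python
def replace_unit(routing_table, partition, failing, spare):
    """Swap `failing` for `spare` in one partition's membership.

    `routing_table` (hypothetical layout) maps a partition name to a
    frozenset of partition unit names.
    """
    members = routing_table[partition]
    if failing not in members:
        raise ValueError(f"{failing} is not in partition {partition}")
    if any(spare in units for units in routing_table.values()):
        raise ValueError(f"{spare} is already assigned to a partition")
    # Build the new membership first, then publish it with one assignment.
    routing_table[partition] = (members - {failing}) | {spare}
```

Rejecting a spare that already belongs to some partition preserves the isolation rule that a partition unit is never shared between partitions.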
While a single processor and a single memory block, such as processor A 202 and memory block 204, may make up a partition unit, partition units can take other forms. A detailed view of an exemplary partition unit 400 having a different form is shown in Fig. 4. In Fig. 4, as in Fig. 1, the exemplary partition unit 400 comprises a processor 402 containing four cores 0, 1, 2, and 3. Processor 402 is connected to a memory controller 404, which is in turn connected to two memory blocks, memory block A 406 and memory block B 410. Processor 402 communicates with memory controller 404, which controls memory block A 406 and memory block B 410. Other partition units may contain devices in addition to a processor, memory controller, and memory blocks, or may contain only a single processor or a single memory controller. Hence, partition unit 400 should be construed as exemplary and not limiting.
A device in a typical partition unit, e.g., a processor, may be able to report its status to the local operating system. Alternatively, or in addition, the local operating system that controls the partition unit may use predictive analysis to assess the status of the device and determine whether the device may be failing and is therefore a candidate for replacement. While a person such as a system administrator could check device status as part of routine maintenance, it is preferable to have the hardware itself notify the local operating system of an impending failure. In some situations, it is desirable to upgrade a processor from one model to another or to add processors and/or memory to a system. While a system administrator may perform these functions, it is preferable to automate such replacements and additions by using explicitly programmed instructions, or periodically timed instructions that report status, and by exploiting the capabilities of partitions, partition units, and the hardware.
Any device, e.g., a memory block, in a partition unit such as the partition unit 400 shown in Fig. 4 may fail. If a memory block fails, it is preferable to replace it with an equivalent memory block. An exemplary failing memory block and an equivalent exemplary replacement memory block are shown in Fig. 5. In Fig. 5, the failing memory block 500 comprises a set of memory cells 504 containing data, e.g., the numbers 1, 2, 3, and 4. The memory cells in the set of memory cells 504 may be referenced using local physical addresses 502 and may also be referenced using global physical addresses 506. The local physical addresses 502 use addresses 1000 and 1001, and the global physical addresses 506 use addresses 5 and 6. The failing memory block 500 may be replaced by the replacement memory block 510. The replacement memory block 510 comprises a set of memory cells 514 able to hold the data from the failing memory block 500, i.e., the numbers 1, 2, 3, and 4. As with the failing memory block, the memory cells in the set of memory cells 514 of the replacement memory block 510 may be referenced using local physical addresses 512 and may also be referenced using global physical addresses 516. The local physical addresses 512 use addresses 2000 and 2001, and the global physical addresses 516 use addresses 7 and 8. In a typical replacement sequence, the data in the set of memory cells 504 of the failing memory block 500 is copied into the set of memory cells 514 of the replacement memory block 510. The local physical addresses 502 of the failing memory block 500, i.e., 1000 and 1001, are remapped to the local physical addresses 512 of the replacement memory block 510, i.e., 2000 and 2001. This remapping allows the rest of the system to remain unmodified during the replacement operation. When a partition unit is replaced, it is preferable to perform an "atomic" update, i.e., replacement, of the memory blocks. During an atomic update of a memory block, if the memory block being updated is accessed, either the addresses of the failing memory block are used exclusively or the addresses of the replacement memory block are used exclusively. That is, a data access goes exclusively to the memory cells 504 through the local physical addresses 502 or exclusively to the memory cells 514 through the local physical addresses 512.
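The copy-then-remap sequence of Fig. 5 can be sketched as follows, using the local physical addresses 1000 and 1001 for the failing block and 2000 and 2001 for the replacement block. The MemoryRouter class is hypothetical, and "atomic" here only means that the new map is published with a single assignment; it is not a model of real memory controller hardware.

```python
class MemoryRouter:
    """Hypothetical model of address remapping between memory blocks."""

    def __init__(self, blocks, remap):
        self.blocks = blocks  # block name -> {local physical address: value}
        self._remap = remap   # address used by the system -> (block, local address)

    def read(self, address):
        block, local = self._remap[address]  # one lookup per access
        return self.blocks[block][local]

    def replace(self, old, new, address_pairs):
        """Copy the data from `old` to `new`, then switch every route at once."""
        translate = dict(address_pairs)
        for old_address, new_address in address_pairs:
            self.blocks[new][new_address] = self.blocks[old][old_address]
        updated = dict(self._remap)
        for system_address, (block, local) in self._remap.items():
            if block == old:
                updated[system_address] = (new, translate[local])
        self._remap = updated  # single rebinding: readers see the old or the new map
```

Code that reads address 1000 before and after the replacement gets the same value, which is how the rest of the system stays unmodified.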
Replacement with replacing 510 pairs of memory blocks that break down 500 of memory block shown in Figure 5 is one type the replacement that can take place when a zoning unit is replaced by another zoning unit.The replacement of zoning unit or add can also relate to another processor, one group of nuclear or nuclear replaces a processor, one group of nuclear or examines, or adds them.But be used for relating to spare partition unit in the subregion of managing local operating system by global management entity in the method for for example on partitioned server, carrying out the replacement of process execution zoning unit of failover and interpolation in mode controlled and order.The example of this method form with functional flow diagram in Fig. 6-12 illustrates.In the functional flow diagram shown in Fig. 6-12, except that using the frame of " (G) " mark, action is to be carried out by local operating system's (promptly controlling the operation system example of a subregion).Comprise by such as the performed action of global management entity such as SP or BMC with the frame of " (G) " mark.
A functional flow diagram of an exemplary process for performing the replacement or addition of a partition unit is shown in Figure 6.
The method shown in Figure 6 for performing the replacement and addition of partition units begins at block 600, where the local operating system uses predictive analysis to detect a failing partition unit based on a hardware ID. The hardware ID can be, for example, the Advanced Programmable Interrupt Controller ID (APIC ID) of a processor. At decision block 602, a test is made to determine whether the new partition unit is intended for replacement. If the new partition unit is intended for replacement, control flows to block 604. If the new partition unit is not intended for replacement, i.e., is intended for addition, control flows to subroutine 606. At block 604, the global management entity maps the failing partition unit to physical hardware. Control then flows to subroutine 606, which initiates the addition or replacement of the partition unit. The details of initiating the addition or replacement of a partition unit are shown in Figure 7 and described below. At decision block 608, a test is again made to determine whether the new partition unit is intended for replacement. If the new partition unit is not intended for replacement, i.e., is intended for addition, control flows to block 612. At block 612, the local operating system adds the partition unit to the local operating system, and the method then ends. If it is determined at decision block 608 that the new partition unit is intended for replacement, control flows to block 610. At block 610, the local operating system prepares to add the new partition unit to the local operating system. For example, data structures can be set up in memory to record the APIC ID of the failing processor, to record the size and starting location of the failing memory block, or to remap the failing memory block to another memory block. At block 614, the local operating system initiates the replacement in the local operating system by sending the global management entity a signal indicating the installation of the replacement partition unit. After block 614, control flows to subroutine 616. Subroutine 616 is used to passively migrate to the replacement partition unit. The details of passively migrating to the replacement partition unit are shown in Figure 8 and described below. After subroutine 616 is executed, control flows to subroutine 618. Subroutine 618 is used to actively migrate to the replacement partition unit. The details of actively migrating to the replacement partition unit are shown in Figure 9 and described below. After subroutine 618 is executed, control flows to subroutine 620. Subroutine 620 is used to clean up after the replacement of the failing partition unit. The details of how the clean-up is performed are shown in Figure 10 and described below. After subroutine 620 is executed, the method ends.
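The control flow of Figure 6 can be summarized in a few lines. The sketch below is only a reading aid under the assumption that each block can be represented by its number; the function name `partition_unit_flow` is hypothetical, and the function returns the order of blocks visited rather than performing any hardware action.

```python
def partition_unit_flow(for_replacement):
    """Return the Figure 6 block numbers visited, in order (illustrative only)."""
    steps = ["600"]                 # detect failing partition unit
    if for_replacement:             # decision block 602
        steps.append("604")         # (G) map failing unit to physical hardware
    steps.append("606")             # initiate addition/replacement (Figure 7)
    if not for_replacement:         # decision block 608
        steps.append("612")         # add the partition unit; method ends
        return steps
    steps += [
        "610",                      # prepare to add the new partition unit
        "614",                      # signal the global management entity
        "616",                      # passive migration (Figure 8)
        "618",                      # active migration (Figure 9)
        "620",                      # clean-up (Figure 10)
    ]
    return steps
```

For an addition the path is short (600, 606, 612); only the replacement path reaches the migration and clean-up subroutines.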
The details of subroutine 606, i.e., initiating the addition or replacement of a partition unit, are shown in Figure 7. The subroutine begins at block 740, where the global management entity selects the partition unit to be added from a global partition unit pool. The global partition unit pool is a set of partition units from which the global management entity selects replacement partition units. Using hardware partitioning, the global management entity selects which partition units are available to each local operating system. In the case of a replacement, the global management entity selects a replacement partition unit that has at least the capabilities of the failing partition unit. An exemplary capability of a partition unit containing memory is the size of the memory. An exemplary capability of a partition unit containing a processor is the number of cores in the processor. The replacement partition unit can be an idle spare, or it can be a partition unit in use by a different, perhaps less important, local operating system. The less important local operating system can be shut down, and its resources (i.e., partition units) can then be used as spares. At block 744, the local operating system initiates the partition unit into the local operating system. At block 746, the global management entity brings the new partition unit into the local operating system partition. In particular, SP 102 remaps the new partition unit in routing table 106. At block 748, the local operating system discovers the new partition unit in the local operating system partition. After block 748, the subroutine ends.
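The selection rule in block 740 (a replacement with at least the capabilities of the failing unit) might look like the sketch below. The `PartitionUnit` fields and the tie-breaking policy, which prefers idle spares and then the smallest adequate unit, are assumptions for illustration; the text only requires that the replacement be at least as capable.

```python
from dataclasses import dataclass


@dataclass
class PartitionUnit:
    name: str
    memory_bytes: int
    cores: int
    in_use: bool = False  # True if another local operating system is using it


def select_replacement(pool, failing):
    """Pick a unit from the pool with at least the failing unit's capabilities."""
    candidates = [
        unit for unit in pool
        if unit.memory_bytes >= failing.memory_bytes and unit.cores >= failing.cores
    ]
    if not candidates:
        return None
    # Assumed policy: idle spares first, then the smallest adequate unit.
    return min(candidates, key=lambda u: (u.in_use, u.memory_bytes, u.cores))


failing_unit = PartitionUnit("failing", memory_bytes=8 << 30, cores=4)
pool = [
    PartitionUnit("small-spare", memory_bytes=4 << 30, cores=4),
    PartitionUnit("busy-unit", memory_bytes=8 << 30, cores=4, in_use=True),
    PartitionUnit("big-spare", memory_bytes=16 << 30, cores=8),
]
chosen = select_replacement(pool, failing_unit)
```

Here the idle `big-spare` is chosen over `busy-unit`, so no other local operating system has to be shut down.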
The details of subroutine 616, i.e., passively migrating to the replacement partition unit, are shown in exemplary form in the flow diagram of Figure 8. The purpose of both passive and active migration is to transfer as much information as possible from the failing partition unit to the replacement partition unit without shutting down or disturbing ongoing high-level applications. For example, an application may have mapped, i.e., requested and received, a plurality of memory blocks. It is possible that the application is not using (i.e., modifying) certain memory blocks in the plurality. As long as a memory block is not modified, its contents can be transferred to a replacement memory block without disturbing the application. Another memory transfer strategy is to transfer as much state as possible on the assumption that most memory blocks will not be modified. One way to determine whether a memory block has been modified is to check the modified flag of the virtual memory page table entry for each page of physical memory in the memory block. If the modified flag is not set, the memory block has not been modified and is therefore in a transferable state. Some methods of transferring memory contents are more efficient than others. For example, a processor must transfer the contents of a memory cell into a register or cell within the processor and then into the new memory cell. Typically, a processor is limited to the maximum data transfer width of its registers, for example 64 bits. Dedicated memory transfer devices, such as direct memory access (DMA) processors, can transfer memory blocks in larger "chunks" and usually more quickly. "Background engines" such as DMA processors require little or no processor intervention to transfer the contents of memory blocks. Preferably, a driver model enables the modified flag checking and contents transfer to be done in an optimal manner.
An exemplary flow diagram of the subroutine for passively migrating to the replacement partition unit is shown in Figure 8. The subroutine begins at block 800, where memory is transferred using modified flags to track changed memory. At block 802, the local operating system begins tracking the blocks that have been modified by checking the modified flags of the memory pages (i.e., blocks). At block 804, the contents of the target (i.e., failing) partition unit's memory are transferred to the replacement partition unit's memory.
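The modified-flag bookkeeping can be illustrated with a toy model. This sketch assumes a `Page` object whose `modified` attribute plays the role of the page table entry's modified flag; in a real system the hardware sets that flag, and the names `passive_migrate` and `active_transfer` are hypothetical.

```python
class Page:
    def __init__(self, data):
        self.data = data
        self.modified = False

    def write(self, data):
        self.data = data
        self.modified = True  # stands in for the flag in the page table entry


def passive_migrate(source, target, writes_during_copy=()):
    """Copy every page while applications run; return indexes modified meanwhile."""
    for page in source:
        page.modified = False                  # begin tracking changes
    for index, page in enumerate(source):
        target[index] = page.data              # bulk transfer of contents
    for index, data in writes_during_copy:     # applications keep doing work
        source[index].write(data)
    return [i for i, page in enumerate(source) if page.modified]


def active_transfer(source, target, modified_indexes):
    """Later, with the system paused, move only the pages whose flag is set."""
    for index in modified_indexes:
        target[index] = source[index].data
        source[index].modified = False


source_pages = [Page("a"), Page("b"), Page("c")]
target_pages = [None, None, None]
still_modified = passive_migrate(source_pages, target_pages, [(1, "B")])
```

Only page 1 remains to be moved afterwards, which is why the later paused phase can be short.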
Notice that although in the action of execution graph 6 in the frame shown in Figure 8, even when preparing to replace the zoning unit that breaks down in local operating system, the advanced application of operating still can carry out useful work in this local operating system.This is not action performed in subroutine 618, promptly moves the situation of replacing zoning unit on one's own initiative.On the contrary, we can say that this process enters " key component ", wherein disapprove any activity of replacing the necessary action except that finishing.This process is shown in the exemplary functional flow diagram of this key component in Fig. 9 of i.e. active migration replacement zoning unit.Preferably, minimizing the time that is spent in this key component avoids the long-range connection of the application program moved in this local operating system to perceive this active migration.This subroutine begins at subroutine 900 places, and wherein local operating system makes this system " pause ".Make the details of system-down shown in Figure 11 and in following description.In brief, when being paused, stop to interrupt stoping I/O equipment and other processor to interrupt the processor that is replaced, and stop memory modification in system.Continue at frame 902 places of Fig. 9, local operating system shifts modified storer, i.e. 
the content of transfer register piece and modified logo is set.At frame 904 places, the atomic update of global management entity execute store controller route.At frame 906 places, local operating system preserves the state of the processor that breaks down.At frame 908 places, the processor that local operating system stops to break down.This processor that breaks down is still in subregion, although and be closed and guarantee that the processor that still this breaks down in subregion can not be harmful to.At frame 910 places, local operating system is applied to replace processor with the processor state that breaks down.With the processor state that breaks down be applied to replace processor can comprise can be internally and the register of external reference to what replace that processor shifts the processor that breaks down; The APIC ID that will replace processor atomically changes into the APIC ID of the processor that breaks down; And change interrupt-descriptor table so that will on the replacement processor, trigger in the interruption that triggers on the processor that breaks down.At frame 912 places, local operating system's update system interruption status reflects newly (promptly replacing) processor.That is, revise that the global interrupt state makes the external device access interrupt handler but not the processor that breaks down.At frame 912 places, the subroutine that the active migration of this process is replaced zoning unit finishes.
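Blocks 906 through 912 can be pictured with a small model. The sketch below is hypothetical throughout: a `Processor` is a plain record, `interrupt_targets` is a dictionary standing in for the interrupt routing state, and nothing here touches a real APIC.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    apic_id: int
    registers: dict = field(default_factory=dict)
    running: bool = True


def replace_processor(failing, replacement, interrupt_targets):
    """Move state and identity from the failing processor to the replacement."""
    saved = dict(failing.registers)        # block 906: save failing processor state
    failing.running = False                # block 908: stop the failing processor
    replacement.registers = saved          # block 910: apply the saved state
    replacement.apic_id = failing.apic_id  # replacement takes over the APIC ID
    for vector, target in interrupt_targets.items():
        if target is failing:              # block 912: retarget interrupts
            interrupt_targets[vector] = replacement
    return replacement
```

Because the replacement takes over the failing processor's APIC ID, software that addresses the processor by that ID needs no change.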
The details of subroutine 620 shown in Figure 6, i.e., the clean-up, are shown in the exemplary flow diagram of Figure 10. The clean-up subroutine begins at subroutine 1002, where the system is unquiesced. The details of unquiescing the system are shown in Figure 12 and described below. At block 1004 of Figure 10, the global management entity (SP 102 as shown in Figure 1) is notified that the quiescing has ended. At block 1008, the failing partition unit is physically removed. Removing the failing partition unit can involve mechanically removing various physical devices, or it can involve electrically isolating various physical devices. If, for example, a processor is plugged into an electrically connected socket, the processor can be "physically removed" either by shutting off power to the socket into which the processor is plugged or by removing the processor from the socket. After block 1008, the subroutine ends.
Although the activity in block 1008, i.e., removing the physical devices of the failing partition unit, can be regarded as optional, it is preferable. A failing physical device is still entered in the routing tables of the partition unit. Hence, under certain circumstances, the failing physical device may disturb other components in the system.
The details of subroutine 900 shown in Figure 9, i.e., quiescing the system, are shown in the exemplary flow diagram of Figure 11. The term "quiesce" means to place the system in a state of inactivity. Quiescing the system provides a safe environment for atomic replacements. At block 1102 shown in Figure 11, the local operating system selects a controlling processor, i.e., the processor that will control the quiescing activities. The controlling processor executes a set of instructions that accomplish the system quiescing and the active migration. There are various algorithms for selecting a controlling processor. For example, the least busy processor with the lowest number that is not being replaced can be selected as the controlling processor. Another controlling processor candidate can be the processor that started the replacement process. It is also possible, but usually not optimal, to have multiple controlling processors. Hence, a single controlling processor should be construed as exemplary and not limiting.
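One of the selection heuristics mentioned above can be sketched as follows. The interpretation (least busy first, lowest processor number as the tie-breaker, processors being replaced excluded) and the function name `select_controlling_processor` are assumptions; the text leaves the exact algorithm open.

```python
def select_controlling_processor(load_by_cpu, being_replaced):
    """Pick the least busy processor, lowest number on ties, skipping replaced ones."""
    eligible = {
        cpu: load for cpu, load in load_by_cpu.items() if cpu not in being_replaced
    }
    if not eligible:
        raise ValueError("no processor is available to control the replacement")
    return min(eligible, key=lambda cpu: (eligible[cpu], cpu))
```

For example, with loads `{0: 0.1, 1: 0.1, 2: 0.5, 3: 0.05}` and processor 3 being replaced, processor 0 is chosen.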
The controlling processor carries out the remaining actions in the quiescing subroutine. At block 1104, the controlling processor stops all interrupts, i.e., stops physical devices from interrupting the processor that needs to be replaced. Preferably, the physical devices are quiesced. The controlling processor communicates with the device drivers that control the physical devices. To prevent the physical devices from triggering interrupts, the controlling processor can send a stop, sleep, or suspend signal to the device drivers. The same signals can be used to prevent memory accesses. Preferably, the system can be quiesced without the device drivers having to be modified and/or recompiled. At block 1106, the controlling processor stops all direct memory access. The device drivers are prevented from writing to files and performing DMA. The device drivers can queue requests for interrupts and DMA. There are edge-triggered and level-triggered interrupts. Level-triggered interrupts can be queued. If an edge-triggered interrupt is not serviced immediately, the interrupt is lost.
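The difference between the two interrupt types while the system is paused can be shown with a toy controller. This is a simplified sketch under the assumption that a level-triggered interrupt stays pending until serviced while an unserviced edge-triggered interrupt is simply dropped; the class and attribute names are hypothetical.

```python
class QuiescedInterruptController:
    def __init__(self):
        self.quiesced = False
        self.pending = []   # level-triggered interrupts held while quiesced
        self.serviced = []
        self.lost = []

    def raise_interrupt(self, vector, level_triggered):
        if not self.quiesced:
            self.serviced.append(vector)   # normal operation: service at once
        elif level_triggered:
            self.pending.append(vector)    # can be queued for later
        else:
            self.lost.append(vector)       # edge-triggered and not serviced: lost

    def unquiesce(self):
        self.quiesced = False
        self.serviced.extend(self.pending)
        self.pending.clear()
```

This is one reason to keep the paused period short: edge-triggered events that arrive during it cannot be replayed.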
Continuing with Figure 11, at block 1108 the controlling processor stops activity in all devices. Preferably, the devices avoid modifying the memory that is being used by the processor being replaced and avoid modifying the state of the processor being replaced. Processor state includes the registers and memory in the processor itself and the memory external to the processor that is allocated exclusively for storing processor state. Broadly speaking, communication with the partition unit that is going to be replaced is stopped. At block 1110, the controlling processor halts all applications by sending each application a signal indicating that a pause in operation is occurring. At block 1112, the controlling processor is used to "rendezvous" all other processors in the system. In a rendezvous, the controlling processor causes the other processors to stop accessing the partition unit being replaced, i.e., the failing partition unit. After block 1112, the subroutine ends. If there are partition units that need to be added, as opposed to replacing other partition units, the additional partition units can be added after the quiescing subroutine.
In a rendezvous (also known as corralling), the controlling processor causes the other processors to stop accessing the partition unit being replaced by sending an interprocessor interrupt (IPI) command to the other processors. The IPI indicates to the other processors that they should spin on a common barrier. That is, they stop doing application work and spin on the barrier until the barrier changes to indicate that application work should restart. Having the processors that are running applications spin on a barrier prevents the applications from interfering with the replacement without the applications having to be explicitly halted. Preferably, applications are given an opportunity to respond to the existence of a pause in a way consistent with each application's purpose. Even if an application does not respond to the existence of a pause, the application is automatically prevented from interfering with the replacement once the processor running the application is rendezvoused.
In an exemplary instance of spinning on a barrier, each processor executes the same set of instructions, which ensures that the processor is not executing other instructions. The instructions direct the processor to read an address and, if the contents at that address are not zero, to read the address again. When the controlling processor sets the contents of the address to zero, each processor steps past the instruction set and goes back to what it was doing before spinning on the barrier. While the processors are spinning on the barrier, the controlling processor is able to transfer the state that could not be transferred during the passive migration stage and is able to transfer the modified memory.
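The read-until-zero loop can be imitated with threads. This is only an analogy under stated assumptions: Python threads stand in for processors, a one-element list stands in for the shared address, and the helper `run_rendezvous` is hypothetical. Real firmware or kernel code would spin on a memory location in response to an IPI rather than use the `threading` module.

```python
import threading
import time


def run_rendezvous(cpus):
    """Spin the given (hypothetical) processors on one barrier, then release them."""
    barrier = [1]                       # stands in for the shared address
    arrived = threading.Semaphore(0)
    resumed = []

    def other_processor(cpu):
        arrived.release()               # report that this processor has arrived
        while barrier[0] != 0:          # read the address; if not zero, read again
            time.sleep(0)               # yield so the sketch stays cheap to run
        resumed.append(cpu)             # step past the loop and resume prior work

    threads = [threading.Thread(target=other_processor, args=(cpu,)) for cpu in cpus]
    for thread in threads:
        thread.start()
    for _ in threads:
        arrived.acquire()               # controlling processor waits for everyone
    transferred = "state and modified memory moved while the others spin"
    barrier[0] = 0                      # release the barrier
    for thread in threads:
        thread.join()
    return sorted(resumed), transferred
```

Calling `run_rendezvous([1, 2, 3])` releases all three workers only after the controlling thread has done its transfer step.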
In a typical rendezvous, there may be multiple stages, each requiring a barrier. For example, in a first stage, the controlling processor sets a first barrier for the other (i.e., non-controlling) processors. While the other processors spin on that barrier, the controlling processor executes code to set up the data structures in which state is to be saved. The controlling processor then releases the first barrier and instructs the other processors to save state. The controlling processor sets a second barrier for a second stage. When the other processors have followed the instructions to save state, they spin on the second barrier. At an appropriate time, for example when all of the other processors have saved state, the controlling processor releases the second barrier and instructs the other processors to go offline.
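The two-stage sequence can be sketched with two release signals. As an assumption for readability, `threading.Event` is used in place of a spin loop, so the workers block rather than spin; `run_two_stage_rendezvous` and the saved-state strings are hypothetical.

```python
import threading


def run_two_stage_rendezvous(cpus):
    """Stage 1: workers wait, then save state. Stage 2: workers wait, then go offline."""
    first_barrier = threading.Event()
    second_barrier = threading.Event()
    state_saved = threading.Semaphore(0)
    saved_state = None
    offline = []

    def other_processor(cpu):
        first_barrier.wait()                      # held at the first barrier
        saved_state[cpu] = f"state of processor {cpu}"
        state_saved.release()
        second_barrier.wait()                     # held at the second barrier
        offline.append(cpu)

    threads = [threading.Thread(target=other_processor, args=(cpu,)) for cpu in cpus]
    for thread in threads:
        thread.start()
    saved_state = {}                              # controller sets up the data structure
    first_barrier.set()                           # release stage 1: save state
    for _ in threads:
        state_saved.acquire()                     # wait until every processor has saved
    second_barrier.set()                          # release stage 2: go offline
    for thread in threads:
        thread.join()
    return saved_state, sorted(offline)
```

The controller creates `saved_state` before releasing the first barrier, so no worker can write into it before it exists.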
The details of subroutine 1002 shown in Figure 10, i.e., unquiescing the system, are shown in the exemplary flow diagram of Figure 12. Unquiescing a system is basically the inverse of quiescing a system. Unquiescing begins at block 1202, where the controlling processor is used to un-rendezvous, i.e., release, all other processors in the system. At block 1204, all applications are restarted. More specifically, the processors become available to schedule activity from the applications, because the process is transparent to the applications themselves. At block 1206, activity in all devices is restarted. At block 1208, all direct memory access is restarted. At block 1210, all interrupts are restarted. After block 1210, the subroutine ends.
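Because the restart order of Figure 12 mirrors the stop order of Figure 11, the pair can be modeled as a stack of undo actions. The sketch below assumes each activity is just a label; `QuiesceLog` and `QUIESCE_ORDER` are hypothetical names.

```python
# Stop order taken from blocks 1104-1112 of Figure 11.
QUIESCE_ORDER = [
    "interrupts",
    "direct memory access",
    "device activity",
    "applications",
    "other processors",
]


class QuiesceLog:
    def __init__(self):
        self._stopped = []   # stack of activities stopped so far
        self.trace = []

    def stop(self, activity):
        self.trace.append(f"stop {activity}")
        self._stopped.append(activity)

    def unquiesce(self):
        # Restart in the reverse of the order in which things were stopped.
        while self._stopped:
            self.trace.append(f"restart {self._stopped.pop()}")


log = QuiesceLog()
for activity in QUIESCE_ORDER:
    log.stop(activity)
log.unquiesce()
```

The resulting trace ends with the other processors being released first and interrupts being restarted last, matching blocks 1202 through 1210.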
The process illustrated in Figures 6-12 and described above allows a local operating system to replace partition units without having to be recompiled for particular devices. The process can be implemented to run on the equipment of most computer manufacturers, provided the equipment supports partitions and partition units. Firmware can be written with sufficient software "hooks" to allow the details of specific hardware to be abstracted out, thereby avoiding the effort and expense of writing firmware for each specific piece of equipment. Firmware may be required, but the implementation of the process in the local operating system does not need to be recompiled, repackaged, or redistributed.
While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.

Claims (20)

1. A method of managing spare partition units in a partitionable computing device by a global management entity, the method comprising:
determining whether a spare partition unit is required for addition or replacement in a local operating system; and
if a spare partition unit is required for addition, initiating an addition of a spare partition unit, otherwise:
(a) initiating a replacement of a failing partition unit with a spare partition unit;
(b) passively migrating the spare partition unit into the partition of the failing partition unit;
(c) actively migrating the spare partition unit into the partition of the failing partition unit; and
(d) cleaning up the partitionable computing device.
2. the method for claim 1 is characterized in that, the interpolation that starts spare partition unit comprises:
(a) described global management entity is selected the spare partition unit that will add from global pool;
(b) described global management entity starts selected spare partition unit in described global management entity;
(c) start selected spare partition unit in the subregion (" local operating system's subregion ") of described local operating system in described local operating system;
(d) described global management entity is brought selected spare partition unit in the described local operating system subregion into; And
When (e) finding selected spare partition unit in the described local operating system subregion in described local operating system, described local operating system adds selected spare partition unit in the described local operating system subregion to.
3. the method for claim 1 is characterized in that, starts with the replacement of spare partition unit to the zoning unit that breaks down to comprise:
(a) when detecting the zoning unit that breaks down, described global management entity is mapped to physical hardware devices with the described zoning unit that breaks down;
(b) described global management entity is selected standby replacement zoning unit from global pool;
(c) described global management entity starts described spare partition unit to described global management entity;
(d) described local operating system starts described spare partition unit in described local operating system;
(e) described global management entity is brought described spare partition unit in the subregion (" described local operating system subregion ") in the described local operating system into; And
When (f) finding described spare partition unit in the described local operating system subregion in described local operating system, described local operating system prepares described spare partition unit is added in the described local operating system subregion.
4. The method of claim 3, wherein the detection that a partition unit is failing is determined by predictive analysis based on a hardware ID of the partition unit.
5. the method for claim 1 is characterized in that, passively described spare partition unit is moved in the subregion of the described zoning unit that breaks down to comprise:
(a) described local operating system uses the part of the change of the storer of the described subregion that breaks down of modified logo tracking, shifts the storer of the described subregion that breaks down; And
(b) atomic update of described global management entity execute store control routing table.
6. the method for claim 1 is characterized in that, on one's own initiative described spare partition unit is moved in the subregion of the described zoning unit that breaks down to comprise:
(a) but described global management entity is paused described subregion computing equipment;
(b) part of the change of the storer of the described subregion that breaks down of described local operating system's transfer;
(c) atomic update of described global management entity execute store controller routing table;
(d) described local operating system arrives the state transitions of the processor of the described zoning unit that breaks down the processor of described spare partition unit;
(e) described local operating system changes the system break state of described local operating system; And
(f) described local operating system stops the processor of the described zoning unit that breaks down.
7. The method of claim 6, wherein cleaning up the partitionable server comprises:
(a) the global management entity unquiescing the partitionable server;
(b) the local operating system notifying the global management entity of the replacement; and
(c) logically removing the processor of the failing partition unit.
8. The method of claim 7, further comprising physically removing the processor of the failing partition unit.
9. The method of claim 8, wherein physically removing the processor of the failing partition unit is done by the local operating system electrically isolating the processor of the failing partition unit.
10. The method of claim 1, wherein the failing partition unit is replaced without recompiling computer-executable instructions.
11. A computer-readable medium containing computer-executable instructions for managing spare partition units in a partitionable computing device, the partitionable computing device comprising a global management entity and a plurality of local operating systems, wherein the computer-executable instructions, when executed, cause the global management entity and certain of the plurality of local operating systems to:
(a) determine whether a spare partition unit is required for addition or replacement in a local operating system;
(b) if a spare partition unit is required for addition, initiate an addition of a spare partition unit; otherwise:
(i) initiate a replacement of a failing partition unit with a spare partition unit;
(ii) passively migrate the spare partition unit into the partition of the failing partition unit;
(iii) actively migrate the spare partition unit into the partition of the failing partition unit; and
(iv) clean up the partitionable computing device.
12. The computer-executable instructions of claim 11, wherein initiating an addition of a spare partition unit comprises:
(a) selecting the spare partition unit to be added from a pool;
(b) initiating the selected spare partition unit into the global management entity;
(c) initiating the addition of the selected spare partition unit into a partition in the local operating system ("the local operating system partition");
(d) bringing the selected spare partition unit into the local operating system partition; and
(e) adding the selected spare partition unit to the local operating system partition.
13. The computer-readable medium of claim 11, wherein initiating a replacement of a failing partition unit with a spare partition unit comprises:
(a) detecting a failing partition unit;
(b) selecting a replacement partition unit from a pool;
(c) initiating a replacement of the failing device in the global management entity;
(d) initiating a replacement of the failing device in the local operating system;
(e) bringing the spare partition unit into a partition in the local operating system ("the local operating system partition"); and
(f) preparing to add the spare partition unit to the local operating system partition.
14. The computer-readable medium of claim 13, wherein detecting the failing partition unit is determined using predictive analysis.
15. The computer-readable medium of claim 11, wherein passively migrating the spare partition unit into the partition of the failing partition unit comprises:
(a) transferring the memory of the failing partition, using modified flags to track the changed portions of the memory of the failing partition; and
(b) performing an atomic update of a memory controller routing table.
16. The computer-readable medium of claim 11, wherein actively migrating the spare partition unit into the partition of the failing partition unit comprises:
(a) quiescing the partitionable server;
(b) transferring the changed portions of the memory of the failing partition;
(c) performing an atomic update on a memory controller routing table;
(d) transferring the state of the processor of the failing partition unit to the processor of the spare partition unit;
(e) changing the system interrupt state of the local operating system; and
(f) stopping the processor of the failing partition unit.
17. The computer-readable medium of claim 16, wherein cleaning up the partitionable server comprises:
(a) unquiescing the partitionable server;
(b) notifying the global management entity of the replacement;
(c) logically removing the processor of the failing partition unit; and
(d) physically removing the processor of the failing partition unit.
18. The computer-readable medium of claim 17, wherein physically removing the processor of the failing partition unit is done by electrically isolating the processor of the failing partition unit.
19. The computer-readable medium of claim 17, wherein physically removing the processor of the failing partition unit is done by removing the processor of the failing partition unit from the socket of that processor.
20. The computer-readable medium of claim 11, wherein the failing partition unit is replaced without recompiling computer-executable instructions.
CNA2007800429496A 2006-11-21 2007-11-20 Replacing system hardware Pending CN101542432A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US86681506P 2006-11-21 2006-11-21
US60/866,815 2006-11-21
US60/866,821 2006-11-21
US60/866,817 2006-11-21
US11/675,272 2007-02-15

Publications (1)

Publication Number Publication Date
CN101542432A true CN101542432A (en) 2009-09-23

Family

ID=41124161

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2007800429496A Pending CN101542432A (en) 2006-11-21 2007-11-20 Replacing system hardware

Country Status (1)

Country Link
CN (1) CN101542432A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102262576A (en) * 2011-07-29 2011-11-30 成都易我科技开发有限责任公司 Method for operating partition in use by using current system
CN102262576B (en) * 2011-07-29 2013-04-10 成都易我科技开发有限责任公司 Method for operating partition in use by using current system
CN108027754A (en) * 2015-08-13 2018-05-11 高通股份有限公司 Memory sub-system reduces system downtime during safeguarding in computer processing system
CN108027754B (en) * 2015-08-13 2022-09-02 高通股份有限公司 Computer processing system and method for facilitating maintenance of a computer processing system
CN106997315A (en) * 2016-01-25 2017-08-01 阿里巴巴集团控股有限公司 A kind of method and apparatus of core dump for virtual machine
CN106997315B (en) * 2016-01-25 2021-01-26 阿里巴巴集团控股有限公司 Method and device for memory dump of virtual machine

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090923