Background Art
In fields such as communication and storage, where requirements for system availability and reliability are very high, a system made up of many working nodes generally includes nodes that are paired to be redundant with each other. The number of working nodes in the system is generally even, and communication channels that carry user services run between the nodes. In some systems, the two nodes of a redundant pair are also connected by a highly available low-data-rate channel. This low-level data channel is simple in structure and cheap to implement; common examples are an RS-232 serial port or a JTAG debug port.
While the system is running, a working node may be upgraded in hardware, for example by replacing an existing working board with a higher-performance board; this is called a hard upgrade. The other kind is a soft upgrade of the node: after a working node has been put into service, its code may need to be upgraded during its operating life. The upgraded code may be the program code the system runs, or the logic code of various programmable devices. This document concerns the soft upgrade of nodes.
For a system made up of multiple redundant pairs as described above, if the whole system needs to be upgraded, the paired nodes cannot be upgraded at the same time but must be upgraded one after the other. This is because, in fields such as communication or storage, the user services carried by a redundant pair cannot be interrupted casually, or cannot be interrupted at all. If the first node of a redundant pair fails to upgrade, the second node can no longer risk being upgraded.
Each working node in a redundant pair is a microcontroller system that generally includes, but is not limited to, the following parts: a CPU for processing data; various memories that store the system's operating code; communication interfaces for sending and receiving data; a basic input/output system (BIOS); a management interface for interacting with user terminals; and various programmable logic devices (PLDs) that collect board information and handle low-level reset and interrupt signals.
For a node in a redundant pair, the existing upgrade method is as follows: the user sends the upgrade code to the CPU through the management interface and issues an upgrade command. The CPU parses the user command and writes the code, in a format the specific device accepts, into the memory or logic device through the corresponding programming interface.
As this process shows, the upgrade of a node is carried out entirely by the node's own CPU.
In the above node upgrade method, a problem with a target device such as a memory or programmable device may lead the CPU to confirm that the code has been fully written when the code actually written is incomplete or contains errors. In that case, once the CPU considers the upgrade complete and restarts normally, the board hangs. Moreover, such target devices play a very important role in the system (a PLD, for example, controls key functions such as power-up and reset of the entire board), so a failed upgrade of such a device leaves the whole board unable to respond to user commands again.
Such a node upgrade failure means that the other node of the redundant pair can no longer risk being upgraded; otherwise the whole redundant pair might hang and interrupt user services.
To solve the above problem, another existing technique adds an error recovery controller to the board. The node upgrade method in this technique is the same as in the technique described above, but if the node hangs after restarting, the error recovery controller switches the system into a repair mode, refreshes the target device with default values, and restarts the board again, after which the system returns to normal. Although this prior art solves the problem that a low-level fault encountered during a board upgrade cannot be repaired by the board itself, it introduces a new controller module, which increases the cost of the system and, correspondingly, its complexity.
Detailed Description of the Embodiments
To make the purpose, technical solutions, and advantages of the present invention clearer, specific embodiments of the invention are described in detail below with reference to the accompanying drawings. The illustrative embodiments and their descriptions are intended to explain the present invention, not to limit it.
Embodiment one
This embodiment of the invention provides a method for upgrading a device in a paired redundant structure. As shown in Fig. 1, the method includes the following steps:
Step 110: the second redundant unit of the redundant pair writes upgrade code into the target device of the first redundant unit, its redundant partner, and notifies the first redundant unit to restart.
In this embodiment, the first redundant unit and the second redundant unit form a redundant pair. They may be boards that each constitute a working node, or mutually redundant modules within one board or node.
After the second redundant unit has written all of the upgrade code into the target device of the first redundant unit, it notifies the first redundant unit to restart so that the upgrade takes effect.
In this embodiment, the target device may be a system module such as the BIOS, or a programmable logic device (PLD), including a complex programmable logic device (CPLD).
Step 120: check whether the target device of the first redundant unit was upgraded successfully.
Normally, the first redundant unit runs the new code version after restarting. However, if the written code is incomplete or contains errors, as described for the prior art, the second redundant unit will believe the upgrade succeeded when the target device was in fact not upgraded successfully, and the first redundant unit hangs as soon as it restarts. Therefore, after notifying the first redundant unit to restart, the second redundant unit checks within a set time whether the target device of the first redundant unit was upgraded successfully. The set time may be the maximum time the system needs for a normal restart, although the invention is not limited to this.
Step 130: if the upgrade was unsuccessful, the second redundant unit writes the version of the code that the target device held before the upgrade back into the target device of the first redundant unit, thereby recovering the target device.
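Steps 110 to 130 can be sketched as a small routine seen from the second redundant unit. This is only an illustration under stated assumptions: `TargetDevice`, `upgrade_with_rollback`, and the `restart_and_check` callback are hypothetical names, and the device is modelled as an in-memory image rather than a real BIOS or PLD.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TargetDevice:
    """Hypothetical stand-in for a BIOS or PLD on the first redundant unit."""
    image: bytes


def upgrade_with_rollback(
    device: TargetDevice,
    new_image: bytes,
    restart_and_check: Callable[[TargetDevice], bool],
) -> bool:
    """Run steps 110-130 from the second redundant unit's point of view.

    Returns True if the upgrade took effect, False if the old image was restored.
    """
    old_image = device.image        # keep the pre-upgrade version for step 130
    device.image = new_image        # step 110: write the upgrade code
    if restart_and_check(device):   # steps 110/120: restart, then check within the set time
        return True
    device.image = old_image        # step 130: write the old version back
    return False
```

The key design point the sketch tries to show is that the pre-upgrade image is held by the peer unit, so a rollback does not depend on the upgraded unit being able to run.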
The above upgrade method separates ownership of a device's service function from ownership of its maintenance function in a paired redundant structure. In the prior art, both the service function and the maintenance function of a device in a redundant unit generally belong to the controller of that redundant unit. In this embodiment, the service function of a device in each redundant unit belongs to the controller of that unit, while the maintenance function of the device belongs to the other redundant unit of the pair, for example to that unit's controller. The devices in the first redundant unit therefore only serve that unit's services; if a soft fault in these devices causes the system to behave abnormally, the second redundant unit can perform maintenance on the devices of the first redundant unit. This improves the reliability of system upgrades and reduces implementation complexity without adding extra cost.
Embodiment two
This embodiment provides another method for upgrading a device in a paired redundant structure. Here the paired redundant structure is a pair of boards, called node A and node B. The board to be upgraded is called the local board, namely node A, and the other board is called the peer board, namely node B.
Fig. 2 is a schematic structural diagram of the peer board upgrading the local board within a redundant pair in this embodiment. The directed arrows within and between the nodes in Fig. 2 indicate the data flow during a board upgrade; the data flow during normal board operation is not shown. As Fig. 2 shows, the programming channel for devices such as the BIOS and PLD on node A is actually provided and controlled by node B, while under normal conditions these devices serve node A. Therefore, in terms of service function, devices such as the PLD on node A belong to node A; in terms of the soft upgrade function, that is, the maintenance function, they belong to node B.
As shown in Fig. 3, the soft upgrade flow for a target device in node A in this embodiment includes the following steps:
Step 310: node B remotely receives the code data package used to upgrade the target device of node A. The package may contain the code of the target device's current version and the code of the new version, that is, the upgrade code.
This step is performed when node B does not already store the code data for node A's target device to be upgraded. In that case, the user can access node B remotely through its management interface and upload the target device's upgrade code data package to node B by FTP or another method.
If node B already stores the code data for node A's target device to be upgraded, this step is omitted.
Step 320 (optional): when the target device of node A is to be upgraded, the user may remotely access node A and issue it a notification to shut down its services.
In this embodiment, the target device may be a system module such as the BIOS, or a programmable logic device (PLD), including a complex programmable logic device (CPLD).
Step 330: node B sends a message to upgrade the target device to node A through the communication interface between them, notifying node A that a target device on node A, such as a PLD (including a CPLD), is to be upgraded.
When the target device of node A is to be upgraded, the user can issue a command to upgrade node A's target device to node B through the management interface. The command is sent to node A through the communication interface between node B and node A, notifying node A that the target device on node A is to be upgraded.
Step 340: in response to the message from node B, node A shuts down the services it is running and, once all services have been shut down, sends an acknowledgement (ACK) message to node B.
The ACK message that node A sends to node B indicates that node A is ready for the upgrade.
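The notify-and-acknowledge exchange of steps 330 and 340 can be illustrated with a minimal sketch. The message names, the `NodeA` class, and `ready_to_write` are assumptions made for illustration; the text does not specify a message format or how services are stopped.

```python
from enum import Enum


class Msg(Enum):
    UPGRADE_NOTICE = "upgrade_notice"   # hypothetical message names
    ACK = "ack"


class NodeA:
    """Hypothetical model of the local board's handling of step 340."""

    def __init__(self, services):
        self.running = set(services)

    def stop_service(self, name):
        self.running.discard(name)

    def handle(self, msg):
        """On an upgrade notice, stop every running service, then acknowledge."""
        if msg is Msg.UPGRADE_NOTICE:
            for name in sorted(self.running):   # sorted() copies, so discarding is safe
                self.stop_service(name)
            return Msg.ACK
        return None


def ready_to_write(node_a):
    """Node B's view of steps 330-340: send the notice and wait for the ACK."""
    return node_a.handle(Msg.UPGRADE_NOTICE) is Msg.ACK
```

In this sketch node B proceeds to write the upgrade code only when `ready_to_write` returns True, mirroring the ordering of steps 340 and 350.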
Step 350: after node B receives the ACK message indicating that node A is ready, it writes the upgrade code into the target device of node A and notifies node A to restart.
Node B can write the upgrade code into the target device through its programming channel to node A, in the data format the target device accepts. For example, node B writes the upgrade code data into the CPLD of node A through a JTAG programming channel, in the data format that JTAG specifies.
After node B confirms that the upgrade code has been completely written into node A's target device, it notifies node A to restart so that the upgrade takes effect. For example, node B can verify through the programming channel, such as the JTAG programming channel, whether the upgrade code has been fully written and, once it confirms that it has, command node A through the communication interface to restart so that the upgrade takes effect.
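One way to picture the "write, then verify over the programming channel" part of step 350 is a read-back comparison. The sketch below is an assumption-laden illustration: `FakeProgrammingChannel` is a hypothetical in-memory stand-in, not a JTAG driver, and comparing SHA-256 digests is a choice made here, not something the text prescribes.

```python
import hashlib


class FakeProgrammingChannel:
    """Hypothetical in-memory stand-in for a JTAG-like programming channel."""

    def __init__(self, drop_after=None):
        self.stored = b""
        self.drop_after = drop_after    # when set, simulates a truncated write

    def write(self, data: bytes) -> None:
        self.stored = data if self.drop_after is None else data[: self.drop_after]

    def read_back(self) -> bytes:
        return self.stored


def write_and_verify(channel, image: bytes) -> bool:
    """Write the image, read it back, and compare digests before any restart."""
    channel.write(image)
    written = hashlib.sha256(channel.read_back()).digest()
    expected = hashlib.sha256(image).digest()
    return written == expected
```

A passing read-back only shows that the bytes arrived; as the following step explains, node B still checks after the restart whether node A actually comes up.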
Step 360: check whether node A was upgraded successfully.
Normally, node A runs the new code version after restarting. However, if the written upgrade code is incomplete or contains errors, node B may believe that node A was upgraded successfully when it was not, and node A hangs as soon as it restarts. Therefore, within a set time after notifying node A to restart, node B checks whether node A was upgraded successfully.
Specifically, checking within the set time whether node A was upgraded successfully may proceed as follows: node B uses a timeout strategy, starting a peer-board upgrade timeout timer set to, for example, 5 minutes, and sends a query command to node A every few seconds to query node A's state. If node A is running, it returns an acknowledgement (ACK) message to node B after receiving the query command. Normally, node A sends the ACK response to node B before the timeout timer expires, which proves that the upgrade succeeded. If node A does not reply within the time set by the timeout timer, node A has hung after restarting.
Node B therefore checks in this step whether node A was upgraded successfully as follows: if it receives a response message from node A within the time set by the timeout timer (for example, 5 minutes), the upgrade of node A succeeded. If node A does not respond within this time, the upgrade of the target device on node A did not succeed. The time set by the timeout timer may be the maximum time the system needs to shut down normally and restart into its normal operating state.
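The timeout strategy of step 360 amounts to polling with a deadline. The following sketch assumes a hypothetical `query` callback that returns True when node A answers with an ACK; the 5-second interval is an assumed value for "every few seconds", and the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time
from typing import Callable


def wait_for_ack(
    query: Callable[[], bool],
    timeout_s: float = 300.0,     # 5 minutes, the example value in the text
    interval_s: float = 5.0,      # "every few seconds"; the exact value is assumed
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll node A until it acknowledges or the timeout timer expires."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if query():               # node A answered the query with an ACK
            return True
        sleep(interval_s)
    return False                  # no reply in time: treat node A as hung
```

The same routine could be reused for the recovery check in step 390, since the text describes that check as using the same timeout strategy.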
Step 380: if the upgrade of the target device on node A was unsuccessful, node B writes the version the target device held before the upgrade back into the target device of node A, thereby recovering the target device.
For example, if node B confirms that the upgrade of the target device on node A was unsuccessful, it directly enters a forced target-device recovery procedure, writes the old pre-upgrade code back into the target device, and then forcibly resets node A.
Step 390 (optional): after resetting node A, node B may again use the timeout strategy, that is, start the timeout timer, to check whether the target device was recovered successfully.
Step 400: if the recovery was unsuccessful, step 380 is executed again to recover the target device. If node A still does not respond after several recovery attempts (for example, 2), step 410 is executed.
Step 410: node B may raise an alarm to the user through the management interface.
If the check in step 390 finds that the target device was recovered successfully, node B executes step 350 again to upgrade the target device in node A once more. In this case, step 370 is also included before step 380 to limit the number of upgrade attempts.
Step 370: determine whether the number of upgrade attempts is greater than a set value, for example 2. If it is, the repeated upgrade attempts (two in this example) have all failed, which indicates that the system has a fault that cannot be repaired in software. Node B can then send alarm information to the user through the management interface to notify technical personnel to carry out repairs.
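Taken together, steps 350 to 410 form a loop with two limits: one on upgrade attempts and one on recovery attempts. The sketch below is one possible reading, not the definitive flow: `write_image` and `restart_ok` are hypothetical callbacks, and the sketch raises the alarm once `max_upgrades` attempts have failed, which is an assumption about how the "set value" is counted.

```python
from typing import Callable


def upgrade_node_a(
    write_image: Callable[[bytes], None],   # hypothetical: program node A's target device
    restart_ok: Callable[[], bool],         # hypothetical: restart node A, True if it replies in time
    old_image: bytes,
    new_image: bytes,
    max_upgrades: int = 2,                  # assumed reading of the "set value" in step 370
    max_recoveries: int = 2,                # example value from step 400
) -> str:
    """Return "upgraded", or "alarm" when soft repair is exhausted."""
    upgrades = 0
    while True:
        write_image(new_image)              # step 350: write the upgrade code
        upgrades += 1
        if restart_ok():                    # step 360: did node A come back in time?
            return "upgraded"
        if upgrades >= max_upgrades:        # step 370: limit the upgrade attempts
            return "alarm"
        for _ in range(max_recoveries):     # steps 380-400: roll back, retrying if needed
            write_image(old_image)
            if restart_ok():                # step 390: recovery check
                break                       # recovered, so try the upgrade again
        else:
            return "alarm"                  # step 410: recovery never succeeded
```

Following the ordering in the text, the attempt limit of step 370 is evaluated before the recovery of step 380, so in this sketch an exhausted upgrade budget leads directly to the alarm.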
If node A is upgraded successfully, node B can then be upgraded by the same steps as above, except that the local board is now node B and the peer board is node A. The details are not repeated here.
The method of this embodiment, in which the peer board of a redundant pair upgrades the local board, separates ownership of a device's service function from ownership of its maintenance function within the redundant pair. The two functions, which normally both belong to one controller (such as the CPU of node A), now belong to two independent controllers on the local board and the peer board of the redundant pair (such as the CPU of node A and the controller of node B, respectively). The devices on the local board serve only the local board's services; if a soft fault in these devices causes the local board's system to behave abnormally, the controller on the other, normally working peer board can perform maintenance on the local board's devices. This can significantly improve the reliability of upgrades within the redundant pair while reducing implementation complexity and adding no extra cost.
Moreover, the device upgrade method of this embodiment is not limited to redundant pairs of boards; it is equally applicable to interoperation between modules that form a redundant pair on a single board, which can greatly improve the robustness of those modules.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium such as a ROM/RAM, magnetic disk, or optical disc.
Embodiment three
This embodiment provides an upgrade apparatus for a target device. The target device and the upgrade apparatus are located in the first redundant unit and the second redundant unit of a redundant pair, respectively. As shown in Fig. 4, the upgrade apparatus 400 includes:
A code writing unit 410, configured to write upgrade code into the target device and notify the first redundant unit to restart.
A checking unit 420, configured to check, within a set time after the first redundant unit restarts, whether the target device was upgraded successfully.
A recovery unit 430, configured to write the version of the code that the target device held before the upgrade back into the target device when the upgrade is unsuccessful, thereby recovering the target device.
In another embodiment of the invention, as shown in Fig. 5, the upgrade apparatus 400 further includes:
An acquisition unit 440, configured to obtain the pre-upgrade version of the target device's code and the upgrade code before the code writing unit writes the upgrade code into the target device.
A sending unit 450, configured to send a command to upgrade the target device to the first redundant unit before the code writing unit writes the upgrade code into the target device.
In another embodiment of the invention, the upgrade apparatus further includes:
A calling unit 460, configured to call the recovery unit to recover the target device when recovery of the target device is unsuccessful, until recovery succeeds or the number of recovery attempts reaches a set number;
An alarm unit 470, configured to send alarm information when recovery has still not succeeded after the set number of attempts.
In this embodiment, the alarm unit may also be connected to the checking unit and configured to send alarm information when several upgrades of the first redundant unit's target device have all been unsuccessful.
In another embodiment of the invention, as shown in Fig. 6, the checking unit 420 includes:
A query unit 610, configured to periodically send query messages to the first redundant unit after the first redundant unit restarts;
A confirmation unit 620, configured to confirm that the upgrade succeeded if a response message from the first redundant unit is received within the set time, and otherwise to confirm that the upgrade failed.
The upgrade apparatus of this embodiment separates ownership of a device's service function from ownership of its maintenance function in a paired redundant structure. In the prior art, both the service function and the maintenance function of a device in a redundant unit generally belong to the controller of that redundant unit. In this embodiment, the service function of a device in each redundant unit belongs to the controller of that unit, while the maintenance function of the device belongs to the other redundant unit of the pair, for example to that unit's controller. The devices in the first redundant unit therefore only serve that unit's services; if a soft fault in these devices causes the system to behave abnormally, the second redundant unit can perform maintenance on the devices of the first redundant unit. This reduces implementation complexity and significantly improves the reliability of upgrades within the redundant pair without adding extra cost.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.