CN103049344B - The method and apparatus of hardware plug fault-tolerant processing - Google Patents

Method and apparatus for fault-tolerant processing of hardware plugging and unplugging

Info

Publication number
CN103049344B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210504185.0A
Other languages
Chinese (zh)
Other versions
CN103049344A (en
Inventor
蒋凡璐
余博伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Shengri Machinery Equipment Manufacturing Co ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201210504185.0A priority Critical patent/CN103049344B/en
Publication of CN103049344A publication Critical patent/CN103049344A/en
Application granted granted Critical
Publication of CN103049344B publication Critical patent/CN103049344B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and an apparatus for fault-tolerant processing of hardware plugging and unplugging. The method comprises: after an abnormal interrupt occurs, determining whether the abnormal interrupt was caused by accessing a hardware resource; when it is determined that the abnormal interrupt was caused by accessing a hardware resource, determining the state of the hardware that caused the abnormal interrupt; and when it is determined that the state of the hardware that caused the abnormal interrupt is an abnormal in-position state, performing an abnormal-interrupt recovery operation. The method and apparatus of the embodiments of the present invention can avoid system restarts and service interruptions.

Description

Method and apparatus for fault-tolerant processing of hardware plugging and unplugging
Technical field
The present invention relates to the communications field, and more specifically, to a method and an apparatus for fault-tolerant processing of hardware plugging and unplugging.
Background
With the development of hot-plug technology, services need not be interrupted when hardware is expanded or replaced, or when a faulty board is swapped out.
To regulate how users hot-plug hardware, every device model defines a hot-plug specification. Nevertheless, rough plugging and unplugging may still occur, that is, hardware is plugged or unplugged without following the specified procedure, for example, removed before the system has indicated that it is safe to remove. It is also possible that the hardware has lost power at some point, or that the customer removed the hardware according to the correct removal procedure but the central processing unit (Central Processing Unit, CPU for short) did not notice in time and therefore continues to access the hardware resource. Any of these situations can cause the CPU to raise an exception, after which a restart is required to recover, or the task being executed is stopped.
Summary of the invention
Embodiments of the present invention provide a method for fault-tolerant processing of hardware plugging and unplugging that can avoid system restarts and service interruptions.
According to a first aspect, a method for fault-tolerant processing of hardware plugging and unplugging is provided, comprising: after an abnormal interrupt occurs, determining whether the abnormal interrupt was caused by accessing a hardware resource; when it is determined that the abnormal interrupt was caused by accessing a hardware resource, determining the state of the hardware that caused the abnormal interrupt, where the state of the hardware includes a normal in-position state and an abnormal in-position state; and when it is determined that the state of the hardware that caused the abnormal interrupt is the abnormal in-position state, performing an abnormal-interrupt recovery operation.
With reference to the first aspect, in a first possible implementation of the first aspect, performing the abnormal-interrupt recovery operation comprises: saving the interrupt context at the time the abnormal interrupt occurred; and skipping the instruction whose hardware resource access failed, and restoring the interrupt context from the time the abnormal interrupt occurred.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, determining the state of the hardware that caused the abnormal interrupt comprises: checking the in-position flag of the hardware; and when the in-position flag of the hardware indicates that the hardware is not in position, determining that the state of the hardware is the abnormal in-position state.
With reference to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation of the first aspect, determining the state of the hardware that caused the abnormal interrupt comprises: checking whether the firmware program in the hardware exists; and when the firmware program does not exist, determining that the state of the hardware is the abnormal in-position state.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, determining whether the firmware program exists comprises: when the in-position flag of the firmware program indicates that it is not in position, or when the data of the firmware program cannot be read normally, determining that the firmware program does not exist.
According to a second aspect, a method for fault-tolerant processing of hardware plugging and unplugging is provided, comprising: checking hardware at a preset frequency and determining the state of the hardware according to the check result, where the state of the hardware includes a normal in-position state and an abnormal in-position state; and when it is determined that the state of the hardware is the abnormal in-position state, performing a hardware removal procedure.
With reference to the second aspect, in a first possible implementation of the second aspect, checking the hardware and determining the state of the hardware according to the check result comprises: checking the in-position flag of the hardware; and when the in-position flag of the hardware indicates that the hardware is not in position, determining that the hardware is in the abnormal in-position state.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, checking the hardware and determining the state of the hardware according to the check result comprises: checking whether the firmware program in the hardware exists; and when the firmware program does not exist, determining that the state of the hardware is the abnormal in-position state.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, when the firmware program in the hardware does not exist, the method further comprises: starting a timer after the hardware removal procedure has been performed; and after the timer expires, if the in-position flag of the hardware still indicates that the hardware is in position, performing a hardware insertion procedure.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the method further comprises: after the hardware insertion procedure has been performed, checking whether the firmware program in the hardware exists; and when the firmware program does not exist, performing the hardware removal procedure.
With reference to the second, third, or fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, determining whether the firmware program exists comprises: when the in-position flag of the firmware program indicates that it is not in position, or when the data of the firmware program cannot be read normally, determining that the firmware program does not exist.
According to a third aspect, an apparatus for fault-tolerant processing of hardware plugging and unplugging is provided, comprising: a first determining unit, configured to determine, after an abnormal interrupt occurs, whether the abnormal interrupt was caused by accessing a hardware resource; a second determining unit, configured to determine, when the first determining unit determines that the abnormal interrupt was caused by accessing a hardware resource, the state of the hardware that caused the abnormal interrupt, where the state of the hardware includes a normal in-position state and an abnormal in-position state; and an execution unit, configured to perform an abnormal-interrupt recovery operation when the second determining unit determines that the state of the hardware that caused the abnormal interrupt is the abnormal in-position state.
With reference to the third aspect, in a first possible implementation of the third aspect, the execution unit comprises: a saving subunit, configured to save the interrupt context at the time the abnormal interrupt occurred; and a restoring subunit, configured to skip the instruction whose hardware resource access failed and to restore the interrupt context from the time the abnormal interrupt occurred.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the second determining unit comprises: a first detection subunit, configured to check the in-position flag of the hardware; and a first determining subunit, configured to determine that the state of the hardware is the abnormal in-position state when the in-position flag of the hardware indicates that the hardware is not in position.
With reference to the third aspect, the first possible implementation of the third aspect, or the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the second determining unit comprises: a second detection subunit, configured to check whether the firmware program in the hardware exists; and a second determining subunit, configured to determine that the state of the hardware is the abnormal in-position state when the firmware program does not exist.
With reference to the third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the second detection subunit is specifically configured to determine that the firmware program does not exist when the in-position flag of the firmware program indicates that it is not in position, or when the data of the firmware program cannot be read normally.
According to a fourth aspect, an apparatus for fault-tolerant processing of hardware plugging and unplugging is provided, comprising: a detecting unit, configured to check hardware at a preset frequency and determine the state of the hardware according to the check result, where the state of the hardware includes a normal in-position state and an abnormal in-position state; and a first execution unit, configured to perform a hardware removal procedure when the detecting unit determines that the state of the hardware is the abnormal in-position state.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the detecting unit comprises: a first detection subunit, configured to check the in-position flag of the hardware; and a first determining subunit, configured to determine that the hardware is in the abnormal in-position state when the in-position flag of the hardware indicates that the hardware is not in position.
With reference to the fourth aspect or the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the detecting unit comprises: a second detection subunit, configured to check whether the firmware program in the hardware exists; and a second determining subunit, configured to determine that the state of the hardware is the abnormal in-position state when the firmware program does not exist.
With reference to the second possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, when the firmware program in the hardware does not exist, the apparatus further comprises: a timer unit, configured to start a timer when the first execution unit has finished performing the hardware removal procedure; and a second execution unit, configured to perform a hardware insertion procedure if, after the timer expires, the in-position flag of the hardware indicates that the hardware is in position.
With reference to the third possible implementation of the fourth aspect, in a fourth possible implementation of the fourth aspect, the second detection subunit is further configured to check whether the firmware program in the hardware exists when the second execution unit has finished performing the hardware insertion procedure; and the first execution unit is further configured to perform the hardware removal procedure when the second detection subunit detects that the firmware program does not exist.
With reference to the second, third, or fourth possible implementation of the fourth aspect, in a fifth possible implementation of the fourth aspect, the second detection subunit is specifically configured to determine that the firmware program does not exist when the in-position flag of the firmware program indicates that it is not in position, or when the data of the firmware program cannot be read normally.
Therefore, in the embodiments of the present invention, after an abnormal interrupt occurs, it is determined whether the abnormal interrupt was caused by accessing a hardware resource; when it was, the state of the hardware that caused the abnormal interrupt is determined; and when that state is the abnormal in-position state, an abnormal-interrupt recovery operation is performed. This avoids the system restarts and service interruptions that would otherwise result when a user plugs or unplugs hardware improperly, when the hardware has lost power at some point, or when the firmware is lost.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for fault-tolerant processing of hardware plugging and unplugging according to an embodiment of the present invention.
Fig. 2 is a schematic flowchart of a method for performing an abnormal-interrupt recovery operation according to an embodiment of the present invention.
Fig. 3 is a schematic flowchart of a method for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 4 is a schematic flowchart of a method for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 5 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to an embodiment of the present invention.
Fig. 6 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 7 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 8 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 9 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 10 is a schematic block diagram of an apparatus for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention.
Fig. 11 is a schematic block diagram of a system to which the method for fault-tolerant processing of hardware plugging and unplugging according to an embodiment of the present invention is applied.
Fig. 12 is a schematic block diagram of a system to which the method for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention is applied.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of a method 100 for fault-tolerant processing of hardware plugging and unplugging according to an embodiment of the present invention. As shown in Fig. 1, the method 100 comprises:
S110: After an abnormal interrupt occurs, determine whether the abnormal interrupt was caused by accessing a hardware resource.
S120: When it is determined that the abnormal interrupt was caused by accessing a hardware resource, determine the state of the hardware that caused the abnormal interrupt. The state of the hardware includes a normal in-position state and an abnormal in-position state. The abnormal in-position state covers both the case in which the hardware is not in position and the case in which the hardware is in position but abnormal, for example, the firmware has been lost, or the hardware has lost power or been removed at some point.
S130: When it is determined that the state of the hardware that caused the abnormal interrupt is the abnormal in-position state, perform an abnormal-interrupt recovery operation.
Specifically, suppose that the hardware has been removed, the firmware of the hardware has been lost, or the hardware has lost power or been removed at some point, and the CPU has not yet noticed the change in the state of the hardware. If the CPU continues to access the hardware, an error occurs, and the CPU determines that an abnormal interrupt has occurred. The CPU can then determine whether the abnormal interrupt was caused by accessing a hardware resource. If it was not, the existing exception handling procedure can be performed. If it was, the state of the hardware that caused the abnormal interrupt can be determined further: if the hardware is in the normal in-position state, the existing exception handling procedure can be followed; if the hardware is in the abnormal in-position state, the abnormal-interrupt recovery operation can be performed to avoid a system restart or a service interruption.
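The decision path of S110 to S130 can be sketched as a small function. This is a minimal illustration only, not the patented implementation: the names `HardwareState`, `Action`, and `handle_abnormal_interrupt` are hypothetical, and the two inputs are assumed to have been obtained by whatever platform-specific means the system provides.

```python
from enum import Enum, auto


class HardwareState(Enum):
    NORMAL_IN_POSITION = auto()
    ABNORMAL_IN_POSITION = auto()


class Action(Enum):
    EXISTING_EXCEPTION_HANDLING = auto()
    INTERRUPT_RECOVERY = auto()


def handle_abnormal_interrupt(caused_by_hardware_access: bool,
                              hardware_state: HardwareState) -> Action:
    """Choose how to handle an abnormal interrupt (hypothetical helper)."""
    # S110: was the interrupt caused by accessing a hardware resource?
    if not caused_by_hardware_access:
        return Action.EXISTING_EXCEPTION_HANDLING
    # S120: determine the state of the hardware that caused the interrupt.
    if hardware_state is HardwareState.NORMAL_IN_POSITION:
        return Action.EXISTING_EXCEPTION_HANDLING
    # S130: abnormal in-position state, so recover instead of restarting.
    return Action.INTERRUPT_RECOVERY
```

Only the combination "caused by hardware access" and "abnormal in-position state" selects the recovery path; every other combination falls back to the existing exception handling.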
In the embodiments of the present invention, determining the state of the hardware that caused the abnormal interrupt in S120 may comprise:
checking the in-position flag of the hardware; and
when the in-position flag of the hardware indicates that the hardware is not in position, determining that the state of the hardware is the abnormal in-position state.
Specifically, if the customer removes the hardware without following the normal procedure, the CPU does not notice that the hardware has been removed, but the in-position flag of the hardware is changed by the removal, that is, it changes from indicating that the hardware is in position to indicating that the hardware is not in position. The CPU can therefore determine the state of the hardware by checking its in-position flag. If the flag indicates that the hardware is not in position, the CPU can directly conclude that the hardware is absent and thus determine that the state of the hardware is the abnormal in-position state.
In the embodiments of the present invention, determining the state of the hardware that caused the abnormal interrupt in S120 may comprise:
determining whether the firmware program in the hardware exists; and
when the firmware program in the hardware does not exist, determining that the state of the hardware is the abnormal in-position state.
Specifically, if the in-position flag of the hardware indicates that the hardware is in position, the hardware may appear to be in the normal in-position state. However, the hardware may be present while the firmware in it is not, so it is necessary to determine whether the firmware in the hardware exists; this can be done by determining whether the firmware program in the firmware exists. It is also possible that the in-position flag was not refreshed in time after the hardware was removed, or that the hardware is in position but has lost power or been removed at some point. Therefore, whether the hardware is in the normal in-position state can be determined by checking whether the firmware program on the firmware in the hardware exists: if the firmware program exists, the hardware is in the normal in-position state; if it does not exist, the hardware is in the abnormal in-position state.
In the embodiments of the present invention, whether the firmware program exists can be determined from the in-position flag of the firmware program: if the flag indicates not in position, the firmware program does not exist; if the flag indicates in position, the firmware program exists. Alternatively, a specific identifier in the firmware program data (such as the version number) can be read from the firmware: if it can be read correctly, the firmware program exists; if it cannot be read correctly, the firmware program does not exist.
It should be understood that, in the embodiments of the present invention, whether the firmware program exists may be determined only after the in-position flag of the hardware has been found to indicate that the hardware is in position; alternatively, the in-position flag of the hardware may be skipped and the existence of the firmware program determined directly.
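The two checks described above (the in-position flag of the hardware and the existence of the firmware program) can be combined as in the following sketch. All names are hypothetical. The flags are assumed to be plain booleans already read from the platform, `read_identifier` stands in for reading a specific identifier such as the version number, and treating `OSError` or an empty value as "cannot be read normally" is an assumption made for illustration.

```python
from typing import Callable, Optional

NORMAL = "normal in-position"
ABNORMAL = "abnormal in-position"


def firmware_program_exists(firmware_flag_in_position: bool,
                            read_identifier: Callable[[], Optional[str]]) -> bool:
    """Treat the firmware program as missing if its flag says "not in
    position" or its identifier (e.g. a version number) cannot be read."""
    if not firmware_flag_in_position:
        return False
    try:
        identifier = read_identifier()
    except OSError:
        # Assumed failure mode: the read itself raises an I/O error.
        return False
    return bool(identifier)


def determine_hardware_state(hardware_flag_in_position: bool,
                             firmware_flag_in_position: bool,
                             read_identifier: Callable[[], Optional[str]],
                             check_hardware_flag: bool = True) -> str:
    # The hardware flag check is optional: the text allows going straight
    # to the firmware check.
    if check_hardware_flag and not hardware_flag_in_position:
        return ABNORMAL
    if not firmware_program_exists(firmware_flag_in_position, read_identifier):
        return ABNORMAL
    return NORMAL
```

Passing `check_hardware_flag=False` corresponds to the variant in which the in-position flag of the hardware is not consulted and only the firmware program is checked.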
It should be understood that, in the embodiments of the present invention, the firmware in the hardware may be the only firmware in the hardware, any one of multiple firmwares in the hardware, or the firmware that provides the main function of the hardware; in these cases, the hardware may be a network interface card or a hard disk. It should also be understood that, although the description above uses hardware that contains firmware as an example, the hardware may also be understood as the firmware itself, in which case the hardware is the firmware in a network interface card or hard disk. The embodiments of the present invention impose no limitation in this respect.
It should be understood that the method 100 in the embodiments of the present invention may be performed by the CPU, or by another control module on the mainboard other than the CPU. The embodiments of the present invention impose no limitation in this respect; for ease of description, the CPU is used as the example throughout.
In the embodiments of the present invention, as shown in Fig. 2, performing the abnormal-interrupt recovery operation in S130 may comprise:
S133: Save the interrupt context at the time the abnormal interrupt occurred.
S136: Skip the instruction whose hardware resource access failed, and restore the interrupt context from the time the abnormal interrupt occurred.
Specifically, when the CPU determines that the abnormal interrupt was caused by accessing a hardware resource and that the hardware that caused it is in the abnormal in-position state, the CPU can save the interrupt context at the time the abnormal interrupt occurred, skip the instruction whose hardware resource access failed (that is, the error produced by accessing hardware in the abnormal in-position state is not put through the error-saving procedure and the system is not restarted), and restore the interrupt context from the time of the exception. A system restart or service interruption can thereby be avoided.
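Steps S133 and S136 can be illustrated with a toy model of an interrupt context. This is a simulation under stated assumptions, not real interrupt-handler code: `InterruptContext` and `recover_from_abnormal_interrupt` are hypothetical names, the context is reduced to a program counter and a register dictionary, and the length of the faulting instruction is assumed to be known to the caller.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class InterruptContext:
    pc: int                    # address of the instruction that faulted
    registers: Dict[str, int]  # simplified register file


def recover_from_abnormal_interrupt(live: InterruptContext,
                                    faulting_instruction_length: int) -> None:
    """Save the context, skip the faulting instruction, restore the context."""
    # S133: save the interrupt context as it was when the interrupt occurred.
    saved = InterruptContext(pc=live.pc, registers=dict(live.registers))

    # While the handler runs it may clobber registers (simulated here).
    live.registers["scratch"] = 0xDEAD

    # S136: resume after the failed hardware access, with the saved
    # registers restored.
    live.pc = saved.pc + faulting_instruction_length
    live.registers = dict(saved.registers)
```

After the call, execution would continue at the instruction following the failed access, with the register contents unchanged from the moment of the interrupt.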
It should be understood that other abnormal-interrupt recovery operations may also be performed in the embodiments of the present invention. For example, when the CPU's access to a non-existent hardware resource causes an abnormal interrupt, the exception handler may continue to run and suspend the current task; because the exception can be confirmed to have been caused mainly by an access made while the hardware was absent, the current task can then be resumed and the system continues to run.
Therefore, in the embodiments of the present invention, after an abnormal interrupt occurs, it is determined whether the abnormal interrupt was caused by accessing a hardware resource; when it was, the state of the hardware that caused the abnormal interrupt is determined; and when that state is the abnormal in-position state, an abnormal-interrupt recovery operation is performed. This avoids the system restarts and service interruptions that would otherwise result when a user plugs or unplugs hardware improperly, when the hardware has lost power at some point, or when the firmware is lost.
Fig. 3 is a schematic flowchart of a method 200 for fault-tolerant processing of hardware plugging and unplugging according to another embodiment of the present invention. As shown in Fig. 3, the method 200 comprises:
S210: Check hardware at a preset frequency, and determine the state of the hardware according to the check result, where the state of the hardware includes a normal in-position state and an abnormal in-position state.
S220: When it is determined that the state of the hardware is the abnormal in-position state, perform a hardware removal procedure.
Specifically, the CPU can check a given piece of hardware at a preset frequency and determine its state from the check result. If the hardware is in the abnormal in-position state (for example, the hardware has been removed, its firmware has been lost, or it has lost power or been removed at some point), the hardware removal procedure can be performed. This prevents the abnormal interrupt that would otherwise occur if the CPU continued to access the hardware resource while the hardware is in the abnormal in-position state.
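One polling pass of S210 and S220 might look like the sketch below. The function and parameter names are hypothetical. `is_normal_in_position` stands for whatever state check the platform provides, and `run_removal_flow` stands for the hardware removal procedure. Scheduling the pass at the preset frequency is left to the caller.

```python
from typing import Callable, Iterable, List


def poll_hardware_once(hardware_ids: Iterable[str],
                       is_normal_in_position: Callable[[str], bool],
                       run_removal_flow: Callable[[str], None]) -> List[str]:
    """Check every piece of hardware once; remove the abnormal ones.

    Returns the identifiers for which the removal procedure was run.
    """
    removed = []
    for hardware_id in hardware_ids:
        # S210: determine the state of the hardware from the check result.
        if not is_normal_in_position(hardware_id):
            # S220: abnormal in-position state, so run the removal procedure
            # before the CPU touches the hardware resource again.
            run_removal_flow(hardware_id)
            removed.append(hardware_id)
    return removed
```

For example, with hypothetical identifiers `"nic0"` and `"disk1"` where only `"nic0"` passes the check, the pass runs the removal procedure for `"disk1"` alone.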
In the embodiments of the present invention, checking the hardware and determining the state of the hardware according to the check result in S210 may comprise:
checking the in-position flag of the hardware; and
when the in-position flag of the hardware indicates that the hardware is not in position, determining that the hardware is in the abnormal in-position state.
Specifically, if the customer removes the hardware without following the normal procedure, the CPU does not notice that the hardware has been removed, but the in-position flag of the hardware is changed by the removal, that is, it changes from indicating that the hardware is in position to indicating that the hardware is not in position. The CPU can therefore determine the state of the hardware by checking its in-position flag. If the flag indicates that the hardware is not in position, the CPU can directly conclude that the hardware is absent, so the state of the hardware is the abnormal in-position state.
In the embodiments of the present invention, checking the hardware and determining the state of the hardware according to the check result in S210 may comprise:
determining whether the firmware program in the hardware exists; and
when the firmware program does not exist, determining that the state of the hardware is the abnormal in-position state.
Specifically, if the in-position flag of the hardware indicates that the hardware is in position, the hardware may appear to be in the normal in-position state. However, the hardware may be present while the firmware in it is not, so it is necessary to determine whether the firmware in the hardware exists; this can be done by determining whether the firmware program in the firmware exists. It is also possible that the in-position flag was not refreshed in time after the hardware was removed, or that the hardware is in position but has lost power or been removed at some point. Therefore, whether the hardware is in the normal in-position state can be determined by checking whether the firmware program in the hardware exists: if the firmware program exists, the hardware is in the normal in-position state; if it does not exist, the hardware is in the abnormal in-position state.
In the embodiments of the present invention, whether the firmware program exists can be determined from the in-position flag of the firmware program: if the flag indicates not in position, the firmware program does not exist; if the flag indicates in position, the firmware program exists. Alternatively, a specific identifier in the firmware program data (such as the version number) can be read: if it can be read correctly, the firmware program exists; if it cannot be read correctly, the firmware program does not exist.
It should be understood that, in the embodiments of the present invention, whether the firmware program exists may be determined only after the in-position flag of the hardware has been found to indicate that the hardware is in position; alternatively, the in-position flag of the hardware may be skipped and the existence of the firmware program determined directly.
In an embodiment of the present invention, as shown in Figure 4, when the firmware program does not exist, the method 200 may further comprise:
S230: upon completion of the hardware extraction flow, start a timer;
S240: when the timer expires, if the in-place flag of the hardware indicates in place, execute a hardware insertion flow.
In an embodiment of the present invention, the abnormal in-place state of the hardware may have been caused by the hardware having temporarily lost power or having been unplugged. After the hardware extraction flow has been executed for such hardware, the host no longer holds registration information for it, so the hardware cannot be used normally. Therefore, a timer may be started after the hardware extraction flow has been executed (an in-place flag that had not yet been refreshed is refreshed within the timer duration). If the in-place flag of the hardware still indicates in place after the timer expires, it can be determined that the hardware is in place, and the hardware insertion flow may be executed so that the hardware resource can be properly used.
Further, the abnormal in-place state of the hardware may have been caused by firmware loss (in which case the in-place flag of the hardware still indicates in place after the timer expires). Therefore, after the hardware insertion flow has been executed, whether the firmware program in the hardware exists may be determined; if it does not exist, the hardware extraction flow may be executed again.
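Steps S230 and S240, together with the follow-up firmware check, can be sketched as below. This is an illustrative model only: the `Hardware` class, its fields, the `extract` and `insert` helpers, and the default timer duration are hypothetical, and the real flows would involve driver and bus operations that are not shown.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Hardware:
    in_place: bool          # hypothetical: value of the hardware in-place flag
    firmware_present: bool  # hypothetical: result of the firmware check
    log: List[str] = field(default_factory=list)


def extract(hw: Hardware) -> None:
    hw.log.append("extract")  # stand-in for the hardware extraction flow


def insert(hw: Hardware) -> None:
    hw.log.append("insert")   # stand-in for the hardware insertion flow


def recheck_after_extraction(
    hw: Hardware,
    timer_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run after the extraction flow has completed."""
    sleep(timer_seconds)      # S230: timer; the flag may be refreshed meanwhile
    if not hw.in_place:       # S240: flag indicates not in place, nothing to do
        return
    insert(hw)                # flag indicates in place: execute insertion flow
    if not hw.firmware_present:
        extract(hw)           # firmware lost: execute extraction flow again
```

The `sleep` parameter is injected only so that the timer can be replaced in tests; the patent does not specify how the timer is implemented.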
It should be understood that, in an embodiment of the present invention, the firmware in the hardware may be the only firmware in the hardware, any one of multiple firmware components in the hardware, or the firmware that performs the main function of the hardware; in this case, the hardware may be a network interface card or a hard disk. It should also be understood that, although the above description takes hardware that contains firmware as an example, the present invention may also interpret the hardware as the firmware itself; in that case, the hardware can be understood as the firmware in a network interface card or hard disk. The embodiments of the present invention are not limited in this respect.
It should also be understood that, in an embodiment of the present invention, the preset frequency at which the CPU performs polling detection may be constant or may vary over time; for example, the detection frequency during the day may be set higher than the detection frequency at night. The frequency may be determined according to actual conditions, and the embodiments of the present invention are not limited in this respect.
In an embodiment of the present invention, the interval at which the CPU performs detection may be on the order of seconds. Polling at second-level intervals avoids consuming a large amount of CPU resources while still detecting the state of the hardware in a timely manner.
Therefore, in an embodiment of the present invention, the hardware is detected at a preset frequency, the state of the hardware is determined according to the detection result, and the hardware extraction flow is executed when the state of the hardware is determined to be an abnormal in-place state, so that a system restart or service interruption can be avoided.
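The polling detection summarized above can be sketched as follows. The sketch is an assumption-laden illustration, not the patented implementation: the function names, the day/night hours, and the 1-second and 5-second intervals are hypothetical values chosen only to show a second-level, time-varying polling frequency.

```python
import time
from typing import Callable, Tuple


def poll_interval_seconds(hour: int) -> float:
    """Hypothetical schedule: poll more often during the day than at night."""
    return 1.0 if 8 <= hour < 20 else 5.0


def is_abnormal(flag_in_place: bool, firmware_present: bool) -> bool:
    """Abnormal in-place state: flag not in place, or firmware missing."""
    return (not flag_in_place) or (not firmware_present)


def poll(
    read_state: Callable[[], Tuple[bool, bool]],
    extract: Callable[[], None],
    rounds: int,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the hardware up to `rounds` times.

    Returns True if an abnormal in-place state was detected and the
    extraction flow was executed, otherwise False.
    """
    for _ in range(rounds):
        flag_in_place, firmware_present = read_state()
        if is_abnormal(flag_in_place, firmware_present):
            extract()
            return True
        sleep(interval)
    return False
```

Stopping after the first extraction is a design choice of this sketch; the description leaves open what the poller does once the hardware has been removed from the host's registration.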
The method of hardware plug fault-tolerant processing according to the embodiments of the present invention has been described above with reference to Figures 1 to 4. The device for hardware plug fault-tolerant processing according to the embodiments of the present invention is described below with reference to Figures 5 to 10.
Figure 5 shows a schematic block diagram of a device for hardware plug fault-tolerant processing according to an embodiment of the present invention. As shown in Figure 5, the device 300 comprises:
a first determining unit 310, configured to determine, after an abnormal interrupt occurs, whether the abnormal interrupt was caused by accessing a hardware resource;
a second determining unit 320, configured to determine, when the first determining unit 310 determines that the abnormal interrupt was caused by accessing a hardware resource, the state of the hardware that caused the abnormal interrupt, wherein the state of the hardware comprises a normal in-place state and an abnormal in-place state;
an execution unit 330, configured to execute an abnormal-interrupt recovery operation when the second determining unit 320 determines that the state of the hardware that caused the abnormal interrupt is an abnormal in-place state.
Optionally, as shown in Figures 6 and 7, the execution unit 330 comprises:
a saving subunit 332, configured to save the interrupt context at the time the abnormal interrupt occurred;
a recovery subunit 334, configured to skip the instruction that failed to access the hardware resource, and to restore the interrupt context saved when the abnormal interrupt occurred.
Optionally, as shown in Figure 6, the second determining unit 320 comprises:
a first detection subunit 322, configured to detect the in-place flag of the hardware;
a first determining subunit 324, configured to determine that the state of the hardware is an abnormal in-place state when the in-place flag of the hardware indicates not in place.
Optionally, as shown in Figure 7, the second determining unit 320 comprises:
a second detection subunit 326, configured to detect whether the firmware program in the hardware exists;
a second determining subunit 328, configured to determine that the state of the hardware is an abnormal in-place state when the firmware program does not exist.
Optionally, the second detection subunit 326 is specifically configured to:
determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
Therefore, in an embodiment of the present invention, after an abnormal interrupt occurs, it is determined whether the abnormal interrupt was caused by accessing a hardware resource; when it was, the state of the hardware that caused the abnormal interrupt is determined; and when that state is an abnormal in-place state, an abnormal-interrupt recovery operation is executed. This avoids the system restart or service interruption that would otherwise result from a user improperly plugging or unplugging hardware, from the hardware having temporarily lost power, or from firmware loss.
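The recovery flow summarized above (identify the cause, check the hardware state, save the context, skip the failed instruction, restore the context) can be modeled with a toy interpreter. Everything here is a hypothetical simulation for illustration: a real implementation would run in a CPU exception handler and manipulate saved registers, whereas this sketch uses a Python exception, a `Context` with a program counter `pc` and an accumulator `acc`, and a `Device` with an assumed `broken` field.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class HardwareAccessFault(Exception):
    """Models an abnormal interrupt caused by accessing a hardware resource."""

    def __init__(self, device: str):
        super().__init__(device)
        self.device = device


@dataclass
class Device:
    in_place: bool        # hypothetical in-place flag
    value: int            # value returned by a successful read
    broken: bool = False  # hypothetical: faults even though in place


@dataclass
class Context:
    pc: int   # index of the instruction being executed
    acc: int  # accumulator standing in for register state


def run(program: List[Tuple[str, str]], devices: Dict[str, Device]) -> Context:
    ctx = Context(pc=0, acc=0)
    while ctx.pc < len(program):
        op, name = program[ctx.pc]
        try:
            if op == "read":
                dev = devices[name]
                if not dev.in_place or dev.broken:
                    raise HardwareAccessFault(name)
                ctx.acc += dev.value
            ctx.pc += 1
        except HardwareAccessFault as fault:
            # Save the interrupt context at the time of the fault.
            saved = Context(ctx.pc, ctx.acc)
            # Normal in-place state: this recovery path does not apply.
            if devices[fault.device].in_place:
                raise
            # Abnormal in-place state: skip the failed instruction and
            # restore the saved context so execution continues.
            ctx = Context(saved.pc + 1, saved.acc)
    return ctx
```

With a program that reads a present device, then an absent one, then the present one again, execution continues past the failed read instead of aborting, which is the behavior the recovery operation is meant to achieve.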
Figure 8 is a schematic block diagram of a device 400 for hardware plug fault-tolerant processing according to an embodiment of the present invention. As shown in Figure 8, the device 400 comprises:
a detecting unit 410, configured to detect hardware at a preset frequency and to determine the state of the hardware according to the detection result, wherein the state of the hardware comprises a normal in-place state and an abnormal in-place state;
a first execution unit 420, configured to execute a hardware extraction flow when the detecting unit determines that the state of the hardware is an abnormal in-place state.
Optionally, as shown in Figure 9, the detecting unit 410 comprises:
a first detection subunit 412, configured to detect the in-place flag of the hardware;
a first determining subunit 414, configured to determine that the hardware is in an abnormal in-place state when the in-place flag of the hardware indicates not in place.
Optionally, as shown in Figure 10, the detecting unit 410 comprises:
a second detection subunit 416, configured to detect whether the firmware program in the hardware exists;
a second determining subunit 418, configured to determine that the state of the hardware is an abnormal in-place state when the firmware program does not exist.
Optionally, the second detection subunit 416 is specifically configured to:
determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
Optionally, as shown in Figure 10, when the firmware program does not exist, the device 400 further comprises:
a timer unit 430, configured to start a timer after the first execution unit 420 has executed the hardware extraction flow;
a second execution unit 440, configured to execute a hardware insertion flow if the in-place flag of the hardware indicates in place when the timer expires.
Optionally, in an embodiment of the present invention, the second detection subunit 416 is further configured to detect, upon completion of the hardware insertion flow by the second execution unit, whether the firmware program in the hardware exists; the first execution unit 420 is further configured to execute the hardware extraction flow when the second detection subunit detects that the firmware program does not exist.
Therefore, in an embodiment of the present invention, the hardware is detected at a preset frequency, the state of the hardware is determined according to the detection result, and the hardware extraction flow is executed when the state of the hardware is determined to be an abnormal in-place state, so that a system restart or service interruption can be avoided.
The device for hardware plug fault-tolerant processing according to the embodiments of the present invention has been described above with reference to Figures 5 to 10. The systems to which the method 100 and the method 200 of hardware plug fault-tolerant processing according to the embodiments of the present invention are applied are described below with reference to Figures 11 and 12, respectively.
Figure 11 shows a system 500 to which the method of hardware plug fault-tolerant processing according to an embodiment of the present invention is applied. As shown in Figure 11, the system 500 may comprise a processor 501, hardware 502, a memory 503, and a bus 504; the devices in the system 500 are connected through the bus 504. Specifically:
The processor 501 is configured to: determine, after an abnormal interrupt occurs, whether the abnormal interrupt was caused by accessing a hardware resource; determine, when the abnormal interrupt was caused by accessing a hardware resource, the state of the hardware 502 that caused the abnormal interrupt, wherein the state of the hardware 502 comprises a normal in-place state and an abnormal in-place state; and execute an abnormal-interrupt recovery operation when the state of the hardware 502 that caused the abnormal interrupt is determined to be an abnormal in-place state.
Optionally, the processor 501 is specifically configured to: save, in the memory 503, the interrupt context at the time the abnormal interrupt occurred; skip the instruction that failed to access the hardware resource; and restore, from the memory 503, the interrupt context saved when the abnormal interrupt occurred.
Optionally, the processor 501 is specifically configured to: detect the in-place flag of the hardware 502; and determine that the state of the hardware 502 is an abnormal in-place state when the in-place flag of the hardware 502 indicates not in place.
Optionally, the processor 501 is specifically configured to: detect whether the firmware program in the hardware 502 exists; and determine that the state of the hardware 502 is an abnormal in-place state when the firmware program does not exist.
Optionally, the processor 501 is specifically configured to: determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
Therefore, in an embodiment of the present invention, after an abnormal interrupt occurs, it is determined whether the abnormal interrupt was caused by accessing a hardware resource; when it was, the state of the hardware that caused the abnormal interrupt is determined; and when that state is an abnormal in-place state, an abnormal-interrupt recovery operation is executed. This avoids the system restart or service interruption that would otherwise result from a user improperly plugging or unplugging hardware, from the hardware having temporarily lost power, or from firmware loss.
Figure 12 shows a system 600 to which the hardware plug fault-tolerant processing according to an embodiment of the present invention is applied. As shown in Figure 12, the system 600 may comprise a processor 601, hardware 602, and a bus 603; the devices in the system 600 are connected through the bus 603. Specifically:
The processor 601 is configured to: detect the hardware 602 at a preset frequency and determine the state of the hardware 602 according to the detection result, wherein the state of the hardware 602 comprises a normal in-place state and an abnormal in-place state; and execute a hardware extraction flow when the state of the hardware 602 is determined to be an abnormal in-place state.
Optionally, the processor 601 is specifically configured to: detect the in-place flag of the hardware 602; and determine that the state of the hardware 602 is an abnormal in-place state when the in-place flag of the hardware 602 indicates not in place.
Optionally, the processor 601 is specifically configured to: detect whether the firmware program in the hardware 602 exists; and determine that the state of the hardware 602 is an abnormal in-place state when the firmware program does not exist.
Optionally, the processor 601 is further configured to: start a timer upon completion of the hardware extraction flow; and, when the timer expires, execute a hardware insertion flow if the in-place flag of the hardware 602 indicates in place.
Optionally, the processor 601 is further configured to: detect, upon completion of the hardware insertion flow, whether the firmware program in the hardware 602 exists; and execute the hardware extraction flow when the firmware program does not exist.
Optionally, the processor 601 is specifically configured to: determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
Therefore, in an embodiment of the present invention, the hardware is detected at a preset frequency, the state of the hardware is determined according to the detection result, and the hardware extraction flow is executed when the state of the hardware is determined to be an abnormal in-place state, so that a system restart or service interruption can be avoided.
Those of ordinary skill in the art will appreciate that the method steps and units described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the steps and composition of each embodiment have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Those of ordinary skill in the art may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
The methods or steps described in connection with the embodiments disclosed herein may be implemented in hardware, in a software program executed by a processor, or in a combination of the two. The software program may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Although the present invention has been described in detail with reference to the accompanying drawings and in conjunction with preferred embodiments, the present invention is not limited thereto. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art may make various equivalent modifications or substitutions to the embodiments of the present invention, and all such modifications or substitutions shall fall within the scope of the present invention.

Claims (18)

1. A method of hardware plug fault-tolerant processing, characterized by comprising:
after an abnormal interrupt occurs, determining whether the abnormal interrupt was caused by accessing a hardware resource;
when it is determined that the abnormal interrupt was caused by accessing a hardware resource, determining the state of the hardware that caused the abnormal interrupt, wherein the state of the hardware comprises a normal in-place state and an abnormal in-place state;
when it is determined that the state of the hardware that caused the abnormal interrupt is an abnormal in-place state, executing an abnormal-interrupt recovery operation.
2. The method according to claim 1, characterized in that executing the abnormal-interrupt recovery operation comprises:
saving the interrupt context at the time the abnormal interrupt occurred;
skipping the instruction that failed to access the hardware resource, and restoring the interrupt context saved when the abnormal interrupt occurred.
3. The method according to claim 1 or 2, characterized in that determining the state of the hardware that caused the abnormal interrupt comprises:
detecting the in-place flag of the hardware;
when the in-place flag of the hardware indicates not in place, determining that the state of the hardware is an abnormal in-place state.
4. The method according to claim 1 or 2, characterized in that determining the state of the hardware that caused the abnormal interrupt comprises:
detecting whether the firmware program in the hardware exists;
when the firmware program does not exist, determining that the state of the hardware is an abnormal in-place state.
5. The method according to claim 4, characterized in that determining whether the firmware program in the hardware exists comprises:
when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally, determining that the firmware program does not exist.
6. A method of hardware plug fault-tolerant processing, characterized by comprising:
detecting hardware at a preset frequency, and determining the state of the hardware according to the detection result, wherein the state of the hardware comprises a normal in-place state and an abnormal in-place state;
when the state of the hardware is determined to be an abnormal in-place state, executing a hardware extraction flow;
wherein detecting the hardware and determining the state of the hardware according to the detection result comprises:
detecting whether the firmware program in the hardware exists;
when the firmware program does not exist, determining that the state of the hardware is an abnormal in-place state;
and wherein, when the firmware program does not exist, the method further comprises:
upon completion of the hardware extraction flow, starting a timer;
when the timer expires, if the in-place flag of the hardware indicates in place, executing a hardware insertion flow.
7. The method according to claim 6, characterized in that detecting the hardware and determining the state of the hardware according to the detection result comprises:
detecting the in-place flag of the hardware;
when the in-place flag of the hardware indicates not in place, determining that the state of the hardware is an abnormal in-place state.
8. The method according to claim 6, characterized in that the method further comprises:
upon completion of the hardware insertion flow, detecting whether the firmware program in the hardware exists;
when the firmware program does not exist, executing the hardware extraction flow.
9. The method according to any one of claims 6 to 8, characterized in that detecting whether the firmware program exists comprises:
when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally, determining that the firmware program does not exist.
10. A device for hardware plug fault-tolerant processing, characterized by comprising:
a first determining unit, configured to determine, after an abnormal interrupt occurs, whether the abnormal interrupt was caused by accessing a hardware resource;
a second determining unit, configured to determine, when the first determining unit determines that the abnormal interrupt was caused by accessing a hardware resource, the state of the hardware that caused the abnormal interrupt, wherein the state of the hardware comprises a normal in-place state and an abnormal in-place state;
an execution unit, configured to execute an abnormal-interrupt recovery operation when the second determining unit determines that the state of the hardware that caused the abnormal interrupt is an abnormal in-place state.
11. The device according to claim 10, characterized in that the execution unit comprises:
a saving subunit, configured to save the interrupt context at the time the abnormal interrupt occurred;
a recovery subunit, configured to skip the instruction that failed to access the hardware resource, and to restore the interrupt context saved when the abnormal interrupt occurred.
12. The device according to claim 10 or 11, characterized in that the second determining unit comprises:
a first detection subunit, configured to detect the in-place flag of the hardware;
a first determining subunit, configured to determine that the state of the hardware is an abnormal in-place state when the in-place flag of the hardware indicates not in place.
13. The device according to claim 10 or 11, characterized in that the second determining unit comprises:
a second detection subunit, configured to detect whether the firmware program in the hardware exists;
a second determining subunit, configured to determine that the state of the hardware is an abnormal in-place state when the firmware program does not exist.
14. The device according to claim 13, characterized in that the second detection subunit is specifically configured to:
determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
15. A device for hardware plug fault-tolerant processing, characterized by comprising:
a detecting unit, configured to detect hardware at a preset frequency and to determine the state of the hardware according to the detection result, wherein the state of the hardware comprises an in-place state and a not-in-place state;
a first execution unit, configured to execute a hardware extraction flow when the detecting unit determines that the state of the hardware is an abnormal in-place state;
wherein the detecting unit comprises:
a second detection subunit, configured to detect whether the firmware program in the hardware exists;
a second determining subunit, configured to determine that the state of the hardware is an abnormal in-place state when the firmware program does not exist;
and wherein, when the firmware program does not exist, the device further comprises:
a timer unit, configured to start a timer upon completion of the hardware extraction flow by the first execution unit;
a second execution unit, configured to execute a hardware insertion flow if the in-place flag of the hardware indicates in place when the timer expires.
16. The device according to claim 15, characterized in that the detecting unit comprises:
a first detection subunit, configured to detect the in-place flag of the hardware;
a first determining subunit, configured to determine that the state of the hardware is an abnormal in-place state when the in-place flag of the hardware indicates not in place.
17. The device according to claim 16, characterized in that:
the second detection subunit is further configured to detect, upon completion of the hardware insertion flow by the second execution unit, whether the firmware program in the hardware exists;
the first execution unit is further configured to execute the hardware extraction flow when the second detection subunit detects that the firmware program does not exist.
18. The device according to claim 17, characterized in that the second detection subunit is specifically configured to:
determine that the firmware program does not exist when the in-place flag of the firmware program indicates not in place, or when the data of the firmware program cannot be read normally.
CN201210504185.0A 2012-11-30 2012-11-30 The method and apparatus of hardware plug fault-tolerant processing Active CN103049344B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210504185.0A CN103049344B (en) 2012-11-30 2012-11-30 The method and apparatus of hardware plug fault-tolerant processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210504185.0A CN103049344B (en) 2012-11-30 2012-11-30 The method and apparatus of hardware plug fault-tolerant processing

Publications (2)

Publication Number Publication Date
CN103049344A CN103049344A (en) 2013-04-17
CN103049344B true CN103049344B (en) 2015-12-09

Family

ID=48061993

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210504185.0A Active CN103049344B (en) 2012-11-30 2012-11-30 The method and apparatus of hardware plug fault-tolerant processing

Country Status (1)

Country Link
CN (1) CN103049344B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309762B (en) * 2013-06-21 2015-12-23 杭州华三通信技术有限公司 Unit exception disposal route and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2594868Y (en) * 2002-11-12 2003-12-24 阳庆电子股份有限公司 Hot insertable testing circuit between interface and CPU
CN1529465A (en) * 2003-09-29 2004-09-15 港湾网络有限公司 Hot-plugging detection and treating method
CN101615152A (en) * 2009-07-13 2009-12-30 中兴通讯股份有限公司 The detection method of hot plug fault of storage card and device
CN102739216A (en) * 2012-05-25 2012-10-17 武汉烽火网络有限责任公司 Control device and method for hot plugging and electrifying of line cards of communication equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6223301B1 (en) * 1997-09-30 2001-04-24 Compaq Computer Corporation Fault tolerant memory

Also Published As

Publication number Publication date
CN103049344A (en) 2013-04-17

Similar Documents

Publication Publication Date Title
CN100388217C (en) Dynamic threshold scaling method and system in communication system
EP3142011A1 (en) Anomaly recovery method for virtual machine in distributed environment
CN103927239A (en) Method and device for restoring system of terminal equipment
CN103902437A (en) Detecting method and server
CN109738719B (en) Electrostatic discharge ESD detection method and related product
CN105573884A (en) Multi-TF-card plugging and unplugging detection method and system
CN110659159A (en) Service process operation monitoring method, device, equipment and storage medium
CN111694710A (en) Method, device and equipment for monitoring faults of substrate management controller and storage medium
CN106445720A (en) Memory error recovery method and device
CN104615471A (en) System upgrading method and device for terminal
CN106528480A (en) Method and system of preventing hot swapping data from missing, and terminal equipment
CN110445932B (en) Abnormal card dropping processing method and device, storage medium and terminal
CN104010077A (en) Information processing method and electronic equipment
CN114615310A (en) Method and device for maintaining TCP connection and electronic equipment
CN103049344B (en) The method and apparatus of hardware plug fault-tolerant processing
CN103793292A (en) Disaster recovery method for disk array
CN110908947A (en) Hot plug method and device for frame type equipment line card, main control board and frame type equipment
CN103259905B (en) Cell-phone smart recovery method and system
CN101369238A (en) Exception monitoring and reset processing method for USB equipment
CN103595781A (en) Service providing method, first server and system based on zookeeper
CN105208192B (en) A kind of method and device of detection terminal storage card state
CN103793283A (en) Terminal fault handling method and terminal fault handling device
JP2013045464A (en) Method for repairing communication abnormality between data card and host and abnormality of data card
CN106933578A (en) A kind of USB drive load methods of QNX systems
CN101158920A (en) Method and apparatus for detecting fault of operating system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191231

Address after: 225453 Hongqiao Industrial Park, Taixing, Jiangsu, Taizhou

Patentee after: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Fault-tolerance processing method and device for plugging hardware

Effective date of registration: 20200319

Granted publication date: 20151209

Pledgee: Bank of China Limited Taixing sub branch

Pledgor: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Registration number: Y2020980000850

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20210607

Granted publication date: 20151209

Pledgee: Bank of China Limited Taixing sub branch

Pledgor: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Registration number: Y2020980000850

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and device of fault tolerance processing for hardware plugging

Effective date of registration: 20210608

Granted publication date: 20151209

Pledgee: Bank of China Limited Taixing sub branch

Pledgor: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Registration number: Y2021980004526

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20220706

Granted publication date: 20151209

Pledgee: Bank of China Limited Taixing sub branch

Pledgor: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Registration number: Y2021980004526

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and device for fault tolerance processing of hardware plugging

Effective date of registration: 20220727

Granted publication date: 20151209

Pledgee: Bank of China Limited Taixing sub branch

Pledgor: JIANGSU SHENGRI MACHINERY EQUIPMENT MANUFACTURING Co.,Ltd.

Registration number: Y2022980011395
