CN107404393A - A method and system for judging link failure - Google Patents
- Publication number
- CN107404393A CN201610340855.8A
- Authority
- CN
- China
- Prior art keywords
- sender
- timeout
- fpga
- cpu
- recipient
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
Abstract
The invention discloses a method and system for judging link failure. The method includes: a sender sends, through an FPGA, a detection message carrying a first status bit to a recipient and waits for the recipient's reply message; if the sender receives the reply message through the FPGA within a preset timeout Timeout, it judges whether the first status bit is the preset normal status value; if so, communication is determined to be normal; if not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally; if the sender does not receive the reply message through the FPGA within Timeout, the link is determined to be abnormal. The invention allows fault judgment and fault analysis to be carried out effectively when a CPU fails, and when the recipient system has failed it helps the detecting side determine that the fault lies in the recipient system rather than in the link.
Description
Technical field
Embodiments of the present invention relate to link failure judgment and link failure analysis techniques, and in particular to a method and system for judging link failure.
Background
With the development of network and information technology, the communication recipient (the core network) must reliably and effectively receive the data returned by the sender (the base station). To ensure this, the sender needs to know in real time whether the receiving-end system is working normally, so detecting the state of the peer device is increasingly important to the data sender.
In a traditional IP network, communication faults are detected by sending ICMP Echo requests.
However, the link over which a message travels in the network can switch automatically, and its delay changes in real time. When the actual link delay is uncertain, misjudgment or poor real-time performance results, and ICMP cannot analyze faults accurately. Processing the messages also consumes system overhead; when there are too many upstream requests in particular, CPU efficiency drops severely and the system can even become congested.
Summary of the invention
The present invention provides a method and system for judging link failure, giving both communicating parties a real-time and fast way to judge link failure.
In a first aspect, the invention provides a method for judging link failure, including:
The sender sends, through an FPGA, a detection message carrying a first status bit to the recipient and waits for the recipient's reply message;
The recipient receives the detection message through an FPGA, updates the first status bit according to the working state of the recipient CPU, and returns a reply message carrying the first status bit to the sender;
If the sender receives the reply message through the FPGA within the preset timeout Timeout, it judges whether the first status bit is the preset normal status value. If so, communication is determined to be normal. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally. If the sender does not receive the reply message through the FPGA within Timeout, the link is determined to be abnormal.
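The three-way judgment above can be sketched as a small pure function. This is only an illustration, not the patented FPGA logic: the names `Verdict` and `judge` are hypothetical, the encoding 1 = normal and 0 = abnormal is borrowed from the State example in Embodiment two, and a missing reply is modelled as `None`.

```python
from enum import Enum
from typing import Optional

NORMAL_STATE = 1    # assumed encoding of the preset normal status value
ABNORMAL_STATE = 0  # assumed encoding of the preset abnormal status value


class Verdict(Enum):
    COMMUNICATION_NORMAL = "communication normal"
    RECIPIENT_SUSPECT = "recipient suspect, probe again"
    RECIPIENT_ABNORMAL = "recipient system abnormal"
    LINK_ABNORMAL = "link abnormal"


def judge(reply_state: Optional[int], abnormal_replies: int, preset_count: int) -> Verdict:
    """Classify the result of one detection message.

    reply_state is None when no reply arrived within Timeout.
    abnormal_replies counts consecutive abnormal replies, including this one.
    """
    if reply_state is None:
        return Verdict.LINK_ABNORMAL
    if reply_state == NORMAL_STATE:
        return Verdict.COMMUNICATION_NORMAL
    if abnormal_replies >= preset_count:
        return Verdict.RECIPIENT_ABNORMAL
    return Verdict.RECIPIENT_SUSPECT
```

Keeping the classification free of timers and I/O makes the distinction between "no reply" (link) and "reply with abnormal status" (recipient system) easy to check in isolation.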
Preferably, the step in which the sender sends, through the FPGA, a detection message carrying the first status bit to the recipient and waits for the recipient's reply message specifically includes: the sender sends, through the FPGA, a detection message carrying the first status bit to the recipient, starts timing, and waits for the recipient's reply message;
The case in which the sender receives the reply message through the FPGA within Timeout specifically includes: the sender receives the reply message through the FPGA within the preset timeout Timeout, stops timing to obtain the round-trip delay t1, and sends t1 to the sender CPU, which calculates an updated Timeout and feeds it back to the sender's FPGA.
Preferably, sending t1 to the sender CPU to calculate an updated Timeout and feeding the updated Timeout back to the sender's FPGA is specifically: the sender CPU judges whether this is the first power-on or the link has changed; if so, t1 is sent to the sender CPU, which calculates an updated Timeout and feeds it back to the sender's FPGA; if not, that is, when the link is unchanged, the Timeout value in the FPGA is not updated.
Preferably, sending t1 to the sender CPU to calculate an updated Timeout specifically includes: t1 is sent to the sender CPU, and the sender CPU calculates the updated Timeout from t1 and a user-set coefficient δ as Timeout = t1 + t1 × δ.
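As a worked illustration of the formula Timeout = t1 + t1 × δ, the sketch below computes the updated timeout in milliseconds. The function name `updated_timeout` and the input checks are assumptions added for the example; the source only gives the formula.

```python
def updated_timeout(t1_ms: float, delta: float) -> float:
    """Return Timeout = t1 + t1 * delta, in the same unit as t1."""
    if t1_ms <= 0:
        raise ValueError("round-trip delay must be positive")  # assumed check
    if delta < 0:
        raise ValueError("delta must be non-negative")  # assumed check
    return t1_ms + t1_ms * delta


print(updated_timeout(20.0, 1.0))  # 40.0
print(updated_timeout(20.0, 0.5))  # 30.0
```

With δ = 1, a measured round trip of 20 ms gives a 40 ms timeout, so δ acts as the tolerated relative margin above the measured delay.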
Preferably, the sender sending, through the FPGA, a detection message carrying the first status bit to the recipient specifically includes:
The sender periodically sends, through the FPGA and at the preset period DetectInterval, a detection message carrying the first status bit to the recipient.
Preferably, determining that communication is normal specifically includes: sending information that the recipient system is working normally to the sender CPU, then setting the Timeout timer to full and entering the next detection cycle;
Determining that the recipient system is working abnormally is specifically: clearing the TimeOut timer to 0, clearing the DetectInterval timer to 0 at the same time, and sending information that the recipient system is working abnormally to the sender CPU;
Determining that the link is abnormal is specifically: clearing the TimeOut timer to 0, clearing the DetectInterval timer to 0 at the same time, and sending suspected-link-failure information to the sender CPU.
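The three outcomes and their follow-up actions can be summarized as a lookup table. The sketch below is a hypothetical software summary of the text above; the names `Actions` and `actions_for` are invented, and the "keep" entry for the DetectInterval timer in the normal case is an assumption, since the source does not say what happens to that timer when communication is normal.

```python
from typing import Dict, NamedTuple


class Actions(NamedTuple):
    timeout_timer: str          # "fill" (set to full) or "clear" (clear to 0)
    detect_interval_timer: str  # "keep" (assumed: left running) or "clear"
    report_to_cpu: str          # information sent to the sender CPU


ACTIONS: Dict[str, Actions] = {
    "communication normal": Actions("fill", "keep", "recipient system working normally"),
    "recipient system abnormal": Actions("clear", "clear", "recipient system working abnormally"),
    "link abnormal": Actions("clear", "clear", "suspected link failure"),
}


def actions_for(verdict: str) -> Actions:
    """Look up what the sender FPGA does for a given verdict."""
    return ACTIONS[verdict]
```

For instance, `actions_for("link abnormal").report_to_cpu` yields the suspected-link-failure report that triggers the CPU's follow-up.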
Preferably, after determining that the link is abnormal, the method further includes: the sender CPU initiates an ICMP echo request message to the recipient and locates the link fault from the error information obtained.
In a second aspect, an embodiment of the present invention further provides a system for judging link failure. The system includes:
A sender and a recipient, where the sender includes a first FPGA module, a second FPGA module and a sender CPU, and the recipient includes a third FPGA module and a recipient CPU;
The first FPGA module is configured to send a detection message carrying a first status bit to the recipient and wait for the recipient's reply message;
The third FPGA module is configured to receive the detection message, update the first status bit according to the working state of the recipient CPU, and return the reply message carrying the first status bit to the sender;
The second FPGA module is configured to receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value. If so, communication is determined to be normal. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally. If the sender does not receive the reply message within Timeout, the link is determined to be abnormal.
Preferably, the first FPGA module is specifically configured to: send a detection message carrying the first status bit to the recipient, start timing, and wait for the recipient's reply message;
The second FPGA module is specifically configured to: receive the reply message within the preset timeout Timeout, stop timing to obtain the round-trip delay t1, send t1 to the sender CPU, and judge whether the first status bit is the preset normal status value. If so, communication is determined to be normal. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally. If the sender does not receive the reply message within Timeout, the link is determined to be abnormal;
The sender CPU is configured to calculate an updated Timeout from t1 and feed the updated Timeout back to the second FPGA module.
Preferably, the sender CPU is configured to judge whether this is the first power-on or the link has changed; if so, it calculates an updated Timeout from t1 and feeds it back to the sender's FPGA; if not, that is, when the link is unchanged, it does not update the Timeout value in the FPGA.
Preferably, the sender CPU is specifically configured to calculate the updated Timeout from t1 and the user-set δ as Timeout = t1 + t1 × δ.
Preferably, the first FPGA module is specifically configured to periodically send a detection message carrying the first status bit to the recipient at the preset period DetectInterval.
Preferably, the second FPGA module is specifically configured to: receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value. If so, it sends information that the recipient system is working normally to the sender CPU, then sets the Timeout timer to full and enters the next detection cycle. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, it clears the TimeOut timer to 0, clears the DetectInterval timer to 0 at the same time, and sends information that the recipient system is working abnormally to the sender CPU. If the sender does not receive the reply message through the FPGA within Timeout, it clears the TimeOut timer to 0, clears the DetectInterval timer to 0 at the same time, and sends suspected-link-failure information to the sender CPU.
Preferably, the sender CPU is specifically configured to: calculate an updated Timeout and feed it back to the second FPGA module; and, after obtaining the suspected-link-failure information, initiate an ICMP echo request message to the recipient and locate the link fault from the error information obtained.
The present invention uses FPGA hardware to execute the protocol between the sender and the recipient: the returning side (that is, the sender) initiates a link detection request, and the recipient, on receiving the request message, replies to confirm it. Executing the protocol in hardware effectively reduces CPU load and improves utilization. Fault analysis can also be carried out effectively when a CPU fails: when an application deadlocks, interference causes the program to run away, or another CPU fault causes a crash, the detecting side can still determine that the fault is a system failure rather than a link failure.
Brief description of the drawings
Fig. 1 is a flow chart of a method for judging link failure in Embodiment one of the present invention;
Fig. 2 is a first flow chart of a method for judging link failure in Embodiment two of the present invention;
Fig. 3 is a second flow chart of a method for judging link failure in Embodiment two of the present invention;
Fig. 4 is a flow chart of the sender in the example of Embodiment two and Embodiment four of the present invention;
Fig. 5 is a flow chart of the recipient in the example of Embodiment two and Embodiment four of the present invention;
Fig. 6 is a flow chart of the time negotiation interaction in the example of Embodiment two and Embodiment four of the present invention;
Fig. 7 is a structural block diagram of a system for judging link failure in Embodiment three of the present invention;
Fig. 8 is a structural block diagram of a system for judging link failure in Embodiment four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flow chart of a method for judging link failure provided by Embodiment one of the present invention. This embodiment is applicable to the interaction of two devices in an IP network, and in particular to the interaction between a base station and the core network in an IP network. The method can be performed by an IP network that includes a sender equipped with an FPGA and a recipient equipped with an FPGA, and specifically includes the following steps:
S110: The sender sends, through the FPGA, a detection message carrying the first status bit to the recipient and waits for the recipient's reply message.
Here the sender includes a sender system with a sender CPU and a sender FPGA module. The sender can be a base station, comprising a system with a base station CPU and a base station FPGA module. The first status bit represents the working state of the recipient CPU; in this step it holds a default value. Specifically, the sender sends, through the FPGA, a detection message carrying the first status bit to the recipient, starts the timer for the default preset timeout Timeout, and waits for the recipient's reply message.
S120: The recipient receives the detection message through the FPGA, updates the first status bit according to the working state of the recipient CPU, and returns a reply message carrying the first status bit to the sender.
Here the recipient includes a recipient system with a recipient CPU and a recipient FPGA module. The recipient can be the core network, comprising a system with a core network CPU and a core network FPGA module. The first status bit is updated according to the working state of the core network CPU: if the working state is normal, the first status bit is set to the preset normal status value; if the working state is abnormal, it is set to the preset abnormal status value.
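The recipient's side of S120 can be sketched in software as follows. The source does not specify a message format, so the byte layout here (message type, sequence number, State) is purely hypothetical, as are the names `LAYOUT` and `build_reply`; the State encoding 1 = normal, 0 = abnormal follows the example in Embodiment two.

```python
import struct

# Hypothetical wire layout, not taken from the source:
# 1 byte message type, 4 bytes sequence number, 1 byte State (network byte order).
LAYOUT = struct.Struct("!BIB")
DETECT_REQUEST = 1
DETECT_REPLY = 2
STATE_NORMAL = 1
STATE_ABNORMAL = 0


def build_reply(request: bytes, cpu_is_healthy: bool) -> bytes:
    """Turn a detection message into a reply carrying the updated status bit."""
    msg_type, seq, _state = LAYOUT.unpack(request)
    if msg_type != DETECT_REQUEST:
        raise ValueError("not a detection message")
    state = STATE_NORMAL if cpu_is_healthy else STATE_ABNORMAL
    return LAYOUT.pack(DETECT_REPLY, seq, state)
```

Because the reply is built without involving the recipient CPU beyond reading its health flag, a reply with State = 0 can still be produced when the CPU itself is down, which is what lets the sender tell a system failure from a link failure.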
S130: If the sender receives the reply message through the FPGA within the preset timeout Timeout, it judges whether the first status bit is the preset normal status value. If so, communication is determined to be normal. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally and information that the recipient system is working abnormally is sent to the sender CPU. If the sender does not receive the reply message through the FPGA within Timeout, the link is determined to be abnormal.
For example: if the sender receives the reply message through the FPGA within the preset timeout Timeout, it judges whether the first status bit is the preset normal status value. If so, communication is determined to be normal and information that the recipient system is working normally is sent to the sender CPU. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally, a system failure is concluded, and information that the recipient system is working abnormally is sent to the sender CPU. If the sender does not receive the reply message through the FPGA within Timeout, the link is determined to be abnormal and suspected-link-failure information is sent to the sender CPU.
Here, the branch "if not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, information that the recipient system is working abnormally is sent to the sender CPU" is specifically: if not, information that the recipient system is suspected of working abnormally is sent to the sender CPU, and a detection message carrying the first status bit is sent to the recipient again through the FPGA. If the first status bit of the reply message received again is the preset abnormal status value, another detection message carrying the first status bit is sent to the recipient, and so on until the number of transmissions exceeds the preset number. Sending then stops, and information that the recipient system is working abnormally is sent to the sender CPU.
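The repeated probing described above can be sketched as a loop. This is a minimal illustration under stated assumptions: `send_probe` is a hypothetical callable standing in for the FPGA send-and-wait step, it returns the reply's status bit or `None` on timeout, and the abnormal verdict is issued once `preset_count` consecutive abnormal replies have been seen (matching the "third detection" example in Embodiment two).

```python
from typing import Callable, Optional

NORMAL_STATE = 1  # assumed encoding of the preset normal status value


def probe_recipient(send_probe: Callable[[], Optional[int]], preset_count: int) -> str:
    """Send up to preset_count detection messages and return a verdict."""
    if preset_count < 1:
        raise ValueError("preset_count must be at least 1")
    for _attempt in range(preset_count):
        state = send_probe()
        if state is None:
            return "link abnormal"          # no reply within Timeout
        if state == NORMAL_STATE:
            return "communication normal"
        # Abnormal status bit: the recipient is only suspected; probe again.
    return "recipient system abnormal"
```

A simulated run with three abnormal replies, `probe_recipient(iter([0, 0, 0]).__next__, 3)`, ends with the recipient-abnormal verdict, while a normal reply on any attempt ends the loop early.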
"Within the preset timeout Timeout" refers to the period that starts timing when the sender sends a detection message carrying the first status bit in S110.
In the technical solution of this embodiment, FPGA hardware executes the protocol between the sender and the recipient: the sender initiates a link detection request, and the recipient, on receiving the request message, replies to confirm it. Executing the protocol in hardware effectively reduces CPU load and improves utilization. Fault analysis can also be carried out effectively when a CPU fails: when an application deadlocks, interference causes the program to run away, or another CPU fault causes a crash, the detecting side can still determine that the fault is a system failure rather than a link failure.
On the basis of the above technical solution, S110 can preferably be: the sender sends, through the FPGA, a detection message carrying the first status bit to the recipient, starts timing, and waits for the recipient's reply message. Here, starting timing means starting to record the round-trip delay. S120 can preferably be: the sender receives the reply message through the FPGA within the preset timeout Timeout, stops timing to obtain the round-trip delay t1, and sends t1 to the sender CPU, which calculates an updated Timeout and feeds it back to the sender's FPGA. The benefit of arranging S110 and S120 this way is that each time a reply message is received, the sender CPU recalculates Timeout and configures it into the FPGA module, which avoids misjudgment or poor real-time performance when the actual link delay is uncertain.
On the basis of the above technical solution, sending t1 to the sender CPU to calculate an updated Timeout in S110 is preferably: t1 is sent to the sender CPU, and the sender CPU calculates the updated Timeout from t1 and the user-set δ as Timeout = t1 + t1 × δ.
On the basis of the above technical solution, sending t1 to the sender CPU to calculate an updated Timeout in S110 is preferably: the sender CPU judges whether this is the first power-on or the link has changed; if so, t1 is sent to the sender CPU, which calculates an updated Timeout and feeds it back to the sender's FPGA; if not, that is, when the link is unchanged, the Timeout value in the FPGA is not updated. The benefit is that each time a reply message is received, the sender CPU decides from the link-change state whether to recalculate Timeout and configure it into the FPGA module. This reduces repeated calculation, is more efficient, avoids misjudgment or poor real-time performance when the actual link delay is uncertain, and yields a delay reference value consistent with the actual link delay whenever the link changes.
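The "recalculate only on first power-on or link change" rule can be sketched as a small policy object on the sender CPU side. The class name `TimeoutPolicy` and its methods are hypothetical; how a link change is detected is not described in the source, so it is passed in as a flag. The 1 s default timeout is taken from the example in Embodiment two.

```python
class TimeoutPolicy:
    """Sender-CPU side policy: recompute Timeout only when needed."""

    def __init__(self, delta: float, default_timeout_ms: float = 1000.0):
        self.delta = delta
        self.timeout_ms = default_timeout_ms
        self._first_power_on = True

    def on_round_trip(self, t1_ms: float, link_changed: bool) -> bool:
        """Process one measured round-trip delay.

        Returns True when a new Timeout was computed and should be
        written back to the FPGA, False when the old value is kept.
        """
        if self._first_power_on or link_changed:
            self.timeout_ms = t1_ms + t1_ms * self.delta
            self._first_power_on = False
            return True
        return False
```

Returning a flag rather than always writing the value back mirrors the stated benefit: the FPGA is reconfigured only when the reference delay may actually have changed.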
Embodiment two
Fig. 2 is a flow chart of a method for judging link failure provided by Embodiment two of the present invention. On the basis of the above embodiments, this embodiment further optimizes the sender's sending and receiving steps.
S210: The sender periodically sends, through the FPGA and at the preset period DetectInterval, a detection message carrying the first status bit to the recipient and waits for the recipient's reply message.
S220: The recipient receives the detection message through the FPGA, updates the first status bit according to the working state of the recipient CPU, and returns a reply message carrying the first status bit to the sender.
S230: If the sender receives the reply message through the FPGA within the preset timeout Timeout, it judges whether the first status bit is the preset normal status value. If so, information that the recipient system is working normally is sent to the sender CPU, the Timeout timer is then set to full, and the next detection cycle begins. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the TimeOut timer is cleared to 0, the DetectInterval timer is cleared to 0 at the same time, a system failure is concluded, and information that the recipient system is working abnormally is sent to the sender CPU. If the sender does not receive the reply message through the FPGA within Timeout, the TimeOut timer is cleared to 0, the DetectInterval timer is cleared to 0 at the same time, and suspected-link-failure information is sent to the sender CPU.
Here, setting the Timeout timer to full prepares for the next probe in the next detection cycle: when a detection message carrying the first status bit is sent again according to S210, the timer is started from the full Timeout value. Clearing the TimeOut timer to 0 means the Timeout waiting period is treated as used up. Clearing the DetectInterval timer to 0 means the DetectInterval period is treated as used up, so S210 must be performed to send the next cycle's detection message carrying the first status bit to the recipient.
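The "set to full" and "clear to 0" semantics can be modelled with a down-counting timer. This is a software stand-in for illustration only; the class `CountdownTimer` and its method names are hypothetical, and the real timers live in the FPGA.

```python
class CountdownTimer:
    """Software model of a down-counting hardware timer."""

    def __init__(self, period_ms: int):
        self.period_ms = period_ms
        self.remaining_ms = 0

    def fill(self) -> None:
        """'Set to full': reload the whole period for the next countdown."""
        self.remaining_ms = self.period_ms

    def clear(self) -> None:
        """'Clear to 0': treat the period as used up right now."""
        self.remaining_ms = 0

    def tick(self, elapsed_ms: int) -> None:
        """Advance the countdown by elapsed_ms, stopping at zero."""
        self.remaining_ms = max(0, self.remaining_ms - elapsed_ms)

    @property
    def expired(self) -> bool:
        return self.remaining_ms == 0
```

For example, a DetectInterval timer that has been filled with 1000 ms and ticked for 300 ms is not yet expired, but calling `clear()` makes it expire at once, which is how an immediate re-probe is forced after an abnormal reply.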
As shown in Fig. 4, Fig. 5 and Fig. 6, this embodiment is explained with the following example. On the main testing side, the user configures an acceptable delay error coefficient δ and a detection interval DetectInterval. δ: the delay error coefficient, with a range of 0 to 5 and a default of 1. DetectInterval: the fault detection interval, with a range of 1 ms to 1 s and a default of 1 s. TimeOut: the wait timeout, with a default of 1 s.
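The user-configurable parameters and their stated ranges can be captured in a validated configuration object. The class name `DetectConfig` and the field names are hypothetical; the ranges and defaults come from the paragraph above, except the positivity check on the timeout, which is an added assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectConfig:
    delta: float = 1.0                  # delay error coefficient, range 0 to 5
    detect_interval_ms: float = 1000.0  # detection interval, 1 ms to 1 s
    timeout_ms: float = 1000.0          # initial wait timeout, default 1 s

    def __post_init__(self) -> None:
        if not 0 <= self.delta <= 5:
            raise ValueError("delta must be within 0..5")
        if not 1 <= self.detect_interval_ms <= 1000:
            raise ValueError("DetectInterval must be within 1 ms..1 s")
        if self.timeout_ms <= 0:  # assumed check, not stated in the source
            raise ValueError("TimeOut must be positive")
```

Constructing `DetectConfig()` yields the documented defaults, and an out-of-range value such as `DetectConfig(delta=6)` is rejected before it can reach the FPGA.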
1) The sender eNode periodically sends, through the FPGA and at intervals of DetectInterval, a DetectRequest message with State=0 to the device under test, starts the TimeOut timer (1 s by default), and then waits for the reply message from the recipient EPC.
2) After the recipient EPC receives the message, it sets the State bit to 1 if the recipient EPC system is normal and otherwise sets the State bit to 0, and returns a DetectReply message to the initiating sender eNode.
3) After the sender eNode receives the DetectReply through the FPGA, it stops counting, records the round-trip delay t1, and sends t1 and State to the sender CPU.
The sender eNode CPU calculates, from t1 and the user-set δ, a wait timeout TimeOut whose error margin meets the customer's requirement, and feeds it back to the sender's FPGA: TimeOut = t1 + t1 × δ. Fault judgment is carried out at the same time.
If the sender eNode receives the reply message through the FPGA within the preset timeout Timeout and finds State=1, the receiving device system is considered normal and TimeOut is set to full.
If the sender eNode receives, through the FPGA within the preset timeout Timeout, a reply message with State=0, the recipient system is suspected to be down. TimeOut is cleared to 0 and DetectInterval is cleared to 0 at the same time, and a second detection of the device under test is initiated immediately. If a reply with State=0 is still received after the third detection, the recipient EPC system is considered to be down. A receiving-device failure message is sent to the CPU, and data return to the recipient is stopped or the return path is switched to a backup receiving device.
If the sender eNode does not receive any reply through the FPGA within the preset timeout Timeout, the link is suspected to be unreachable. TimeOut is cleared to 0 and DetectInterval is cleared to 0 at the same time. The CPU is notified, and the sender eNode CPU initiates an ICMP echo request message to the recipient and locates the link fault from the corresponding error information.
In the technical solution of this embodiment, delay negotiation is carried out in hardware (the FPGA), which effectively reduces CPU load and improves utilization. Time negotiation determines a wait timeout consistent with the service, which effectively avoids the fault misjudgment caused by unstable, non-fixed link delay. Fault analysis can also be carried out effectively when a CPU fails: when an application deadlocks, interference causes the program to run away, or another CPU fault causes a crash, the detecting side can still determine that the fault is a system failure rather than a link failure.
On the basis of each of the above embodiments, after the link is determined to be abnormal in S230, the method further includes S240: the sender CPU initiates an ICMP echo request message to the recipient and locates the link fault from the error information obtained. The benefit of S240 is that the location of the link fault can be found; through S210 to S240 the location of a link fault can be diagnosed quickly.
Embodiment three
Fig. 7 is a structural diagram of a system for judging link failure provided by Embodiment three of the present invention. This embodiment is applicable to the interaction of any two devices in an IP network, and in particular to the interaction between a base station and the core network in an IP network. The specific structure of the system is as follows:
The system includes a sender 310 and a recipient 320. The sender 310 includes a first FPGA module 311, a second FPGA module 312 and a sender CPU 313; the recipient 320 includes a third FPGA module 321 and a recipient CPU 322. The sender CPU 313 is connected to the first FPGA module 311 and the second FPGA module 312, and the recipient CPU 322 is connected to the third FPGA module 321. The first FPGA module 311, the second FPGA module 312 and the third FPGA module 321 are connected by IP network links to carry out their interactive communication.
The first FPGA module 311 is configured to send a detection message carrying the first status bit to the recipient and wait for the recipient's reply message.
Here the sender includes a sender system with the sender CPU 313, the first FPGA module 311 and the second FPGA module 312. In hardware, the first FPGA module 311 and the second FPGA module 312 can be the same FPGA chip. The sender can be a base station, comprising a system with a base station CPU and a base station FPGA module. The first status bit represents the working state of the recipient CPU 322; in this step it holds a default value. Specifically, the first FPGA module 311 sends a detection message carrying the first status bit to the recipient, starts the timer for the default preset timeout Timeout, and waits for the recipient's reply message.
The third FPGA module 321 is configured to receive the detection message, update the first status bit according to the working state of the recipient CPU, and return a reply message carrying the first status bit to the sender.
Here the recipient 320 includes a recipient system with the recipient CPU 322 and the recipient FPGA module 321. The recipient 320 can be the core network, comprising a system with a core network CPU and a core network FPGA module. The first status bit is updated according to the working state of the core network CPU: if the working state is normal, the first status bit is set to the preset normal status value; if the working state is abnormal, it is set to the preset abnormal status value.
The second FPGA module 312 is configured to receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value. If so, communication is determined to be normal. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally. If the sender does not receive the reply message within Timeout, the link is determined to be abnormal. In the second FPGA module 312, the branch "if the first status bit is not the preset normal status value, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, information that the recipient system is working abnormally is sent to the sender CPU 313" is specifically: if the first status bit is not the preset normal status value, information that the recipient system is suspected of working abnormally is sent to the sender CPU 313, and a detection message carrying the first status bit is sent to the recipient again. If the first status bit of the reply message received again is the preset abnormal status value, another detection message carrying the first status bit is sent to the recipient, and so on until the number of transmissions exceeds the preset number. Sending then stops, and information that the recipient system is working abnormally is sent to the sender CPU 313.
"Within the preset timeout Timeout" refers to the period that starts timing when the sender 310 sends a detection message carrying the first status bit through the first FPGA module 311.
For example, the second FPGA module is configured to receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value. If so, communication is determined to be normal and information that the recipient system is working normally is sent to the sender CPU. If not, and the first status bit received for each of a preset number of detection messages is the preset abnormal status value, the recipient system is determined to be working abnormally and information that the recipient system is working abnormally is sent to the sender CPU. If the sender does not receive the reply message within Timeout, the link is determined to be abnormal.
In the technical solution of this embodiment, FPGA hardware executes the protocol between the sender and the recipient: the returning side (that is, the sender) initiates a link detection request, and the recipient, on receiving the request message, replies to confirm it. Executing the protocol in hardware effectively reduces CPU load and improves utilization. Fault analysis can also be carried out effectively when a CPU fails: when an application deadlocks, interference causes the program to run away, or another CPU fault causes a crash, the detecting side can still determine that the fault is a system failure rather than a link failure.
On the basis of the above technical solution, the first FPGA module 311 may preferably be specifically used to: send one detection message carrying the first status bit to the recipient, start timing, and wait for the recipient's reply message. Here, starting timing means starting to record the round-trip delay. The second FPGA module 312 may preferably be specifically used to: receive the reply message within the preset timeout Timeout, stop timing to obtain the round-trip delay t1, send t1 to the sender CPU, and judge whether the first status bit is the preset normal status value. If so, the information that the receiver system is working normally is sent to the sender CPU. If not, and it is determined that the first status bit received for a preset number of detection messages is the preset abnormal status value, the information that the receiver system is working abnormally is sent to the sender CPU. If the sender does not receive the reply message within Timeout, suspected link failure information is sent to the sender CPU.
The sender CPU 313 is used to calculate an updated Timeout from t1 and to feed the updated Timeout back to the second FPGA module 312. The benefit of arranging the first FPGA module 311, the second FPGA module 312 and the sender CPU 313 in this way is that, each time a reply message is received, the sender CPU recalculates Timeout and configures it into the FPGA module, which avoids misjudgment or poor real-time performance when the physical link delay is uncertain.
On the basis of the above technical solution, the sender CPU 313 is preferably specifically used to: calculate the updated Timeout = t1 + t1 × δ from t1 and the user-set δ.
With the user-set parameter δ, the time negotiation method determines a wait timeout Timeout that matches the service, which effectively avoids fault misjudgments caused by unstable or non-fixed link delay.
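The update rule Timeout = t1 + t1 × δ is simple enough to show directly. The sketch below is illustrative only; the function name `updated_timeout`, the choice of milliseconds as the unit, and the rejection of negative inputs are assumptions for the example.

```python
def updated_timeout(t1_ms: float, delta: float) -> float:
    """Return the negotiated wait timeout Timeout = t1 + t1 * delta.

    t1_ms is the measured round-trip delay (milliseconds assumed here);
    delta is the user-set delay error coefficient.
    """
    if t1_ms < 0 or delta < 0:
        raise ValueError("t1 and delta must be non-negative")
    return t1_ms + t1_ms * delta
```

With a measured round trip of 20 ms and δ = 0.5, the timeout becomes 30 ms; with δ = 0 the timeout equals the measured round trip, leaving no margin for jitter.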
On the basis of the above technical solution, when t1 is sent to the sender CPU to calculate the updated Timeout, the sender CPU 313 preferably proceeds as follows: it first judges whether this is the first power-up or whether the link has changed. If so, it calculates the updated Timeout from t1 and feeds the updated Timeout back to the FPGA; if not, that is, the link is unchanged, it does not update the Timeout value in the FPGA. The benefit is that, each time a reply message is received, the sender CPU decides from the link-change status whether to recalculate Timeout and reconfigure the FPGA module. This reduces repeated calculation and is more efficient, avoids misjudgment or poor real-time performance when the physical link delay is uncertain, and yields a delay reference value consistent with the physical link delay whenever the link changes.
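The conditional update described above might look like the following sketch on the sender CPU. The class name `TimeoutNegotiator` and the `link_changed` flag are hypothetical: the text does not say how the CPU detects a link change, so the example simply takes that decision as an input.

```python
from typing import Optional


class TimeoutNegotiator:
    """Recompute Timeout only on first power-up or after a link change."""

    def __init__(self, delta: float, default_timeout_ms: float) -> None:
        self.delta = delta
        self.timeout_ms = default_timeout_ms
        self.first_power_up = True

    def on_reply(self, t1_ms: float, link_changed: bool) -> Optional[float]:
        """Return the new Timeout to configure into the FPGA, or None if unchanged."""
        if self.first_power_up or link_changed:
            self.first_power_up = False
            self.timeout_ms = t1_ms + t1_ms * self.delta
            return self.timeout_ms
        return None  # link unchanged: keep the value already held by the FPGA
```

Returning `None` when nothing changes makes it explicit that no write to the FPGA is needed, which is the saving the paragraph above refers to.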
Example IV
Fig. 8 is a structural schematic diagram of a system for judging link failure provided by Embodiment Four of the present invention. On the basis of the above embodiments, this embodiment preferably further optimizes the first FPGA module and the second FPGA module. The system includes a sender 410 and a recipient 420. The sender 410 includes a first FPGA module 411, a second FPGA module 412 and a sender CPU 413; the recipient 420 includes a third FPGA module 421 and a recipient CPU 422. The sender CPU 413 is connected to the first FPGA module 411 and the second FPGA module 412 respectively, and the recipient CPU 422 is connected to the third FPGA module 421. The first FPGA module 411 and the second FPGA module 412 are each connected to the third FPGA module 421 over an IP network link to complete the interactive communication.
The first FPGA module 411 is specifically used to: periodically, at the preset period DetectInterval, send a detection message carrying the first status bit to the recipient and wait for the recipient's reply message.
The third FPGA module 421 is used to receive the detection message, update the first status bit according to the working status of the recipient CPU, and reply to the sender with the reply message carrying the first status bit.
The second FPGA module 412 is specifically used to: receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value. If so, the information that the receiver system is working normally is sent to the sender CPU, the Timeout timer is reset to full, and the next detection cycle begins. If not, and it is determined that the first status bit received for a preset number of detection messages is the preset abnormal status value, the TimeOut timer is cleared to 0, the DetectInterval timer is cleared to 0 at the same time, and the information that the receiver system is working abnormally is sent to the sender CPU. If no reply message is received within Timeout, the TimeOut timer is cleared to 0, the DetectInterval timer is cleared to 0 at the same time, and suspected link failure information is sent to the sender CPU.
Here, resetting the Timeout timer to full means restarting the Timeout wait as needed; for example, the timer that has been reset to full is started when a detection message carrying the first status bit is sent again by the second FPGA module 412. Clearing the TimeOut timer to 0 means that the Timeout wait has reached its limit. Clearing the DetectInterval timer to 0 means that the DetectInterval period has run out, so the second FPGA module 412 must perform the next cycle's operation of sending a detection message carrying the first status bit to the recipient.
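The two timer operations ("reset to full" and "clear to 0") can be modelled in software as a countdown timer. This is a sketch for illustration only: the class `CountdownTimer`, its `tick` method and the millisecond unit are assumptions, and the actual FPGA timers are not described at this level in the text.

```python
class CountdownTimer:
    """Software model of a timer that counts down from a full value to 0."""

    def __init__(self, full_value_ms: int) -> None:
        self.full_value_ms = full_value_ms
        self.remaining_ms = 0

    def set_full(self) -> None:
        """Reset to full: a fresh wait period begins."""
        self.remaining_ms = self.full_value_ms

    def clear(self) -> None:
        """Clear to 0: treat the period as already used up."""
        self.remaining_ms = 0

    def tick(self, elapsed_ms: int) -> None:
        """Advance time, never going below 0."""
        self.remaining_ms = max(0, self.remaining_ms - elapsed_ms)

    @property
    def expired(self) -> bool:
        return self.remaining_ms == 0
```

In this model, clearing the DetectInterval timer makes it expire at once, which is what lets the next detection message go out immediately instead of waiting for the rest of the period.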
This embodiment is explained through the following example. On the active detecting side, the user configures an acceptable delay error coefficient δ and a detection interval DetectInterval. δ: delay error coefficient, range 0 to 5, default value 1. DetectInterval: time interval of fault detection, range 1 ms to 1 s, default 1 s. TimeOut: wait timeout interval, default 1 s.
1) The first FPGA module of the sender eNode periodically, at the DetectInterval interval, sends a DetectRequest message with State=0 to the device under detection, starts the TimeOut timer (1 s by default), and then waits for the recipient's reply message.
2) After the third FPGA module of the recipient EPC receives the message, it sets State to 1 if the receiver system is normal and to 0 otherwise, and replies with a DetectReply message to the device that initiated the backhaul.
3) After the second FPGA module of the sender eNode receives the DetectReply, it stops counting, records this round-trip delay t1, and sends t1 and State to the sender CPU.
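Steps 1) to 3) can be sketched as a request/reply exchange. The message layout below is hypothetical: the text names the DetectRequest and DetectReply messages and the State bit but does not define their wire format, so plain data classes stand in for them, and timestamps are passed in explicitly instead of being read from a hardware counter.

```python
from dataclasses import dataclass


@dataclass
class DetectRequest:
    state: int = 0  # the sender always sends State=0


@dataclass
class DetectReply:
    state: int  # 1 if the receiver system is normal, 0 otherwise


def receiver_reply(request: DetectRequest, receiver_healthy: bool) -> DetectReply:
    """Step 2: the receiver sets State according to its system status."""
    return DetectReply(state=1 if receiver_healthy else 0)


def round_trip_delay(sent_at_ms: int, received_at_ms: int) -> int:
    """Step 3: t1 is the time between sending the request and receiving the reply."""
    return received_at_ms - sent_at_ms
```

A healthy receiver answering a request sent at 100 ms with a reply arriving at 120 ms gives State=1 and t1 = 20 ms, which are the two values handed to the sender CPU.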
From t1 and the user-set δ, the sender eNode calculates a wait timeout interval TimeOut whose real-time error meets the customer's requirement, TimeOut = t1 + t1 × δ, feeds it back to the second FPGA module of the sender eNode, and carries out fault judgment at the same time.
If the sender eNode receives the reply message through the second FPGA module within the preset timeout Timeout and judges State=1, the recipient EPC system is considered normal and TimeOut is reset to full.
If the sender eNode receives, through the second FPGA module within the preset timeout Timeout, a reply message with State=0, the receiver system is suspected to be down. TimeOut and DetectInterval are cleared to 0, and a second detection is immediately initiated to the device under test. If a reply with State=0 is still received after the third detection, the receiver system is considered down. A receiving-device fault message is sent to the sender eNode CPU, and the sender stops backhauling to that device or switches the backhaul path to a backup receiving device.
If the sender eNode receives no reply through the second FPGA module within the preset timeout Timeout, the link is suspected to be unreachable. TimeOut and DetectInterval are cleared to 0 and the CPU is notified. The sender eNode CPU then initiates an ICMP echo request message to the recipient and locates the link fault from the corresponding error message.
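The fault judgment in this example, including the immediate re-detection after a State=0 reply, can be sketched as a short loop. The function `diagnose`, the `probe` callback and the result strings are hypothetical names chosen for the illustration; the limit of three detections follows the example above.

```python
from typing import Callable, Optional

MAX_DETECTIONS = 3  # the example treats the receiver as down after the third State=0 reply

RECEIVER_NORMAL = "receiver normal"
RECEIVER_DOWN = "receiver down: stop backhaul or switch to backup receiver"
LINK_SUSPECTED = "suspected link fault: notify CPU, start ICMP echo request"


def diagnose(probe: Callable[[], Optional[int]]) -> str:
    """Run up to MAX_DETECTIONS detections.

    probe() performs one detection and returns the State bit of the reply,
    or None when no reply arrives within Timeout.
    """
    for _ in range(MAX_DETECTIONS):
        state = probe()
        if state is None:
            return LINK_SUSPECTED
        if state == 1:
            return RECEIVER_NORMAL
        # State=0: both timers are cleared, so the next detection starts at once
    return RECEIVER_DOWN
```

A single State=0 reply followed by State=1 is therefore treated as a transient condition, and only three consecutive State=0 replies lead to the switch to a backup receiver.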
In the technical solution of this embodiment, delay negotiation is carried out in hardware (FPGA), which effectively reduces CPU load and improves utilization. The time negotiation method determines a wait timeout that matches the specific service, which effectively avoids errors caused by unstable link delay, link switching and subjective judgment. With the negotiated delay, fault judgment can be made quickly and in real time. Negotiating in hardware also allows fault analysis while the CPU is faulty: when the application deadlocks, interference causes the program to run away, or another CPU fault brings the system down, the detecting side can still determine that the fault is a system fault rather than a link fault.
On the basis of the above embodiments, the sender CPU 413 is specifically used to: calculate the updated Timeout and feed the updated Timeout back to the second FPGA module; and, after obtaining the suspected link failure information, initiate an ICMP echo request message to the recipient and locate the link fault from the error message obtained. The benefit is that the sender CPU 413 can find the position of the link fault through the ICMP echo request message, so this system for judging link failure can quickly diagnose where the link has failed.
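As an illustration of the ICMP step, the sketch below only builds an ICMPv4 echo request (type 8, code 0) with the standard Internet checksum. Actually sending it requires a raw socket and elevated privileges, which is left out here. The function names are hypothetical, and how the sender CPU interprets the returned error messages is not specified in the text.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMPv4 type for echo request; code is 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return the ICMP message bytes (header plus payload) with a valid checksum."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload
```

A message built this way verifies to a checksum of 0 when the checksum is recomputed over the whole message, which is the usual receiver-side validity check.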
In summary, the present invention judges link failure in hardware, which effectively reduces CPU load and improves utilization. When the system is down, it can still report the system status to the detecting side, so that the detecting side can determine that the fault is a system fault rather than a transmission link fault. The time negotiation method determines the wait timeout, which effectively avoids misjudgment caused by unstable link delay, link switching and subjective judgment, and with the negotiated delay fault judgment can be made quickly and in real time. Negotiating in hardware also allows fault analysis while the CPU is faulty: when the application deadlocks, interference causes the program to run away, or another CPU fault brings the system down, the detecting side can still determine that the fault is a system fault rather than a link fault. In a redundant IP network, when the logical link changes, the present invention can automatically calculate and adjust a wait timeout Timeout that matches the actual transmission, so that fault judgment is carried out in real time, and fault judgment and fault analysis can also be carried out effectively when the CPU is faulty.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the invention. Therefore, although the invention has been described in further detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from the inventive concept, and the scope of the invention is determined by the scope of the appended claims.
Claims (14)
- 1. A method for judging link failure, characterized by comprising: a sender sends, through an FPGA, a detection message carrying a first status bit to a recipient and waits for the recipient's reply message; the recipient receives the detection message through an FPGA, updates the first status bit according to the working status of the recipient CPU, and replies to the sender with the reply message carrying the first status bit; if the sender receives the reply message through the FPGA within a preset timeout Timeout, it judges whether the first status bit is a preset normal status value, and if so, determines that communication is normal; if not, and it is determined that the first status bit received for a preset number of the detection messages is a preset abnormal status value, it determines that the receiver system is working abnormally; if the sender does not receive the reply message through the FPGA within Timeout, it determines that the link is abnormal.
- 2. The method according to claim 1, characterized in that the sender sending, through the FPGA, the detection message carrying the first status bit to the recipient and waiting for the recipient's reply message specifically includes: the sender sends, through the FPGA, one detection message carrying the first status bit to the recipient, starts timing, and waits for the recipient's reply message; and the sender receiving the reply message through the FPGA within Timeout specifically includes: the sender receives the reply message through the FPGA within the preset timeout Timeout, stops timing to obtain the round-trip delay t1, sends t1 to the sender CPU to calculate an updated Timeout, and feeds the updated Timeout back to the FPGA.
- 3. The method according to claim 2, characterized in that sending t1 to the sender CPU to calculate the updated Timeout and feeding the updated Timeout back to the FPGA is specifically: the sender CPU judges whether this is the first power-up or whether the link has changed; if so, t1 is sent to the sender CPU to calculate the updated Timeout and the updated Timeout is fed back to the FPGA; if not, that is, the link is unchanged, the Timeout value in the FPGA is not updated.
- 4. The method according to claim 2, characterized in that sending t1 to the sender CPU to calculate the updated Timeout specifically includes: t1 is sent to the sender CPU, and the sender CPU calculates the updated Timeout = t1 + t1 × δ from t1 and the user-set δ.
- 5. The method according to claim 1, characterized in that the sender sending, through the FPGA, the detection message carrying the first status bit to the recipient specifically includes: the sender periodically sends, at the preset period DetectInterval in the FPGA, one detection message carrying the first status bit to the recipient.
- 6. The method according to claim 1, characterized in that determining that communication is normal specifically includes: sending the information that the receiver system is working normally to the sender CPU, then resetting the Timeout timer to full and entering the next detection cycle; determining that the receiver system is working abnormally is specifically: clearing the TimeOut timer to 0, clearing the DetectInterval timer to 0 at the same time, and sending the information that the receiver system is working abnormally to the sender CPU; and determining that the link is abnormal is specifically: clearing the TimeOut timer to 0, clearing the DetectInterval timer to 0 at the same time, and sending suspected link failure information to the sender CPU.
- 7. The method according to claim 1, characterized in that, after determining that the link is abnormal, the method further includes: the sender CPU initiates an ICMP echo request message to the recipient and locates the link fault from the error message obtained.
- 8. A system for judging link failure, characterized by comprising a sender and a recipient, the sender including a first FPGA module, a second FPGA module and a sender CPU, and the recipient including a third FPGA module and a recipient CPU; the first FPGA module is used to send a detection message carrying a first status bit to the recipient and wait for the recipient's reply message; the third FPGA module is used to receive the detection message, update the first status bit according to the working status of the recipient CPU, and reply to the sender with the reply message carrying the first status bit; the second FPGA module is used to receive the reply message within a preset timeout Timeout and judge whether the first status bit is a preset normal status value, and if so, determine that communication is normal; if not, and it is determined that the first status bit received for a preset number of the detection messages is a preset abnormal status value, determine that the receiver system is working abnormally; if the sender does not receive the reply message within Timeout, determine that the link is abnormal.
- 9. The system according to claim 8, characterized in that the first FPGA module is specifically used to: send one detection message carrying the first status bit to the recipient, start timing, and wait for the recipient's reply message; the second FPGA module is specifically used to: receive the reply message within the preset timeout Timeout, stop timing to obtain the round-trip delay t1, send t1 to the sender CPU, and judge whether the first status bit is the preset normal status value, and if so, determine that communication is normal; if not, and it is determined that the first status bit received for a preset number of the detection messages is the preset abnormal status value, determine that the receiver system is working abnormally; if the sender does not receive the reply message within Timeout, determine that the link is abnormal; the sender CPU is used to calculate an updated Timeout from t1 and feed the updated Timeout back to the second FPGA module.
- 10. The system according to claim 9, characterized in that the sender CPU is used to judge whether this is the first power-up or whether the link has changed; if so, it calculates the updated Timeout from t1 and feeds the updated Timeout back to the FPGA; if not, that is, the link is unchanged, it does not update the Timeout value in the FPGA.
- 11. The system according to claim 8, characterized in that the sender CPU is specifically used to: calculate the updated Timeout = t1 + t1 × δ from t1 and the user-set δ.
- 12. The system according to claim 8, characterized in that the first FPGA module is specifically used to: periodically, at the preset period DetectInterval, send a detection message carrying the first status bit to the recipient.
- 13. The system according to claim 8, characterized in that the second FPGA module is specifically used to: receive the reply message within the preset timeout Timeout and judge whether the first status bit is the preset normal status value; if so, send the information that the receiver system is working normally to the sender CPU, then reset the Timeout timer to full and enter the next detection cycle; if not, and it is determined that the first status bit received for a preset number of the detection messages is the preset abnormal status value, clear the TimeOut timer to 0, clear the DetectInterval timer to 0 at the same time, and send the information that the receiver system is working abnormally to the sender CPU; if the sender does not receive the reply message through the FPGA within Timeout, clear the TimeOut timer to 0, clear the DetectInterval timer to 0 at the same time, and send suspected link failure information to the sender CPU.
- 14. The system according to claim 8, characterized in that the sender CPU is specifically used to: calculate the updated Timeout and feed the updated Timeout back to the second FPGA module; and, after obtaining the suspected link failure information, initiate an ICMP echo request message to the recipient and locate the link fault from the error message obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610340855.8A CN107404393A (en) | 2016-05-20 | 2016-05-20 | A kind of method and system for judging link failure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107404393A true CN107404393A (en) | 2017-11-28 |
Family
ID=60389537
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610340855.8A Withdrawn CN107404393A (en) | 2016-05-20 | 2016-05-20 | A kind of method and system for judging link failure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107404393A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050141563A1 (en) * | 1996-03-29 | 2005-06-30 | Cisco Technology, Inc., A California Corporation | Communication server apparatus providing XDSL services and method |
CN1925429A (en) * | 2006-09-30 | 2007-03-07 | 杭州华为三康技术有限公司 | Method and equipment for realizing fast detection |
CN102548011A (en) * | 2011-01-04 | 2012-07-04 | 中国移动通信集团公司 | Semi-persistent scheduling and receiving method, system and device of relaying access link |
CN104917624A (en) * | 2014-03-10 | 2015-09-16 | 华耀(中国)科技有限公司 | Health check system and method for link aggregation path |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108908402A (en) * | 2018-07-06 | 2018-11-30 | 浙江国自机器人技术有限公司 | A kind of detection method and system of robot hardware |
CN109327333A (en) * | 2018-09-30 | 2019-02-12 | 潍柴动力股份有限公司 | A kind of message stops paying out method and device |
CN110138657A (en) * | 2019-05-13 | 2019-08-16 | 北京东土军悦科技有限公司 | Aggregated link switching method, device, equipment and the storage medium of inter-exchange |
CN110138657B (en) * | 2019-05-13 | 2021-11-09 | 北京东土军悦科技有限公司 | Aggregation link switching method, device, equipment and storage medium between switches |
CN110519096A (en) * | 2019-08-29 | 2019-11-29 | 西安电子工程研究所 | RocketIO communication link detects automatically and restoration methods |
CN112688826A (en) * | 2019-10-18 | 2021-04-20 | 中车株洲电力机车研究所有限公司 | Link diagnosis method, terminal device, link diagnosis system, and storage medium |
CN112688826B (en) * | 2019-10-18 | 2022-05-20 | 中车株洲电力机车研究所有限公司 | Link diagnosis method, terminal device, link diagnosis system, and storage medium |
CN116155774A (en) * | 2022-12-20 | 2023-05-23 | 中国联合网络通信集团有限公司 | Link detection method, device and storage medium |
CN116155774B (en) * | 2022-12-20 | 2024-04-16 | 中国联合网络通信集团有限公司 | Link detection method, device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107404393A (en) | A kind of method and system for judging link failure | |
JP5249950B2 (en) | Method and system for utility network outage detection | |
CN100571173C (en) | A kind of detection method of network node abnormality | |
CN102143522A (en) | Method and equipment for processing radio link failure | |
CN103957538A (en) | Method and device for detecting network quality | |
CN102064981A (en) | Bidirectional forwarding detection (BFD) method and system | |
CN104272659B (en) | The user equipment of the relative set in the method and communication network of data is transmitted in the communication network towards bag | |
CN106301986A (en) | Chain circuit detecting method and device | |
CN102118278B (en) | Method and system for measuring network conditions as well as method for monitoring network coverage | |
CN103036696A (en) | Achievement method and system and corresponding device of online business | |
CN109728967A (en) | Communication quality detection method, communication equipment and system | |
CN103684818A (en) | Method and device for detecting failures of network channel | |
CN103957552B (en) | The method for improving data communication reliability in automatic weather station | |
CN107404735A (en) | A kind of uplink data transmission method and system, user equipment and base station | |
CN102137420A (en) | Voice channel detection method and base station controller | |
CN102271067B (en) | Network detecting method, apparatus and system | |
CN100563201C (en) | A kind of method for detecting route unit fault and device | |
CN104243199A (en) | Data transmission method and protection device of packet transport network | |
CN110971459B (en) | Session fault detection method and device, terminal equipment and readable storage medium | |
CN101155078A (en) | Method for fast locating IP network fault | |
CN102014054A (en) | Sending method and equipment for keep-alive messages | |
CN106549784A (en) | A kind of data processing method and equipment | |
EP3869740A1 (en) | Network reliability testing method and apparatus | |
CN109474940A (en) | Quality of service detection method and device | |
CN106982127B (en) | Message detection and distribution method in convergence charging and tandem proxy device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20171128 |