WO2013044281A1 - Method for a clock-rate correction in a network consisting of nodes - Google Patents

Method for a clock-rate correction in a network consisting of nodes

Info

Publication number
WO2013044281A1
Authority
WO
WIPO (PCT)
Prior art keywords
clock
nodes
network
subset
algorithm
Prior art date
Application number
PCT/AT2012/050130
Other languages
French (fr)
Inventor
Günther BAUER
Wilfried Steiner
Original Assignee
Fts Computertechnik Gmbh
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fts Computertechnik Gmbh
Priority to EP12765979.5A (EP2761794A1)
Publication of WO2013044281A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/02 Details
    • H04J3/06 Synchronising arrangements
    • H04J3/0635 Clock or time synchronisation in a network
    • H04J3/0638 Clock or time synchronisation among nodes; Internode synchronisation
    • H04J3/0652 Synchronisation among time division multiple access [TDMA] nodes, e.g. time triggered protocol [TTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/02 Details
    • H04J3/06 Synchronising arrangements
    • H04J3/0635 Clock or time synchronisation in a network
    • H04J3/0685 Clock or time synchronisation in a node; Intranode synchronisation
    • H04J3/0694 Synchronisation in a TDMA node, e.g. TTP

Definitions

  • the invention relates to a method for a clock-rate correction in a network consisting of nodes.
  • Fault-tolerant clock synchronization is the foundation of synchronous architectures such as the Time-Triggered Architecture (TTA) for dependable cyber-physical systems.
  • Clocks are typically local counters that are increased with a given rate according to real time, and clock synchronization algorithms ensure that any two clocks in the system read about the same value at about the same point in real time. This is achieved by a clock synchronization algorithm that changes the current values of the clocks, the clocks' rate, or both.
  • This invention discloses a clock-rate correction algorithm as layered services on top of the TTEthernet clock synchronization algorithm, which itself is a clock-state correction algorithm. Thereby, the precision in a TTEthernet system can be improved.
  • the rate-correction algorithm records the clock state-correction values for a configurable number of integration cycles. It then calculates an average of the corrected values and changes the rate of the clocks for a configurable percentage of this average. In any case the change of rate is bound by the maximum drift offset max(drift) from a perfect reference time.
  • the FlexRay communication protocol [7] specifies a rate-correction algorithm. Although our algorithm is similar to the FlexRay rate-correction approach, there are differences with respect to the underlying assumptions on topology and the clock-state correction algorithm. A combination of clock-state correction and clock-rate correction has been introduced by Kopetz et al. in [8] and analyzed by simulation and measurement. This approach elects a particular rate master, to which the other nodes then align their rate. The drawback of such an approach is that a re-election is necessary if the rate master fails. Our approach does not rely on a rate master and integrates state and rate correction more tightly.
  • the invention relates to a method for a clock-rate correction in a network consisting of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, in which a two-step clock synchronization algorithm is realized, wherein this two-step clock synchronization algorithm comprises:
  • SM first subset of nodes
  • CM second subset of nodes
  • CM nodes of the second set of nodes
  • CM_clock the result of the calculation (CM_clock) of the convergence function of the first step to the network in form of messages
  • nodes of a first subset of nodes (SM) and/or other nodes (SC) in the system that receive these messages from the second subset of nodes (CM) apply a second convergence function based on the timing information associated with these messages
  • nodes which are receiving messages from the second subset of nodes (CM) use the timing information associated with at least a subset of these messages to correct their local clocks
  • a node is keeping track of succeeding corrections applied to its local clock, and wherein a node changes the clock rate for a quantity that is a function of previous corrections applied to the local clock.
  • the two-step clock synchronization algorithm is the TTEthernet clock synchronization algorithm.
  • the invention relates to a network for carrying out a method as described above, wherein the network consists of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, characterized in that each node is either an end system or a switch.
  • the network comprises multiple tree connections, wherein each tree is formed of a disjoint subset of switches, and wherein a subset of end systems is connected to exactly one switch of each tree of the set of redundant trees.
  • the subset of nodes that sends messages in the first step of the two-step clock synchronization approach is a subset of the end systems.
  • the subset of nodes that perform the first convergence function can be a subset of the switches in the network.
  • the end systems use the diagnosis information of TTEthernet as specified in the AS6802 standard to keep track of the successive clock corrections.
  • TTA time-triggered architecture
  • TTP [2] and TTEthernet [3] are implementations of the TTA.
  • TTP is applied, for example, in the new Boeing 787 Dreamliner, whereas TTEthernet has been selected for the NASA Orion Space Program.
  • While the aerospace and space industries are traditional areas for dependable systems, we also observe emerging areas with increasing dependability requirements. Examples include surgery robots in the medical area, datacenters in the financial and other critical industries, as well as the smart grid that aims at decentralized energy production and efficient energy use.
  • TTEthernet is currently being evaluated for several of these emerging areas.
  • TTEthernet integrates synchronized and unsynchronized communication on the same physical network, i.e., time-triggered frames and event-triggered frames can coexist.
  • the synchronized, time-triggered traffic relies on synchronized local clocks in the system and, therefore, TTEthernet specifies a fault-tolerant clock synchronization algorithm.
  • This algorithm is a clock state-correction algorithm: the TTEthernet devices periodically exchange the current values of their local clocks and correct their clock values appropriately.
  • the diagnosis algorithm implements a simple version of an accusation protocol as used in NASA's SPIDER protocol [5].
  • TTEthernet devices that detect inconsistencies in the TTEthernet clock synchronization protocol report these inconsistencies to all devices in the system. Once a sufficiently high number of devices has accused a particular device of being faulty, this device is excluded from the clock synchronization protocol.
  • Serafini et al. [6] introduce application-level diagnosis algorithms, for which they discuss an implementation and provide formal correctness proofs. Besides some algorithmic differences, from the point of view of formal verification Serafini et al. prove their algorithms in a discrete time model and abstract from the underlying synchronization protocol, while our framework allows the integrated proof of the clock synchronization protocol together with diagnosis in a continuous time model.
  • Clock rate-correction algorithms, or rate-correction algorithms for short, not only periodically realign the values of the clock counters, but also change the rate of their increment.
  • a node may diagnose that it always has to correct its local clock by about +5 μs. This means that the update rate of this clock's counter is too low, or in other words, the clock ticks too slowly.
  • a rate-correction algorithm would speed up the clock with the aim that the next correction should be less than 5 μs. Of course, this assumes some stability of clock drift, which we discuss later in this paper.
  • We continue in Section 2 with a review of the TTEthernet clock synchronization algorithm.
  • In Section 3 we recapitulate the formal proof of this algorithm and present how its formal model is re-used in our framework for simulation and formal verification of layered algorithms. Based on this framework we have developed several new algorithms. We introduce a diagnosis algorithm in Section 4 and a clock rate-correction algorithm in Section 5. We discuss these algorithms formally and give simulation results and formal proofs using our framework. Finally, we conclude in Section 6.
  • TTEthernet is an extension of the traditional Ethernet standard by services that guarantee deterministic delivery of time-critical messages.
  • An example network with two redundant channels is depicted in Fig. 1.
  • a TTEthernet network consists of end systems and switches, where end systems are connected to switches with bi-directional communication links. Switches may connect to each other thereby forming multi-hop connections between end systems. Each switch belongs to one and only one channel and in its simplest form a channel is formed by a single switch and the communication links to the end systems.
  • a TTEthernet network implements redundant channels, e.g., two redundant channels as in Fig. 1.
  • End systems and switches are physical components to which the TTEthernet clock synchronization algorithm assigns one of three "roles": synchronization master (SM), compression master (CM), or synchronization client (SC).
  • SM synchronization master
  • CM compression master
  • SC synchronization client
  • PCF protocol control frames
  • Fig. 2 depicts the two steps in the TTEthernet clock synchronization algorithm.
  • the SMs send PCFs to the CMs.
  • the CMs extract from the arrival points in time of the PCFs the current state of their local clocks and execute a first convergence function, the so-called compression function (Alg. 1).
  • the result of the convergence function is then delivered to the SMs in the form of new PCFs (the "compressed" PCFs).
  • the SMs collect the compressed PCFs from the CMs and execute a second convergence function (Alg. 2, 3).
  • the diagnosis algorithm (Alg. 4, 5) and the rate-correction algorithm (Alg. 6) analyzed in this paper are then executed only after the clocks are corrected by the TTEthernet clock synchronization algorithm.
  • TTEthernet assumes an inconsistent-omission failure model for the CMs. This means that a faulty CM is able to arbitrarily accept and reject PCFs from the SMs and can also decide to which SMs it sends the compressed PCF and to which not. Babbling idiot failures of the CM are excluded by the design of the CM as a self-checking pair.
  • the SMs may fail arbitrarily, and in particular, they may start to babble PCFs.
  • the design of the CMs ensures that only one PCF per SM is used per re-synchronization cycle. However, we assume that the clock values provided by a faulty SM can be arbitrary and the faulty SM may send different clock values to the different CMs.
  • Although TTEthernet is configurable to tolerate multiple failures, we analyze and verify the new algorithms under a single-failure fault hypothesis. Hence, we assume either a faulty SM or a faulty CM, but not both at the same point in time.
  • the CMs collect the current states of the local clocks of the SMs. We denote these values by SM_clock and number them SM_clock_i, where 1 ≤ i ≤ |SM|, and assume that the SM_clock_i values are sorted in increasing order. From the received SM_clock_i, a CM j uses a variant of the fault-tolerant median to calculate the new "compressed" clock CM_clock, which we number with the identifier of the respective CM: CM_clock_j. Algorithm 1 defines this calculation as a function of the number (cardinality) of SM_clock_i values received.
  • the compressed clock is delivered back to the SMs in a new "compressed" PCF and the SMs are able to read the compressed clock value from the arrival point in time of the compressed PCF.
  • This compressed PCF also contains the pcf_membership_new field in its payload. pcf_membership_new is a bitvector in which each bit is assigned to a unique SM.
  • the CMs will set the bit of an SM if the respective SM i has provided a local clock value SM_clock_i in the calculation of the most recent CM_clock_j, and will clear the bit otherwise.
  • the self-checking pair design of the CM guarantees Algorithm 1 Convergence Algorithm executed by CM j
  • the CM may calculate the arithmetic mean of the second and fourth SM clock value for the compressed clock value.
  • the SMs receive the compressed PCFs, extract the compressed clock values from them, and correct their local clocks.
  • each SM receives exactly one compressed PCF per CM from which it extracts the compressed clock values CM_clock_j, where 1 ≤ j ≤ |CM|, and we assume the CM_clock_j values are sorted in increasing order.
  • an SM may receive at most one compressed PCF per CM (as the faulty CM may decide not to send its compressed PCF to some SMs). Furthermore, an SM will only use a compressed PCF in its convergence function if the pcf_membership_new field has at least accept_threshold bits set.
  • accept_threshold is calculated using Algorithm 2: the SM searches for the maximum number of bits set in any of the PCFs received from the CMs. The value of accept_threshold is then given by this maximum minus the configured number of tolerable faulty SMs.
  • the SM will discard a compressed PCF that has fewer than accept_threshold bits set in the pcf_membership_new field. This mechanism ensures that an SM excludes compressed PCFs that represent relatively low numbers of SM clocks.
  • the pcf_membership_new vector is also used in other TTEthernet algorithms such as clique detection or startup as well as in network configurations that use more than one CM per channel. We do not discuss this functionality and these configurations in this paper. For the analysis of the clock synchronization algorithm the description above is sufficient. Algorithm 2: select(CM_clock).
  • corr value is already an extension for the rate-correction algorithm discussed in Section 5. It stores the current correction value and is updated with each integration cycle.
  • the precision is bounded and known.
  • SAL: state-transition system of the form (S, I, →).
  • S defines the set of system states, I the set of initial system states with I ⊆ S, and → the set of transitions between system states.
  • Each system state maps the variables to particular values according to their defined variable type.
  • SAL supports structured modeling such that we can define the SM and CM functionality in encapsulated modules.
  • SAL provides several tools (symbolic, bounded, and bounded infinite-state model checking). While we experimented with all of them, we finally use the bounded infinite-state model checker sal-inf-bmc to prove the TTEthernet synchronization quality as well as to generate test cases. With sal-inf-bmc we can treat time as a continuous entity and can use k-induction [11] as proof method.
  • Fig. 3 shows an example scenario with a fast and a slow clock.
  • the x-axis depicts real time and the y-axis the internal clock time of a respective TTEthernet device.
  • the perfect clock is plotted as a forty-five degree solid line while the fast clock is depicted as a dashed line slightly above the perfect clock and the slow clock is depicted as a dotted-dashed line slightly below the perfect clock.
  • the figure shows for each integration cycle the divergence of the fast and slow clocks from the perfect clock and their synchronization at the beginning of each integration cycle.
  • the drift from the perfect clock is a function of the length of the integration cycle and the drift rate of the clocks. Following the literature, we use R_sync for the integration cycle and ρ for the drift rate.
  • drift = R_sync × ρ + error
  • Fig. 4 shows this approach of modeling time to verify the precision.
  • the x-axis represents real time
  • the y-axis represents the clock time deviations from the perfect clock.
  • the drift offset for an integration cycle i is added.
  • the drift offset step simulates the drift over the integration cycle, while the execution time of the clock synchronization algorithm is only a fraction of the integration cycle length.
  • model-checker approach allows us to use "wildcards" for which the tool is free to assign non-deterministic values. Hence, instead of a single simulation run that takes as input a specific test vector and analyzes the system behavior under this test, the model checker approach systematically searches the state space for all possible evaluations for each wildcard.
  • simulation with SAL is used to add new functionality to the TTEthernet clock synchronization algorithm and to explore its behavior.
  • k-induction [11], which is a generalized form of regular induction and consists of the following stages [12]:
  • Each system state is described by at least one abstract state.
  • the initial abstract state X describes at least one initial system state.
  • each SM is described by a state machine and all state machines are executed synchronously.
  • each of these state machines has only two variables, SM_state and SM_clock, where SM_state is either sync or send and SM_clock keeps track of the divergence from the perfect clock.
  • the current system state is simply the sum of all of the current local states of the SMs.
  • Fig. 5 depicts a system-level abstraction for the TTEthernet that fulfills the abstraction properties listed above.
  • the abstraction is very simple and consists only of the two abstract states SMALL and BIG.
  • BIG is an abstract state that requires all SMs to be in the sync state at the same time while in the abstract state SMALL, all SMs must be in the send state.
  • precision will be bounded by some real constant FACTOR_small times max(drift) in the SMALL abstract state and by some other real constant FACTOR times max(drift) in the BIG state.
  • FACTOR_small < FACTOR holds and both numbers are derived manually or by re-running the model checking until no counterexamples are produced.
  • the TTEthernet clock synchronization algorithm is inherently fault-tolerant. However, the synchronization quality decreases with the number of faulty components and the severity of their failure modes.
  • the diagnosis algorithm presented in this section aims to detect faulty TTEthernet devices, in particular faulty CMs, and remove them from the TTEthernet clock synchronization algorithm. By doing so, the failure mode of a faulty CM is transformed from an inconsistent-omission failure mode to a fail silent failure mode and we can formally verify that the diagnosis algorithm improves the precision in the system.
  • the diagnosis algorithm is based on a simple accusation protocol presented by Algorithm 4 and Algorithm 5.
  • Algorithm 4 is executed in the SMs immediately after the clocks are corrected (see Fig. 2 for the temporal dependencies among the algorithms). It starts with each SM recording those CMs from which it receives PCFs (lines 1-3) using an array active of boolean variables.
  • the symbol ⊥ denotes the absence of a PCF; if the clock of CM j is not absent (hence, the SM received a PCF from CM j), the respective active[j] is set to TRUE.
  • an SM i then checks for each CM j whether it has been active before but SM i did not receive a PCF from it in the current integration cycle. If this is the case, SM i accuses CM j of being omission faulty. For simplicity, we assume that this accusation information is stored in a local accusation matrix accused indexed by the SMs and CMs. Furthermore, SM i informs all other SMs of its accusation by sending an accusation message AC_i, where AC_i.accused is a vector of boolean variables with each boolean representing a unique CM. SM i will set AC_i.accused[j] if it accuses CM j.
  • the AC messages are sent as rate-constrained traffic on all redundant channels in a TTEthernet system.
  • In a TTEthernet network it is thus ensured that the AC messages are delivered with a known upper bound in time and are transported over at least one non-faulty channel.
  • the exchange of the accusation information happens sufficiently prior to the next execution of the TTEthernet clock synchronization algorithm.
  • Algorithm 5 is executed by an SM i that receives an accusation message AC_k from an SM k: when a boolean variable AC_k.accused[j] is TRUE, SM i sets the corresponding local accused entry to TRUE as well. Each SM thus uses the matrix accused to locally store all accusations from all SMs.
  • An alternative realization to modifying the selection function is the deactivation of the communication link that connects the SM to the faulty CM.
  • an SM that receives two accusations for a CM can be certain that one of the accusations stems from a correct SM.
  • a faulty SM excludes the presence of a faulty CM and, hence, even accusations of a faulty SM are distributed by all CMs consistently.
  • CM_clock_j produced by the CMs will be different.
  • Fig. 6 plots the divergence of the clock times from real time as described for Fig. 4.
  • the clocks of SM 1-3 (denoted by Clock 1-3) have positive drift of 10 time units while the clocks of SMs 4 and 5 (denoted by Clock 4 and 5) have negative drift of 10 time units.
  • all SMs receive PCFs from both CMs.
  • CM 1 does not fail to send a PCF to one of the other SMs
  • SM 2 always deviates from the remaining SMs after clock correction.
  • CM 1 does not send a PCF to SM 3, which in turn also accuses CM 1.
  • As SM 2 and SM 3 accuse CM 1 of being faulty, all SMs exclude CM_clock_1 from clock synchronization.
  • all SMs will only use CM_clock_2 for clock synchronization and the inconsistent-omission failure mode of CM 1 is transformed into a fail-silent failure mode.
  • the diagnosis algorithm ensures that once the faulty CM is detected by a sufficiently high number of SMs, the exclusion of the faulty CM improves the precision from
  • the SM may accuse a CM only after a configurable number of lost PCFs per configured time interval. This would reduce the probability that a CM is accused because of a transient error or because of a bit error as the Ethernet frame is transported over the communication link. Secondly, for the same reasons the accusation may be reset in all SMs to allow an accused CM to rejoin the TTEthernet clock synchronization algorithm. Lastly, the SMs may also take statistics on the number of lost application Ethernet frames into account in their determination whether to accuse a CM or to remove an accusation.
  • the rate-correction algorithm records the clock state-correction values for a configurable number of integration cycles. It then calculates an average of the corrected values and changes the rate of the clocks by a configurable percentage of this average. In any case the change of rate is bounded by the maximum drift offset max(drift) from a perfect reference time.
  • Algorithm 6 is executed in each SM after the clocks have been corrected by the TTEthernet state-correction algorithm (Alg. 2, 3 in Fig. 2). It consists of an observation phase (lines 1-3) and the correction phase (lines 4-12). In line 13 the integer variable cycle is increased, which we use to count the integration cycles.
  • the rate-correction algorithm starts with the observation phase, in which the actual correction values calculated by the TTEthernet clock synchronization algorithm are stored for each integration cycle. To store them, we use the array drift_obs[cycle] of real values. (Algorithm 6: Rate-Correction Algorithm executed by SM i.)
  • the observation phase completes after a configurable number of integration cycles rate_obs_nr and the correction phase starts (line 4).
  • the intermediate correction value that the algorithm first calculates is the arithmetic mean of the individual correction values (line 5). If the mean exceeds the configured maximum drift offset a non-faulty clock would exhibit (i.e., max(drift)), the correction value corr is reduced to these bounds (lines 6-10). Finally, after the correction value is calculated and bounded, it is used to correct the current rate of the local clock (line 11). Although it is not depicted in Algorithm 6, only a pre-configured percentage of the correction value may be used to correct a clock's rate.
  • a change of a clock's rate is equivalent to increasing or decreasing the number of oscillator ticks per integration cycle.
  • Fig. 8 plots the divergence of the clock times from real time as introduced in Fig. 4.
  • Clocks 1 to 3 have a positive drift, while clocks 4 and 5 have a negative drift.
  • the first two integration cycles are configured as the observation phase in which the nodes record their clock correction values.
  • the clocks calculate the corr value as specified in Algorithm 6 and adapt the rate of their clocks.
  • corr does not exceed max(drift) and, as depicted, from the third integration cycle onwards all clocks are almost perfectly aligned.
  • Fig. 9 shows a scenario with unstable clocks and resulting changing drift rates.
  • clocks 1 to 3 have positive drift while clocks 4 and 5 have negative drift.
  • clocks 4 and 5 are the only clocks that correct their clock state.
  • the drift of the clocks changes, in a way that clocks 1 to 3 now drift in the negative direction while clocks 4 and 5 drift in the positive direction. Consequently, the correction value that clocks 4 and 5 apply adds up to the now positive drift and leads to an increase in the precision in the system.
  • To formally verify properties about the rate-correction algorithm we define the system-level abstraction as depicted in Fig. 10.
  • the graph is essentially the same as for the diagnosis abstraction, however the underlying abstract states and transitions are, of course, different.
  • the abstraction consists of four abstract states SMALL, BIG, SMALL_rate, and BIG_rate.
  • SMALL and BIG represent the system during the observation phase, while SMALL_rate and BIG_rate represent the system when the clock rates are adapted.
  • the system-level abstraction very naturally reflects the algorithm phases.
  • the rate-correction algorithm is a simple means to improve the precision in a system when the drift rates of the clocks can be assumed to be stable to some degree. However, even if they are not stable the rate-correction algorithm can improve the precision if the rate-correction algorithm is executed periodically and the change of the drift is relatively slow compared to the frequency of execution of the rate-correction algorithm.
  • As TTEthernet is intended as an integrative network for mixed-criticality systems, it may also be the case that some nodes of a network will be more affected by physical processes, like heat, than others.
  • the system architect may configure more affected nodes as synchronization clients which only passively synchronize to the TTEthernet timeline as generated by the SMs and CMs. Even further, the system architect may decide to run the rate-correction algorithm on the synchronization clients more frequently than on the SMs.
  • the location of a node within the network can also influence the design decision on how often to run the clock-rate correction algorithm.
  • Systems that are in spatial proximity to physical processes with varying temperature ranges, e.g., motor control, may be required to run the rate-correction algorithm frequently.
  • Other systems may adjust their rate only after initial synchronization.
  • the rate-correction algorithm is executed in the SMs.
  • While the CMs may adjust their clock rate to the SMs as well, the formal assessment of such configurations is outside the scope of this paper, and we plan to explore this behavior in future work.
  • the diagnosis algorithm follows a simple accusation protocol and aims to identify failure scenarios in which a faulty CM inconsistently distributes synchronization information. When such a faulty CM is diagnosed, then the non-faulty SMs consistently discard all synchronization information from the faulty CM. As a result of the diagnosis algorithm, the precision in the synchronized network improves and we have presented formal evidence for that.
  • the clock rate-correction algorithm is executed in each of the SMs and continually records the clock state-correction values that the TTEthernet clock synchronization protocol calculates.
  • Fig. 1 describes an example TTEthernet network with n end systems and two redundant channels (each formed by a single switch).
  • Fig. 2 describes an overview of the TTEthernet two-step clock synchronization algorithm.
  • Fig. 3 describes the progress in Real Time plotted against Clock Time.
  • Fig. 4 describes an example execution of the TTEthernet clock synchronization algorithm in presence of a faulty CM.
  • Fig. 5 describes a system-level abstraction for the formal proof.
  • Fig. 6 describes an example execution of the diagnosis algorithm as layered on top of the TTEthernet clock synchronization algorithm in presence of a faulty CM.
  • Fig. 7 describes a system-level abstraction for the formal proof of the diagnosis algorithm.
  • Fig. 8 describes a fault-free scenario of the layered rate-correction algorithm.
  • Fig. 9 describes a fault-free scenario of the layered rate-correction algorithm with highly varying clock drifts.
  • Fig. 10 describes a system-level abstraction for the formal proof of the rate-correction algorithm.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)

Abstract

A method for a clock-rate correction in a network consisting of nodes.

Description

METHOD FOR A CLOCK-RATE CORRECTION IN A NETWORK CONSISTING OF NODES
The invention relates to a method for a clock-rate correction in a network consisting of nodes.
The work leading to this invention has received funding from the European Community's Seventh Framework Programme FP7/ 2007-2013 under grant agreement n° 236701.
Fault-tolerant clock synchronization is the foundation of synchronous architectures such as the Time-Triggered Architecture (TTA) for dependable cyber-physical systems. Clocks are typically local counters that are increased with a given rate according to real time, and clock synchronization algorithms ensure that any two clocks in the system read about the same value at about the same point in real time. This is achieved by a clock synchronization algorithm that changes the current values of the clocks, the clocks' rate, or both.
This invention discloses a clock-rate correction algorithm as layered services on top of the TTEthernet clock synchronization algorithm, which itself is a clock-state correction algorithm. Thereby, the precision in a TTEthernet system can be improved.
The rate-correction algorithm records the clock state-correction values for a configurable number of integration cycles. It then calculates an average of the corrected values and changes the rate of the clocks for a configurable percentage of this average. In any case the change of rate is bound by the maximum drift offset max(drift) from a perfect reference time.
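The rate-correction step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `rate_adjustment`, its parameters, and the reading of "bound by max(drift)" as a symmetric clamp on the rate change are assumptions made for this example.

```python
def rate_adjustment(corrections, percentage, max_drift):
    """Return a clock-rate change derived from recorded state corrections.

    corrections: state-correction values recorded over the configured
                 number of integration cycles (hypothetical units)
    percentage:  fraction (0.0 to 1.0) of the average that is applied
    max_drift:   assumed symmetric bound on the magnitude of the change
    """
    if not corrections:
        return 0.0
    average = sum(corrections) / len(corrections)
    change = percentage * average
    # Clamp the change to [-max_drift, +max_drift].
    return max(-max_drift, min(max_drift, change))
```

With three recorded corrections of 4, 6, and 5 units and a percentage of 0.5, the sketch applies half of the average correction; larger averages are cut off at the configured bound.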
Probably most prominently, the FlexRay communication protocol [7] specifies a rate-correction algorithm. Although our algorithm is similar to the FlexRay rate-correction approach, there are differences with respect to the underlying assumptions on topology and the clock-state correction algorithm. A combination of clock-state correction and clock-rate correction has been introduced by Kopetz et al. in [8] and analyzed by simulation and measurement. This approach elects a particular rate master, to which the other nodes then align their rate. The drawback of such an approach is that, if the rate master fails, a re-election is necessary. Our approach does not rely on a rate master and integrates state and rate correction more tightly.
In detail, the invention relates to a method for a clock-rate correction in a network consisting of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, in which a two-step clock synchronization algorithm is realized, wherein this two-step clock synchronization algorithm comprises:
-) a first step, in which all nodes of a first subset of nodes (SM) are providing their local clock state (SM_clock) via messages to the network and wherein all nodes of a second subset of nodes (CM) are receiving these messages, and wherein the nodes of the second subset of nodes (CM) are performing a first convergence function on the local clock readings, and
-) a second step, in which the nodes of the second set of nodes (CM) transmit the result of the calculation (CM_clock) of the convergence function of the first step to the network in form of messages, and wherein nodes of a first subset of nodes (SM) and / or other nodes (SC) in the system that receive these messages from the second subset of nodes (CM) apply a second convergence function based on the timing information associated with these messages, and wherein nodes which are receiving messages from the second subset of nodes (CM) use the timing information associated with at least a subset of these messages to correct their local clocks, and wherein a node is keeping track of succeeding corrections applied to its local clock, and wherein a node changes the clock rate for a quantity that is a function of previous corrections applied to the local clock.
It is of advantage that the two-step clock synchronization algorithm is the TTEthernet clock synchronization algorithm.
Furthermore, the invention relates to a network for carrying out a method as described above, wherein the network consists of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, characterized in that each node is either an end system or a switch.
It is of advantage that the network comprises multiple tree connections, wherein each tree is formed of a disjoint subset of switches, and wherein a subset of end systems is connected to exactly one switch of each tree of the set of redundant trees.
It is of further advantage, if the subset of nodes that send messages in the first step of the two-step clock synchronization approach is a subset of the end systems.
The subset of nodes that perform the first convergence function can be a subset of the switches in the network.
Furthermore, it may be of advantage in a network or a method described above that the end systems use the diagnosis information of TTEthernet as specified in the AS6802 standard to keep track of the succeeding clock corrections.
Also, it may be of advantage when the clock rate of the local clock in each node can be changed only up to a preconfigured maximum.
Further details will be explained in the following:
Layered Diagnosis and Clock-Rate Correction
for the TTEthernet Clock Synchronization Protocol
Abstract
Fault-tolerant clock synchronization is the foundation of synchronous architectures such as the Time-Triggered Architecture (TTA) for dependable cyber-physical systems. Clocks are typically local counters that are increased with a given rate according to real time, and clock synchronization algorithms ensure that any two clocks in the system read about the same value at about the same point in real time. This is achieved by a clock synchronization algorithm that changes the current values of the clocks, the clocks' rate, or both.
This paper presents a diagnosis algorithm and a clock-rate correction algorithm as layered services on top of the TTEthernet clock synchronization algorithm, which itself is a clock-state correction algorithm. We analyze the algorithms' properties and explore and understand their behavior using a bounded model checker for infinite data types.
We use our formal framework for both simulation and formal proof. To the best knowledge of the authors this has been the first time that formal methods, should they be theorem provers or model checkers, have been applied to the problem of rate-correction for fault-tolerant clock synchronization. Furthermore, the formal development process itself demonstrates how efficiently pre-existing models can be utilized in the development of new algorithms and their formal verification.
Keywords: TTEthernet, fault tolerance, clock synchronization, formal verification, model-based design
1 Introduction
Dependable systems are omnipresent in our daily lives, and are becoming increasingly large and complex. As a consequence of this trend it is apparent that the correct development of such complex systems requires a sound architectural basis. In the absence of architectures we will either build systems of insufficient quality or will simply not be able to build systems beyond a certain level of complexity at all. The time-triggered architecture (TTA) [1] as developed at the Institut für Technische Informatik at the Vienna University of Technology is an extraordinary example of an architecture for dependable embedded systems. The TTA tremendously simplifies the development of dependable cyber-physical systems. It has been successfully applied in industries that demand a high level of determinism such as the avionics industry, in which predictability of system operation is a key property. TTP [2] and TTEthernet [3] are implementations of the TTA. TTP is applied, for example, in the new Boeing 787 Dreamliner, whereas TTEthernet has been selected for the NASA Orion Space Program. While the aerospace and space industries (as well as automotive and similar industries) are traditional areas for dependable systems, we also observe emerging areas with increasing dependability requirements. Examples include surgery robots in the medical area, datacenters in the financial and other critical industries, as well as the smart grid that aims at decentralized energy production and efficient energy use. TTEthernet is currently being evaluated for several of these emerging areas.
TTEthernet integrates synchronized and unsynchronized communication on the same physical network, i.e., time-triggered frames and event-triggered frames can coexist. The synchronized, time-triggered traffic relies on synchronized local clocks in the system and, therefore, TTEthernet specifies a fault-tolerant clock synchronization algorithm. This algorithm is a clock state-correction algorithm: the TTEthernet devices periodically exchange the current values of their local clocks and correct their clock values appropriately.
We have formally verified the TTEthernet clock synchronization algorithm and reported the verification procedure and results in [4]. This has been the first time that a fault-tolerant clock synchronization algorithm has been formally verified by model checking under the assumptions of clock drift and clock failures. The presented verification method is highly automated, which not only minimizes the probability of human error in the deduction of the correctness of the algorithm, but also works as a formal framework that enables the holistic design of new algorithms running on top of TTEthernet. In this paper we present two such algorithms, a diagnosis algorithm and a clock rate-correction algorithm, as well as their design process and their formal verification. The main contribution of this paper is, thus, twofold: we introduce a new method for algorithm development and, by demonstrating the method, we present two new algorithms for TTEthernet.
The diagnosis algorithm implements a simple version of an accusation protocol as used in NASA's SPIDER protocol [5]. TTEthernet devices that detect inconsistencies in the TTEthernet clock synchronization protocol report these inconsistencies to all devices in the system. Once a sufficiently high number of devices has accused a particular device of being faulty, this device is excluded from the clock synchronization protocol. Using our formal framework, we can automatically prove the resulting quality improvement of the precision in the system. Serafini et al. [6] introduce application-level diagnosis algorithms, for which they discuss an implementation and provide formal correctness proofs. Besides some algorithmic differences, from the point of view of formal verification Serafini et al. prove their algorithms in a discrete time model and abstract from the underlying synchronization protocol, while our framework allows the integrated proof of the clock synchronization protocol together with diagnosis in a continuous time model.
Clock rate-correction algorithms, or rate-correction algorithms for short, not only periodically realign the values of the clock counters, but also change the rate of their increment. For example, a node may diagnose that it always has to correct its local clock by about +5 μs. This means that the update rate of this clock's counter is too low, or in other words, the clock ticks too slowly. In this case, a rate-correction algorithm would speed up the clock with the aim that the next correction should be less than 5 μs. Of course, this assumes some stability of clock drift, which we discuss later in this paper.
Probably most prominently, the FlexRay communication protocol [7] specifies a rate-correction algorithm. Although our algorithm is similar to the FlexRay rate-correction approach, there are differences with respect to the underlying assumptions on topology and the clock-state correction algorithm. A combination of clock-state correction and clock-rate correction has been introduced by Kopetz et al. in [8] and analyzed by simulation and measurement. This approach elects a particular rate master, which is then used by the other nodes to align their rate to. The drawback of such an approach is, that in case of the failure of the rate master, a re-election is necessary. Our approach does not rely on a rate master and integrates state and rate correction more tightly.
We continue in Section 2 with a review of the TTEthernet clock synchronization algorithm. In Section 3 we recapture the formal proof of this algorithm and present how its formal model is re-used in our framework for simulation and formal verification of layered algorithms. Based on this framework we have developed several new algorithms. We introduce a diagnosis algorithm in Section 4 and a clock rate-correction algorithm in Section 5. We discuss these algorithms formally and give simulation results and formal proofs using our framework. Finally, we conclude in Section 6.
Due to space limitations we do not discuss the SAL models in detail. The models can be found online (see footnote 1) and will be referenced on the SAL wiki.
2 TTEthernet Clock Synchronization Algorithm
TTEthernet is an extension of the traditional Ethernet standard by services that guarantee deterministic delivery of time-critical messages. An example network with two redundant channels is depicted in Fig. 1. As depicted, a TTEthernet network consists of end systems and switches, where end systems are connected to switches with bi-directional communication links. Switches may connect to each other thereby forming multi-hop connections between end systems. Each switch belongs to one and only one channel and in its simplest form a channel is formed by a single switch and the communication links to the end systems. For fault- tolerance reasons, a TTEthernet network implements redundant channels, e.g., two redundant channels as in Fig. 1.
2.1 Clock Synchronization Overview
End systems and switches are physical components to which the TTEthernet clock synchronization algorithm assigns one of three "roles": synchronization master (SM), compression master (CM), or synchronization client (SC). In this paper we assume for simplicity of discussion that end systems are configured as SMs and switches as CMs. We also generally consider a network as the one depicted in Fig. 1. SMs and CMs inform each other about the current state of their local clocks by exchanging protocol control frames (PCF). We have discussed the process of how a component concludes on the current local time of a remote component via the reception of PCFs in [9] and assume in the rest of the paper that the exchange of PCFs is equivalent to exchanging the current values of the local clocks of the components.
Footnote 1: http://www.csl.sri.com/users/bruno/sal/layered-algorithms.tar.gz
Fig. 2 depicts the two steps in the TTEthernet clock synchronization algorithm. In the first step, the SMs send PCFs to the CMs. The CMs extract from the arrival points in time of the PCFs the current state of their local clocks and execute a first convergence function, the so-called compression function (Alg. 1). The result of the convergence function is then delivered to the SMs in form of new PCFs (the "compressed" PCFs). In the second step the SMs collect the compressed PCFs from the CMs and execute a second convergence function (Alg. 2, 3). The diagnosis algorithm (Alg. 4, 5) and the rate-correction algorithm (Alg. 6) analyzed in this paper are then executed only after the clocks are corrected by the TTEthernet clock synchronization algorithm.
2.2 Failure Hypothesis
TTEthernet assumes an inconsistent-omission failure model for the CMs. This means that a faulty CM is able to arbitrarily accept and reject PCFs from the SMs and can also decide to which SMs it sends the compressed PCF and to which not. Babbling idiot failures of the CM are excluded by the design of the CM as self-checking pair. The SMs, on the other hand, may fail arbitrarily, and in particular, they may start to babble PCFs. The design of the CMs ensures that only one PCF per SM is used per re-synchronization cycle. However, we assume that the clock values provided by a faulty SM can be arbitrary and the faulty SM may send different clock values to the different CMs. Although TTEthernet is configurable to tolerate multiple failures, we analyze and verify the new algorithms under a single failure fault-hypothesis. Hence, we assume either a faulty SM or a faulty CM, but not both at the same point in time.
2.3 First Step Convergence: Compression Master (CM)
The CMs collect the current states of the local clocks of the SMs. We denote these values by SM_clock and number them SM_clock_i, where 1 ≤ i ≤ |SM|, and assume that the SM_clock_i values are sorted in increasing order. From the received SM_clock_i, a CM j uses a variant of the fault-tolerant median to calculate the new "compressed" clock CM_clock, which we number with the identifier of the respective CM: CM_clock_j. Algorithm 1 defines this calculation as a function of the number of SM_clock_i values received (denoted by the cardinality |SM_clock|).
The compressed clock is delivered back to the SMs in a new "compressed" PCF, and the SMs are able to read the compressed clock value from the arrival point in time of the compressed PCF. This compressed PCF also contains the pcf_membership_new field in its payload. pcf_membership_new is a bitvector in which each bit is assigned to a unique SM. The CMs will set the bit of an SM if the respective SM i has provided a local clock value SM_clock_i in the calculation of the most recent CM_clock_j and will clear the bit otherwise. The self-checking pair design of the CM guarantees that the compressed clock CM_clock_j and the pcf_membership_new vector are consistent. Hence, the design prevents a faulty CM from setting an arbitrary number of bits in pcf_membership_new.
Algorithm 1 Convergence Algorithm executed by CM j
if |SM_clock| = 1 then
    CM_clock_j ← SM_clock_1
else if |SM_clock| = 2 then
    CM_clock_j ← (SM_clock_1 + SM_clock_2) / 2
else if |SM_clock| = 3 then
    CM_clock_j ← SM_clock_2
else if |SM_clock| = 4 then
    CM_clock_j ← (SM_clock_2 + SM_clock_3) / 2
else if |SM_clock| = 5 then
    CM_clock_j ← SM_clock_3
else
    CM_clock_j ← average of the (k + 1)th largest and (k + 1)th smallest clocks, where k is the number of faulty SMs to be tolerated
end if
Alternatively, for the case of five SM clock values, the CM may calculate the arithmetic mean of the second and fourth SM clock value for the compressed clock value.
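The compression function of Algorithm 1 can be sketched in Python as below. The function name `compress` and the default `k=1` are assumptions for illustration; the sketch also assumes that the two-value and four-value cases take the arithmetic mean of the two middle values, as the fault-tolerant median suggests.

```python
def compress(sm_clocks, k=1):
    """Fault-tolerant median variant over the received SM clock values.

    sm_clocks: clock values received from the SMs (any order)
    k:         number of faulty SMs to tolerate (used for more than 5 values)
    """
    clocks = sorted(sm_clocks)
    n = len(clocks)
    if n == 0:
        raise ValueError("no SM clock values received")
    if n == 1:
        return clocks[0]
    if n == 2:
        return (clocks[0] + clocks[1]) / 2
    if n == 3:
        return clocks[1]
    if n == 4:
        return (clocks[1] + clocks[2]) / 2
    if n == 5:
        return clocks[2]
    # More than five values: average the (k+1)-th smallest and
    # the (k+1)-th largest clock.
    return (clocks[k] + clocks[-(k + 1)]) / 2
```

In each case the extreme values are either ignored or averaged away, so a single outlier such as a clock value of 100 among values near 2 has limited influence on the result.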
2.4 Second Step Convergence: Synchronization Master (SM)
In the second step of the clock synchronization algorithm, the SMs receive the compressed PCFs, extract the compressed clock values from them, and correct their local clocks. In the fault-free case each SM receives exactly one compressed PCF per CM from which it extracts the compressed clock values CM_clock_j, where 1 ≤ j ≤ |CM|, and we assume the CM_clock_j values are sorted in increasing order.
In the case of a faulty CM, an SM may receive at most one compressed PCF per CM (as the faulty CM may decide not to send its compressed PCF to some SMs). Furthermore, an SM will only use a compressed PCF in its convergence function if the pcf_membership_new field has at least accept_threshold bits set. The value of accept_threshold is calculated using Algorithm 2: the SM searches for the maximum number of bits set in any of the PCFs received from the CMs. The value of accept_threshold is then given by this maximum minus the configured number of tolerable faulty SMs.
The SM will discard a compressed PCF that has fewer than accept_threshold bits set in the pcf_membership_new field. This mechanism ensures that an SM excludes compressed PCFs that represent relatively low numbers of SM clocks. The pcf_membership_new vector is also used in other TTEthernet algorithms such as clique detection or startup as well as in network configurations that use more than one CM per channel. We do not discuss this functionality and these configurations in this paper. For the analysis of the clock synchronization algorithm the description above is sufficient.
Algorithm 2 select(CM_clock)
for j = 1 → |CM| do
    if current_max < bitsum(pcf_membership_new_j) then
        current_max ← bitsum(pcf_membership_new_j)
    end if
end for
accept_threshold ← current_max − conf_faulty_SM
return { CM_clock_j | bitsum(pcf_membership_new_j) ≥ accept_threshold }
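A small Python sketch of the selection in Algorithm 2 follows. The representation of each compressed PCF as a `(cm_clock, membership_bits)` pair with the membership vector stored as an integer bitmask is an assumption for this example, as is the name `select`.

```python
def select(compressed, conf_faulty_sm):
    """Return the CM clock values whose membership vector is large enough.

    compressed:     list of (cm_clock, membership_bits) pairs, one per
                    compressed PCF received; membership_bits is an int bitmask
    conf_faulty_sm: configured number of tolerable faulty SMs
    """
    if not compressed:
        return []
    # Maximum number of set bits over all received membership vectors.
    current_max = max(bin(bits).count("1") for _, bits in compressed)
    accept_threshold = current_max - conf_faulty_sm
    # Keep only PCFs that represent at least accept_threshold SM clocks.
    return sorted(clock for clock, bits in compressed
                  if bin(bits).count("1") >= accept_threshold)
```

For example, with membership vectors of four, three, and one set bits and one tolerable faulty SM, the threshold is three, so the PCF that represents only a single SM clock is discarded.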
Under the assumption of one CM per channel and up to three channels maximum, the convergence function is described in Algorithm 3.
Algorithm 3 Convergence Algorithm executed by SM i
if |select(CM_clock)| = 1 then
    act_corr ← SM_clock_i − CM_clock_1
    SM_clock_i ← CM_clock_1
else if |select(CM_clock)| = 2 then
    act_corr ← SM_clock_i − (CM_clock_1 + CM_clock_2) / 2
    SM_clock_i ← (CM_clock_1 + CM_clock_2) / 2
else {|select(CM_clock)| = 3}
    act_corr ← SM_clock_i − CM_clock_2
    SM_clock_i ← CM_clock_2
end if
The act_corr value is already an extension for the rate-correction algorithm discussed in Section 5. It stores the current correction value and is updated with each integration cycle.
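The second convergence step of Algorithm 3 can be sketched as follows. The function name `sm_converge` and the tuple return value are assumptions for illustration; the sketch takes the already selected CM clock values as input and reports the correction with the same sign convention as act_corr (old clock value minus new clock value).

```python
def sm_converge(sm_clock, selected):
    """Return (new_clock, act_corr) for one SM.

    sm_clock: current local clock value of the SM
    selected: CM clock values that passed the selection (1 to 3 values)
    """
    clocks = sorted(selected)
    n = len(clocks)
    if n == 1:
        new_clock = clocks[0]
    elif n == 2:
        new_clock = (clocks[0] + clocks[1]) / 2
    elif n == 3:
        new_clock = clocks[1]  # median of three
    else:
        raise ValueError("expected one to three selected CM clocks")
    act_corr = sm_clock - new_clock
    return new_clock, act_corr
```

A rate-correction layer could append each returned `act_corr` to a history list and average it after a configured number of integration cycles.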
2.5 Synchronization Theorem
The maximum difference between any two local clocks SM_clock_i and SM_clock_j of non-faulty SMs, SM i and SM j, is called the precision. The precision is bounded and known.
Theorem 1.
$$SM\_clock_i \ge SM\_clock_j \;\Rightarrow\; SM\_clock_i - SM\_clock_j \le precision$$
Proof. Theorem 1 has been proven by model checking in [4] considering a faulty SM, a faulty CM, or both a faulty SM and a faulty CM. □
3 Proof and Simulation Framework
Our proof and simulation framework is based on the bounded infinite-state model checker of SAL (sal-inf-bmc). We build on the representation of the TTEthernet clock synchronization algorithm presented in [4]. We treat this formal model as a "holistic view" in the sense that our framework does not rely on the output of these previous studies of the TTEthernet clock synchronization algorithm, but incorporates the previous models with the new algorithms.
The algorithms presented in this paper have been formalized in SAL [10] as state-transition systems of the form (S, I, →). Here, S defines the set of system states σ_i, I the set of initial system states with I ⊆ S, and → the set of transitions between system states. Each system state σ maps the variables to particular values according to their defined variable type. Furthermore, SAL supports structured modeling such that we can define the SM and CM functionality in encapsulated modules.
SAL provides several tools (symbolic, bounded, and bounded infinite-state model checking). While we experimented with all of them, we finally use the bounded infinite-state model checker sal-inf-bmc to prove the TTEthernet synchronization quality as well as to generate test cases. With sal-inf-bmc we can treat time as a continuous entity and can use k-induction [11] as proof method.
3.1 Representation of Time
In TTEthernet, the clock synchronization algorithm is executed in rounds, called the integration cycles. Fig. 3 shows an example scenario with a fast and a slow clock.
The x-axis depicts real time and the y-axis the internal clock time of a respective TTEthernet device. The perfect clock is plotted as a forty-five degree solid line while the fast clock is depicted as a dashed line slightly above the perfect clock and the slow clock is depicted as a dotted-dashed line slightly below the perfect clock. The figure shows for each integration cycle the divergence of the fast and slow clocks from the perfect clock and their synchronization at the beginning of each integration cycle. The drift from the perfect clock is a function of the length of the integration cycle and the drift rate of the clocks. Following literature we use R_sync for the integration cycle and ρ for the drift rate. In addition, we use a value Δerror to summarize other factors in the clock synchronization process (e.g., network jitter, inaccuracies from the clocks not perfectly executing the integration cycles at the same time), but in general we assume that Δerror will be a rather small factor compared to the real clock drift. Hence, we use the term drift offset, or drift for short, for the sum of deviations of a clock from the perfect clock within one integration cycle.
$$\mathit{drift} = R_{sync} \times \rho + \Delta\mathit{error} \qquad (1)$$
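As a quick numeric illustration of Equation 1, the helper below evaluates the drift offset for one integration cycle. The function name and the sample values (a 10 ms integration cycle and a drift rate of 100 ppm) are assumptions chosen for this example and are not taken from the paper.

```python
def drift_offset(r_sync_s, rho, delta_error_s=0.0):
    """Drift offset per integration cycle, following Equation 1.

    r_sync_s:      length of the integration cycle in seconds
    rho:           drift rate of the clock (dimensionless, e.g. 100e-6)
    delta_error_s: summarized other error sources in seconds
    """
    return r_sync_s * rho + delta_error_s


# Illustrative values only: 10 ms cycle, 100 ppm drift, no extra error.
example = drift_offset(0.010, 100e-6)
```

Under these assumed values the clock deviates by about 1 μs from the perfect clock within one integration cycle, before any Δerror contribution is added.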
In TTEthernet we are interested in the precision of the non-faulty clocks, where the precision is defined as the maximum difference between any two non-faulty clocks in the system. Now, in order to determine the precision we do not need the actual clock readings, but only the sequences of their differences to the perfect clock. Fig. 4 shows this approach of modeling time to verify the precision. The x-axis represents real time, the y-axis represents the clock time deviations from the perfect clock. In each step from even to odd x values the drift offset for an integration cycle i is added. At each even x value we see the maximum offset after the execution of the clock synchronization algorithm. Note that the x-axis is therefore not equally spaced with respect to real time: the drift offset step simulates the drift over the integration cycle, while the execution time of the clock synchronization algorithm is only a fraction of the integration cycle length.
3.2 Simulation with SAL
The design of distributed algorithms is notoriously difficult, in particular in the case of fault-tolerant algorithms. The interactions of the components are hard to trace and even trivial interdependencies may not be obvious when designing an algorithm on paper only. Hence, the analysis of new algorithms by means of computer-aided verification and simulation is increasingly becoming state of the art. Simulation in particular allows one to explore an algorithm's behavior in a very early phase of its design.
In addition to pure simulations, the model-checker approach allows us to use "wildcards" for which the tool is free to assign non-deterministic values. Hence, instead of a single simulation run that takes as input a specific test vector and analyzes the system behavior under this test, the model checker approach systematically searches the state space for all possible evaluations for each wildcard.
As we will discuss for the following two algorithms, simulation with SAL is used to add new functionality to the TTEthernet clock synchronization algorithm and to explore its behavior.
3.3 Formal Proof with SAL
As we gain more and more trust in the design of our algorithm, we also aim to formally prove some properties of interest. This is the true power of our formal framework: while we immediately deduce information through simulation, we can almost seamlessly switch to formal verification.
The proof of a property P is done by k-induction [11], which is a generalized form of regular induction and consists of the following stages [12]:
• Base Case: Show that all the states reachable from I in no more than k − 1 steps satisfy P.
• Induction Step: For all trajectories σ_0 → ... → σ_k of length k, show that σ_0 ⊨ P ∧ ... ∧ σ_{k−1} ⊨ P ⇒ σ_k ⊨ P.
k-induction is a powerful verification tool, but it can be directly applied only to relatively simple state-transition systems. For more complex systems we need to construct a system-level abstraction. Formally, a system-level abstraction A is also a state-transition system of the form (S_A, X, →_A), where S_A is a set of abstract states Σ_i and X ∈ S_A is the initial abstract state. Furthermore, →_A is a set of transitions between two abstract states. The system-level abstraction has to fulfill the following properties.
• Each system state σ_i is described by at least one abstract state Σ_j.
• The initial abstract state X describes at least one initial system state.
• For each transition in → that brings the system from a system state σ_1 to a system state σ_2 there either exists an abstract transition in →_A such that σ_1 is described by the abstract state before the abstract transition and σ_2 is described by the abstract state after the abstract transition has been taken, or, if such an abstract transition does not exist, then the abstract state describing σ_1 must also describe σ_2.
Using the abstraction approach, the formal verification of a property □P is done in two steps. In the first step we verify that the abstraction correctly represents the model M (i.e., it satisfies the properties listed above): M ⊨ A, and in the second step we verify that the model M together with the abstraction A satisfies □P: M ∧ A ⊨ □P.
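To make the base case and induction step of k-induction concrete, the following sketch checks an invariant on a small, explicitly enumerated transition system. This is only a toy illustration: sal-inf-bmc works symbolically on infinite-state models, whereas the function `k_induction`, its arguments, and the example system below are assumptions invented for this example.

```python
def k_induction(states, initial, successors, prop, k):
    """Try to prove the invariant prop by k-induction on a finite system.

    states:     iterable of all system states
    initial:    iterable of initial states
    successors: function mapping a state to its successor states
    prop:       predicate P on states
    k:          induction depth (k >= 1)
    """
    # Base case: all states reachable in at most k - 1 steps satisfy P.
    frontier = set(initial)
    for _ in range(k):
        if any(not prop(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in successors(s)}
    # Induction step: collect the last states of all trajectories that
    # consist of k consecutive P-states (starting from ANY state).
    ends = {s for s in states if prop(s)}
    for _ in range(k - 1):
        ends = {t for s in ends for t in successors(s) if prop(t)}
    # Every successor of such a trajectory must satisfy P again.
    return all(prop(t) for s in ends for t in successors(s))


# Toy system: a reachable cycle 0 -> 1 -> 2 -> 0 and an unreachable
# chain 3 -> 4 -> 5 that ends in the "bad" state 5.
succ = {0: [1], 1: [2], 2: [0], 3: [4], 4: [5], 5: [5]}
not_bad = lambda s: s != 5
```

In this toy system the property "state 5 is never reached" is not inductive for k = 1 or k = 2, because the unreachable states 3 and 4 lead into state 5, but it becomes provable with k = 3. A returned `False` only means that the proof attempt at this depth failed; it does not by itself show that the property is violated.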
We illustrate this abstraction method using the example of the verification of the TTEthernet clock synchronization algorithm as verified in [4] . In this model each SM is described by a state machine and all state machines are executed synchronously. For simplicity, we assume that each of these state machines has only two variables, SM_state and SM_clock, where SM_state is either sync or send and SM_clock keeps track of the divergence from the perfect clock. The current system state is simply the sum of all of the current local states of the SMs.
Fig. 5 depicts a system-level abstraction for TTEthernet that fulfills the abstraction properties listed above. In this case the abstraction is very simple and consists only of the two abstract states SMALL and BIG. BIG is an abstract state that requires all SMs to be in the sync state at the same time while in the abstract state SMALL, all SMs must be in the send state. Furthermore, precision will be bounded by some real constant FACTOR_small times max(drift) in the SMALL abstract state and by some other real constant FACTOR times max(drift) in the BIG state. FACTOR_small < FACTOR holds and both numbers are derived manually or by re-running the model checking until no counterexamples are produced. These numbers depend on the number and type of failures present in the system and for TTEthernet FACTOR is between 2 and 4 (hence, the precision in TTEthernet is in [2 × max(drift), 4 × max(drift)]).
It is easy to see that all SMs are consistently either in sync or send, given that all SMs start in the same state and change state synchronously. The resulting error of this synchronous approximation is covered in the term Δerror of Equation 1. Note that, as we already specify the precision in the abstraction, proving abstraction A makes the proof of the property □P trivial. This results in high verification times for A and negligible ones for □P, as reported in [4] and in the following sections for the new algorithms. We will discuss in the following two example algorithms how the basic TTEthernet clock synchronization model is updated with new system-level abstractions to derive automated and integrated formal proofs.
4 Diagnosis Algorithm
The TTEthernet clock synchronization algorithm is inherently fault-tolerant. However, the synchronization quality decreases with the number of faulty components and the severity of their failure modes. The diagnosis algorithm presented in this section aims to detect faulty TTEthernet devices, in particular faulty CMs, and remove them from the TTEthernet clock synchronization algorithm. By doing so, the failure mode of a faulty CM is transformed from an inconsistent-omission failure mode to a fail silent failure mode and we can formally verify that the diagnosis algorithm improves the precision in the system.
4.1 Algorithm Specification
The diagnosis algorithm is based on a simple accusation protocol presented by Algorithm 4 and Algorithm 5.
Algorithm 4 Diagnosis Algorithm executed by SM i
1: if CM_clock_j ≠ ⊥ then
2:     active[j] ← TRUE
3: end if
4: for j = 1 → |CM| do
5:     if CM_clock_j = ⊥ ∧ active[j] then
6:         accused[i][j], AC_i.accused[j] ← TRUE
7:     end if
8: end for
Algorithm 4 is executed in the SMs immediately after the clocks are corrected (see Fig. 2 on the temporal dependencies of the algorithms to each other). It starts with each SM recording those CMs from which it receives PCFs (lines 1-3) using an array active of boolean variables. The symbol ⊥ denotes the absence of a PCF; if the clock of CM j is not absent (hence, the SM received a PCF from CM j), the respective active[j] is set to TRUE.
In the remaining lines (4-8), SM i checks for each CM j whether it has been active before but did not deliver a PCF in the current integration cycle. If this is the case, SM i accuses CM j of being omission faulty. For simplicity, we assume that this accusation information is stored in a local accusation matrix accused indexed by the SMs and CMs. Furthermore, SM i informs all other SMs of its accusation by sending an accusation message AC_i, where AC_i.accused is a vector of boolean variables with each boolean representing a unique CM. SM i will set AC_i.accused[j] if it accuses CM j. Again, for simplicity, we assume that the AC messages are sent as rate-constrained traffic on all redundant channels in a TTEthernet system. The TTEthernet network thus ensures that the AC messages are delivered with a known upper bound in time and are transported over at least one non-faulty channel. Furthermore, we assume that the exchange of the accusation information happens sufficiently prior to the next execution of the TTEthernet clock synchronization algorithm.
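The accusation step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the formal model: indices are zero-based, an absent PCF is represented by None, and the function name diagnose is hypothetical.

```python
from typing import List, Optional


def diagnose(i: int,
             cm_clock: List[Optional[float]],
             active: List[bool],
             accused: List[List[bool]]) -> List[bool]:
    """One run of the accusation step on SM i.

    cm_clock[j] is the compressed clock received from CM j in this
    integration cycle, or None if no PCF arrived (assumed encoding).
    Returns the vector carried by the accusation message of SM i.
    """
    ac_accused = [False] * len(cm_clock)
    for j, clock in enumerate(cm_clock):
        if clock is not None:
            # A PCF from CM j arrived: remember that CM j has been active.
            active[j] = True
        elif active[j]:
            # CM j was active before but is silent now: accuse it locally
            # and in the outgoing accusation message.
            accused[i][j] = True
            ac_accused[j] = True
    return ac_accused
```

The two loops of the pseudocode are merged here, which does not change the result because a CM that delivered a PCF in the current cycle can never satisfy the accusation condition.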
Algorithm 5 Accusation Message Reception in SM i
1: if receives(AC_k) then
2: for j = 1 → |CM| do
3: if AC_k.accused[j] then
4: accused[k][j] ← TRUE
5: end if
6: end for
7: end if
Algorithm 5 is executed by an SM i that receives an accusation message AC_k from an SM k: when a boolean variable AC_k.accused[j] is TRUE, SM i sets the corresponding local accused[k][j] to TRUE as well. Each SM thus uses the matrix accused to locally store all accusations from all SMs.
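A minimal sketch of this reception step, again with zero-based indices and a hypothetical function name, could look like this:

```python
from typing import List


def receive_accusation(k: int,
                       ac_accused: List[bool],
                       accused: List[List[bool]]) -> None:
    """Merge the accusation vector sent by SM k into the local matrix.

    Accusations are only ever set, never cleared, which mirrors the
    pseudocode: a later message without an accusation leaves earlier
    entries untouched.
    """
    for j, flag in enumerate(ac_accused):
        if flag:
            accused[k][j] = True
```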
Once an SM k accuses a CM or receives sufficient accusations for a CM, it will stop using the PCFs from this CM. Hence, the selection function presented in Algorithm 2 will also take the accused matrix into account when returning the set of CM_clock values, such that every CM_clock_j that is accused by a sufficiently high number, z, of SMs is excluded. We can update line 7 in Algorithm 2 accordingly:
$$\{\, CM\_clock_j \mid pcf\_membership\_new > accept\_threshold \;\wedge\; CM\_clock_j \Rightarrow \neg accused[k][j] \;\wedge\; \neg\exists\, i_1, \ldots, i_z : accused[i_1][j] \wedge \ldots \wedge accused[i_z][j] \,\}$$
An alternative realization to modifying the selection function is the deactivation of the communication link that connects the SM to the faulty CM.
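The effect of the accusation matrix on the selection can be sketched as follows. The sketch covers only the accusation-related filter; the membership check against the accept threshold from Algorithm 2 is omitted, and the function names, the zero-based indices, and the None encoding for a missing PCF are assumptions made for illustration.

```python
from typing import List, Optional


def count_accusers(accused: List[List[bool]], j: int) -> int:
    """Number of SMs that currently accuse CM j."""
    return sum(1 for row in accused if row[j])


def usable_cms(k: int,
               cm_clock: List[Optional[float]],
               accused: List[List[bool]],
               z: int = 2) -> List[int]:
    """Indices of the CMs whose compressed clocks SM k still uses.

    A CM is dropped if SM k accuses it itself or if at least z SMs
    accuse it; z = 2 matches the single-failure hypothesis of the text.
    """
    return [j for j, clock in enumerate(cm_clock)
            if clock is not None
            and not accused[k][j]
            and count_accusers(accused, j) < z]
```

With five SMs and two CMs, a single accusation of the first CM removes it only for the accusing SM; a second accusation from another SM removes it for all SMs.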
Under a single-failure fault hypothesis, as assumed in this paper, z = 2 is sufficient and necessary to compensate for a faulty SM that may arbitrarily accuse CMs. In this case, an SM that receives two accusations for a CM can be certain that one of the accusations stems from a correct SM. Furthermore, as either an SM or a CM is faulty at any point in time, a faulty SM excludes the presence of a faulty CM and, hence, even accusations of a faulty SM are distributed by all CMs consistently.
4.2 Verification Procedure and Results
We have extended the basic model of the TTEthernet clock synchronization algorithm by the functionality of the diagnosis algorithm as presented in this section. Using our formal framework we can start by simulating the diagnosis and clock synchronization algorithms together. Fig. 6 depicts such a simulation outcome of a scenario in a system of five SMs and two CMs where CM 1 is faulty in such a way that it may accept only a subset of SM_clock values. Hence, in general the compressed clocks, CM_clock_j, produced by the CMs will be different.
Fig. 6 plots the divergence of the clock times from real time as described for Fig. 4. In this scenario the clocks of SM 1-3 (denoted by Clock 1-3) have positive drift of 10 time units while the clocks of SMs 4 and 5 (denoted by Clock 4 and 5) have negative drift of 10 time units. In the first integration cycle all SMs receive PCFs from both CMs. In the second integration cycle all SMs except SM 2 receive PCFs from CM 1 (at x=3). Consequently, SM 2 will correct its clock slightly differently than the remaining SMs (at x=4). As SM 2 did not receive a PCF, it accuses CM 1 and will no longer accept PCFs from CM 1. As long as CM 1 does not fail to send a PCF to one of the other SMs, SM 2 always deviates from the remaining SMs after clock correction. However, in the fifth integration cycle (at x=9), CM 1 does not send a PCF to SM 3, which in turn also accuses CM 1. As SM 2 and SM 3 now both accuse CM 1 of being faulty, all SMs exclude CM_clock_1 from clock synchronization. Finally, from the sixth integration cycle on (at x=11) all SMs will only use CM_clock_2 for clock synchronization and the inconsistent-omission failure mode of CM 1 is transformed into a fail-silent failure mode.
Simulation traces such as the one depicted above are valuable during the design of algorithms. However, our formal framework also allows us to formally verify the correctness of the diagnosis algorithm. For this we replace the system-level abstraction with a new one depicted in Fig. 7. As shown, the abstraction extends the one of the TTEthernet clock synchronization algorithm with two abstract system states SMALL_acc and BIG_acc. This extension is a good example of how intuitive the abstraction process is: as long as the number of accusations is insufficient, the system transitions between SMALL and BIG. Once the accusations reach the threshold, SMALL_acc and BIG_acc are entered and the SMs transition between these two abstract states. The difference between the first two abstract states and the latter ones is that the precision improves in SMALL_acc and BIG_acc (after a delay of one integration cycle).
The diagnosis algorithm ensures that once the faulty CM is detected by a sufficiently high number of SMs, the exclusion of the faulty CM improves the precision from | x max(drift) to 2 x max(drift).
Theorem 2.
$$\exists\, i_1, \ldots, i_z : accused[i_1][j] \wedge \ldots \wedge accused[i_z][j] \;\Rightarrow\; precision < 2 \times max(drift)$$
Proof. Theorem 2 has been proven by model checking using sal-inf-bmc, k-induction and the abstraction as depicted in Fig. 7 with the following performance characteristics (k represents the depth of the induction base and time the verification time in seconds):
[Table not reproduced: verification performance, k and time in seconds]
4.3 Algorithm Discussion
The simple diagnosis algorithm can be extended in several ways, some of which we discuss next. However, in this paper we do not consider these extensions in our analysis, as our prime focus is on the demonstration of an integrated proof when layering a diagnosis service on top of an established clock synchronization algorithm rather than on analyzing all variants of the diagnosis approach.
As a first extension, the SM may accuse a CM only after a configurable number of lost PCFs per configured time interval. This would reduce the probability that a CM is accused because of a transient error or because of a bit error as the Ethernet frame is transported over the communication link. Secondly, for the same reasons the accusation may be reset in all SMs to allow an accused CM to rejoin the TTEthernet clock synchronization algorithm. Lastly, the SMs may also take statistics on the number of lost application Ethernet frames into account when determining whether to accuse a CM or to remove an accusation.
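The first extension can be illustrated with a sliding window over the most recent integration cycles. This is only one possible realization: the class name, the window measured in integration cycles, and the choice to accuse as soon as the loss count reaches the threshold are assumptions and are not part of the analyzed algorithm.

```python
from collections import deque
from typing import Deque


class LossWindow:
    """Counts lost PCFs of one CM over the last `window` integration cycles."""

    def __init__(self, max_losses: int, window: int) -> None:
        self.max_losses = max_losses
        # True marks a cycle in which the PCF was lost; old cycles fall
        # out automatically once the deque is full.
        self.history: Deque[bool] = deque(maxlen=window)

    def record(self, pcf_received: bool) -> bool:
        """Record one cycle; return True if the CM should be accused."""
        self.history.append(not pcf_received)
        return sum(self.history) >= self.max_losses
```

A single lost PCF, for example due to a bit error, then no longer triggers an accusation when max_losses is set to 2 or more.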
5 Rate-Correction Algorithm
In this section we present the clock rate-correction algorithm that can be implemented as a layer on top of the TTEthernet clock synchronization algorithm. The rate-correction algorithm records the clock state-correction values for a configurable number of integration cycles. It then calculates an average of the corrected values and changes the rate of the clocks by a configurable percentage of this average. In any case the change of rate is bounded by the maximum drift offset max(drift) from a perfect reference time.
5.1 Algorithm Specification
Algorithm 6 is executed in each SM after the clocks have been corrected by the TTEthernet state-correction algorithm (Alg. 2, 3 in Fig. 2). It consists of an observation phase (lines 1-3) and the correction phase (lines 4-12). In line 13 the integer variable cycle is increased, which we use to count the integration cycles.
The rate-correction algorithm starts with the observation phase, in which the actual correction values calculated by the TTEthernet clock synchronization algorithm are stored for each integration cycle. To store them, we use the array drift_obs[cycle] of real values.
Algorithm 6 Rate-Correction Algorithm executed by SM i
1: if cycle < rate_obs_nr then
2: drift_obs[cycle] ← act_corr
3: end if
[Lines 4-7 not reproduced (figure imgf000018_0001): start of the correction phase, arithmetic mean of the recorded correction values, upper bound on corr]
8: else if corr < −max(drift) then
9: corr = −max(drift)
10: end if
11: clock_rate ← clock_rate − corr
12: end if
13: cycle ← cycle + 1
The observation phase completes after a configurable number of integration cycles rate_obs_nr and the correction phase starts (line 4) .
In the correction phase, the intermediate correction value that the algorithm first calculates is the arithmetic mean of the individual correction values (line 5). If the mean exceeds the configured maximum drift offset that a non-faulty clock would exhibit (i.e., max(drift)), the correction value corr is reduced to these bounds (lines 6-10). Finally, after the correction value is calculated and bounded, it is used to correct the current rate of the local clock (line 11). Although it is not depicted in Algorithm 6, only a pre-configured percentage of the correction value may be used to correct a clock's rate.
Note that the modification of a clock's rate does not necessarily demand a change of the physical oscillator frequency; rather, the number of oscillator ticks per integration cycle, which is a configurable parameter, can be changed. Hence, a change of a clock's rate is equivalent to increasing or decreasing the number of oscillator ticks per integration cycle.
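The observation and correction phases can be sketched as below. The sketch rests on assumptions that the text does not fix: the correction is taken to fire exactly once, in the execution where cycle equals rate_obs_nr; the full correction value is applied rather than a percentage; and clock_rate is an abstract number whose unit (for example oscillator ticks per integration cycle) is left open. The class and method names are hypothetical.

```python
from typing import List


class RateCorrector:
    """Illustrative rate correction for one SM (not the formal model)."""

    def __init__(self, rate_obs_nr: int, max_drift: float,
                 clock_rate: float) -> None:
        self.rate_obs_nr = rate_obs_nr      # length of the observation phase
        self.max_drift = max_drift          # bound on the applied correction
        self.clock_rate = clock_rate        # abstract rate value (assumed unit)
        self.drift_obs: List[float] = []    # recorded state-correction values
        self.cycle = 0

    def step(self, act_corr: float) -> None:
        """Run once per integration cycle, after the state correction."""
        # Observation phase: record the correction applied in this cycle.
        if self.cycle < self.rate_obs_nr:
            self.drift_obs.append(act_corr)
        # Correction phase. ASSUMPTION: triggered when cycle == rate_obs_nr.
        if self.cycle == self.rate_obs_nr:
            corr = sum(self.drift_obs) / self.rate_obs_nr   # arithmetic mean
            # Bound corr to the interval [-max_drift, +max_drift].
            corr = max(-self.max_drift, min(self.max_drift, corr))
            self.clock_rate -= corr
        self.cycle += 1
```

With rate_obs_nr = 2 and two recorded corrections of +2 time units, the mean is 2 and the rate value is reduced by 2 in the third execution; later executions leave the rate unchanged under the assumptions above.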
5.2 Verification Procedure and Results
In our formal analysis of the rate-correction algorithm we use a network of five SMs and two channels, where each channel implements exactly one CM. We have extended the formal model of the TTEthernet clock synchronization algorithm with the additional functionality described in Algorithm 6. Again, we start with some simulations to gain confidence in the correctness of the design of the layered rate-correction algorithm as well as in its formal model. An example scenario is presented in Fig. 8.
Fig. 8 plots the divergence of the clock times from real time as introduced in Fig. 4. Clocks 1 to 3 have a positive drift, while clocks 4 and 5 have a negative drift. The first two integration cycles are configured as the observation phase in which the nodes record their clock correction values. After the second integration cycle, the clocks calculate the corr value as specified in Algorithm 6 and adapt the rate of their clocks. In the scenario of Fig. 8 corr does not exceed max(drift) and, as depicted, from the third integration cycle onwards all clocks are almost perfectly aligned.
The scenario discussed above is certainly idealized, as it rests on strong assumptions such as perfectly stable clock drifts. In reality, this will hardly be the case. Fig. 9 shows a scenario with unstable clocks and resulting changing drift rates. Here, during the first integration cycle, clocks 1 to 3 have positive drift while clocks 4 and 5 have negative drift. Analogously to the stable drift scenario, clocks 4 and 5 are the only clocks that correct their clock state. At the end of integration cycle two, clocks 4 and 5 have established corr = 2 (as they corrected +2 time units in each of the first two integration cycles). Now the drift of the clocks changes in such a way that clocks 1 to 3 drift in the negative direction while clocks 4 and 5 drift in the positive direction. Consequently, the correction value that clocks 4 and 5 apply adds to the now positive drift and leads to an increase in the precision value in the system.
To formally verify properties about the rate-correction algorithm we define the system-level abstraction as depicted in Fig. 10. The graph is essentially the same as for the diagnosis abstraction, however the underlying abstract states and transitions are, of course, different. The abstraction consists of four abstract states SMALL, BIG, SMALL_rate, and BIG_rate. SMALL and BIG represent the system during the observation phase, while SMALL_rate and BIG_rate represent the system when the clock rates are adapted. Again, the system-level abstraction very naturally reflects the algorithm phases.
We use this abstraction to verify the precision under the condition of unstable clock drifts. As discussed in the scenario of Fig. 9, the precision may become larger when the drift of the clocks changes in the same direction as the rates are corrected to. The general theorem is as follows: the rate-correction algorithm ensures that, even under arbitrarily changing drift rates within the specified drift range, and in the presence of an inconsistent-omission faulty CM, the overall precision is bounded by 8/3 x 2 x max(drift).
Theorem 3.
$$precision < \frac{8}{3} \times 2 \times max(drift)$$
Proof. Theorem 3 has been proven by model checking using sal-inf-bmc, k-induction and the abstraction as depicted in Fig. 10 with the following performance characteristics (k represents the depth of the induction base and time the verification time in seconds):
[Table not reproduced: verification performance, k and time in seconds]
□
5.3 Algorithm Discussion
The rate-correction algorithm is a simple means to improve the precision in a system when the drift rates of the clocks can be assumed to be stable to some degree. However, even if they are not stable the rate-correction algorithm can improve the precision if the rate-correction algorithm is executed periodically and the change of the drift is relatively slow compared to the frequency of execution of the rate-correction algorithm.
As TTEthernet is intended as an integrative network for mixed-criticality systems, it may also be the case that some nodes of a network are more affected by physical processes, like heat, than others. In this case, the system architect may configure the more affected nodes as synchronization clients, which only passively synchronize to the TTEthernet timeline as generated by the SMs and CMs. Even further, the system architect may decide to run the rate-correction algorithm on the synchronization clients more frequently than on the SMs.
The location of a node within the network can also influence the design decision on how often to run the clock-rate correction algorithm. Systems that are in spatial proximity to physical processes with varying temperature ranges, e.g., motor control, may be required to run the rate-correction algorithm frequently. Other systems may adjust their rate only after initial synchronization.
The detailed discussion of these architectural decisions is outside the scope of this paper, and we aim at an evaluation in the context of a real-world application.
The rate-correction algorithm is executed in the SMs. Although the CMs may adjust their clock rate to the SMs as well, the formal assessment of such configurations is outside the scope of this paper, and we plan to explore this behavior in future work.
6 Conclusion
This paper has introduced new distributed algorithms which can be implemented as layers on top of the TTEthernet clock synchronization protocol. We have presented a diagnosis algorithm and a clock rate-correction algorithm. The diagnosis algorithm follows a simple accusation protocol and aims to identify failure scenarios in which a faulty CM inconsistently distributes synchronization information. When such a faulty CM is diagnosed, the non-faulty SMs consistently discard all synchronization information from the faulty CM. As a result of the diagnosis algorithm, the precision in the synchronized network improves, and we have presented formal evidence for that. The clock rate-correction algorithm is executed in each of the SMs and continually records the clock state-correction values that the TTEthernet clock synchronization protocol calculates. After a configurable number of consecutive measurements all SMs use these measurements to adapt the rates of their clocks. By simulation we have shown that the precision in the synchronized network improves, and we have formally verified that the precision is still bounded when the clock drifts change arbitrarily within given bounds.
For our studies we have developed a novel simulation and verification framework that allows the integrated verification of algorithms such as the ones discussed above together with the underlying clock synchronization protocol. This formal framework is based on a model checker for infinite data types, which allows real-time clocks to be modeled realistically. Furthermore, the framework enables push-button proofs that reduce the overhead of human deduction in the verification process to finding good system-level abstractions, which, as we have shown, follow quite naturally from an algorithm's behavior.
References
[1] H. Kopetz and G. Bauer, "The time-triggered architecture," Proceedings of the IEEE, vol. 91, no. 1, pp. 112 - 126, Jan. 2003.
[2] H. Kopetz, TTP/C Protocol - Version 1.0. Vienna, Austria: TTTech Computertechnik AG, Jul. 2002, Available at http://www.ttagroup.org.
[3] W. Steiner, TTEthernet Specification, TTA Group, 2008, Available at http://www.ttagroup.org.
[4] W. Steiner and B. Dutertre, "Automated formal verification of the ttethernet synchronization quality," in NASA Formal Methods, ser. Lecture Notes in Computer Science, M. Bobaru, K. Havelund, G. Holzmann, and R. Joshi, Eds. Springer Berlin / Heidelberg, 2011, vol. 6617, pp. 375-390.
[5] W. Torres-Pomales, M. R. Malekpour, and P. Miner, ROBUS-2: A Fault-Tolerant Broadcast Communication System. Hampton, Virginia, USA: Langley Research Center, 2005.
[6] M. Serafini, P. Bokor, N. Suri, J. Vinter, A. Ademaj, W. Brandstatter, F. Tagliabo, and J. Koch, "Application-level diagnostic and membership protocols for generic time-triggered systems," IEEE Trans. Dependable Sec. Comput., vol. 8, no. 2, pp. 177-193, 2011.
[7] FlexRay Communications System - Protocol Specification - Version 2.1. FlexRay Consortium, 2005, Available at http://www.flexray.com.
[8] H. Kopetz, A. Ademaj, and A. Hanzlik, "Combination of clock-state and clock-rate correction in fault-tolerant distributed systems," Real-Time Systems, vol. 33, no. 1-3, pp. 139-173, 2006.
[9] W. Steiner and B. Dutertre, "SMT-Based formal verification of a TTEthernet synchronization function," in Formal Methods for Industrial Critical Systems, ser. Lecture Notes in Computer Science, S. Kowalewski and M. Roveri, Eds., vol. 6371. Springer-Verlag, 2010, pp. 148-163.
[10] L. de Moura, S. Owre, H. Rueß, J. Rushby, N. Shankar, M. Sorea, and A. Tiwari, "Tool presentation: SAL2," in Computer-Aided Verification (CAV 2004), S. Verlag, Ed., 2004.
[11] L. de Moura, H. Rueß, and M. Sorea, "Bounded model checking and induction: From refutation to verification," in Computer-Aided Verification, CAV 2003, ser. Lecture Notes in Computer Science, A. Voronkov, Ed., vol. 2725. Springer-Verlag, 2003, pp. 14-26.
[12] B. Dutertre and M. Sorea, "Modeling and verification of a fault-tolerant real-time startup protocol using calendar automata," in Proc. of FORMATS/FTRTFT, ser. Lecture Notes in Computer Science, Y. Lakhnech and S. Yovine, Eds., vol. 3253. Springer-Verlag, Sep. 2004, pp. 199-214.
Description of Figures
Fig. 1 describes an Example TTEthernet network with n end systems and two redundant channels (each formed by a single switch).
Fig. 2 describes an overview of the TTEthernet two step clock synchronization algorithm.
Fig. 3 describes the progress in Real Time plotted against Clock Time.
Fig. 4 describes an example execution of the TTEthernet clock synchronization algorithm in presence of a faulty CM.
Fig. 5 describes a system-level abstraction for the formal proof.
Fig. 6 describes an example execution of the diagnosis algorithm as layered on top of the TTEthernet clock synchronization algorithm in presence of a faulty CM.
Fig. 7 describes a system-level abstraction for the formal proof of the diagnosis algorithm.
Fig. 8 describes a fault-free scenario of the layered rate-correction algorithm.
Fig. 9 describes a fault-free scenario of the layered rate-correction algorithm with highly varying clock drifts.
Fig. 10 describes a system-level abstraction for the formal proof of the rate-correction algorithm.

Claims

1. Method for a clock-rate correction in a network consisting of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, in which a two-step clock synchronization algorithm is realized, wherein this two-step clock synchronization algorithm comprises:
-) a first step, in which all nodes of a first subset of nodes (SM) are providing their local clock state (SM_clock) via messages to the network and wherein all nodes of a second subset of nodes (CM) are receiving these messages, and wherein the nodes of the second subset of nodes (CM) are performing a first convergence function on the local clock readings, and
-) a second step, in which the nodes of the second set of nodes (CM) transmit the result of the calculation (CM_clock) of the convergence function of the first step to the network in form of messages, and wherein nodes of a first subset of nodes (SM) and / or other nodes (SC) in the system that receive these messages from the second subset of nodes (CM) apply a second convergence function based on the timing information associated with these messages, and wherein nodes which are receiving messages from the second subset of nodes (CM) use the timing information associated with at least a subset of these messages to correct their local clocks, and wherein a node is keeping track of succeeding corrections applied to its local clock, and wherein a node changes the clock rate for a quantity that is a function of previous corrections applied to the local clock.
2. Method according to claim 1, wherein the two-step clock synchronization algorithm is the TTEthernet clock synchronization algorithm.
3. Network for carrying out a method according to claim 1 or 2, wherein the network consists of nodes, each node having a local clock, wherein the nodes are connected to each other in an arbitrary network topology, characterized in that each node is either an end system or a switch.
4. Network according to claim 3, wherein the network comprises multiple tree connections, wherein each tree is formed of a disjoint subset of switches, and wherein a subset of end systems is connected to exactly one switch of each tree of the set of redundant trees.
5. Network according to claim 3 or 4, wherein the subset of nodes that send messages in the first step of the two-step clock synchronization approach is a subset of the end systems.
6. Network according to one of the claims 3 to 5, wherein the subset of nodes that perform the first convergence function is a subset of the switches in the network.
7. Network and Method according to one of the claims 1 to 6, wherein the end systems use the diagnosis information of TTEthernet as specified in the AS6802 standard to keep track on the succeeding clock correction.
8. Network and Method according to one of the claims 1 to 7, wherein the clock rate of the local clock in each node can be changed only up to a preconfigured maximum.
PCT/AT2012/050130 2011-09-29 2012-09-06 Method for a clock-rate correction in a network consisting of nodes WO2013044281A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP12765979.5A EP2761794A1 (en) 2011-09-29 2012-09-06 Method for a clock-rate correction in a network consisting of nodes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AT14182011 2011-09-29
ATA1418/2011 2011-09-29

Publications (1)

Publication Number Publication Date
WO2013044281A1 true WO2013044281A1 (en) 2013-04-04

Family

ID=46934349

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AT2012/050130 WO2013044281A1 (en) 2011-09-29 2012-09-06 Method for a clock-rate correction in a network consisting of nodes

Country Status (2)

Country Link
EP (1) EP2761794A1 (en)
WO (1) WO2013044281A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104009893A (en) * 2014-06-16 2014-08-27 北京航空航天大学 Method suitable for monitoring inside compression master and capable of improving clock synchronization fault tolerance
WO2015031926A1 (en) * 2013-09-04 2015-03-12 Fts Computertechnik Gmbh Method for transmitting messages in a computer network and computer network
WO2016184369A1 (en) * 2015-05-15 2016-11-24 华为技术有限公司 Method for configuring clock tracking and control device
CN112491491A (en) * 2020-12-14 2021-03-12 深圳安捷丽新技术有限公司 Clock synchronization method, device, storage medium and system based on Ethernet cable
CN112583836A (en) * 2020-12-15 2021-03-30 昆高新芯微电子(江苏)有限公司 Time-triggered Ethernet time synchronization method, equipment and system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11394612B2 (en) 2019-09-16 2022-07-19 Toyota Motor Engineering & Manufacturing North America, Inc. Distributed systems and extracting configurations for edge servers using driving scenario awareness
CN111585683B (en) * 2020-05-11 2021-11-23 上海交通大学 High-reliability clock synchronization system and method for time-sensitive network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1280024A1 (en) * 2001-07-26 2003-01-29 Motorola Inc. Clock synchronization in a distributed system
US20080089363A1 (en) * 2006-10-13 2008-04-17 Honeywell International Inc. Clock-state correction and/or clock-rate correction using relative drift-rate measurements

Non-Patent Citations (17)

* Cited by examiner, † Cited by third party
Title
"FlexRay Communications System - Protocol Specification", 2005, FLEXRAY CONSORTIUM
ASTRIT ADEMAJ, HERMANN KOPETZ, PETR GRILLINGER, KLAUS STEINHAMMER, ALEXANDER HANZLIK: "Fault-Tolerant Time-Triggered Ethernet Configuration with Star Topology", 20 August 2010 (2010-08-20), www.vmars.tuwien.ac.at, XP002688667, Retrieved from the Internet <URL:http://www.sciweavers.org/publications/fault-tolerant-time-triggered-ethernet-configuration-star-topology> [retrieved on 20121203] *
B. DUTERTRE; M. SOREA: "Proc. of FORMATS/FTRTFT, ser. Lecture Notes in Computer Science", vol. 3253, September 2004, SPRINGER-VERLAG, article "Modeling and verification of a fault-tolerant real-time startup protocol using calendar automata", pages: 199 - 214
GE FANUC INTELLIGENT PLATFORMS. INFORMATION CENTERS: "TTEthernet - A Powerful NetworkSolution for AdvancedIntegrated Systems", 11 August 2009 (2009-08-11), GE Fanuc Intelligent Platforms, XP002688666, Retrieved from the Internet <URL:http://www.ge-ip.com/userfiles/file/TTNet%20WP_gft751.pdf> [retrieved on 20121203] *
H. KOPETZ: "TTP/C Protocol - Version 1.0.", July 2002, TTTECH COMPUTERTECHNIK AG
H. KOPETZ; A. ADEMAJ; A. HANZLIK: "Combination of clock-state and clock-rate correction in fault-tolerant distributed systems", REAL-TIME SYSTEMS, vol. 33, 2006, pages 139 - 173, XP019409400, DOI: doi:10.1007/s11241-006-6885-9
H. KOPETZ; G. BAUER: "The time-triggered architecture", PROCEEDINGS OF THE IEEE, vol. 91, no. 1, January 2003 (2003-01-01), pages 112 - 126, XP011065101
HERMANN KOPETZ: "Real-Time Systems. Design Principles for Distributed Embedded Applications Second Edition", REAL-TIME SYSTEM SERIES, January 2011 (2011-01-01), Springer Science+Business Media, pages 66 - 73, XP002688665, ISSN: 1867-321X, ISBN: 978-1-4419-8236-0, Retrieved from the Internet <URL:https://vowi.fsinf.at/images/temp/2/2c/20110606133809!TU_Wien-Echtzeitsysteme_VO_%28Kopetz%29_-_TU_Wien-Echtzeitsysteme_VO_%28Kopetz%29_-_TU_Wien-Echtzeitsysteme_VO_%28Kopetz%29_-_Real_Time_Systems_-_Design_Principles_for_Distributed_Embedded_Applications_--_Hermann_Kopetz_--_2._Edition.pdf> [retrieved on 20121203], DOI: 10.1007/978-1-4419-8237-7 *
L. DE MOURA; H. RUESS; M. SOREA: "Computer-Aided Verification, CAV 2003, ser. Lecture Notes in Computer Science", vol. 2725, 2003, SPRINGER-VERLAG, article "Bounded model checking and induction: From refutation to verification", pages: 14 - 26
L. DE MOURA; S. OWRE; H. RUEF3; J. RUSHBY; N. SHANKAR; M. SOREA; A. TIWARI: "Computer-Aided Verification (CAV 2004", 2004, S. VERLAG, ED., article "Tool presentation: SAL2"
M. SERAFINI; P. BOKOR; N. SURI; J. VINTER; A. ADEMAJ; W. BRANDSTATTER; F. TAGLIABO; J. KOCH: "Application-level diagnostic and membership protocols for generic time-triggered systems", IEEE TRANS. DEPENDABLE SEC. COMPUT., vol. 8, no. 2, 2011, pages 177 - 193, XP011342508, DOI: doi:10.1109/TDSC.2010.23
See also references of EP2761794A1 *
W. STEINER: "TTEthernet Specification", 2008, TTA GROUP
W. STEINER; B. DUTERTRE: "Formal Methods for Industrial Critical Systems, ser. Lecture Notes in Computer Science", vol. 6371, 2010, SPRINGER-VERLAG, article "SMT-Based formal verification of a TTEthernet synchronization function", pages: 148 - 163
W. STEINER; B. DUTERTRE: "NASA Formal Methods, ser. Lecture Notes in Computer Science", vol. 6617, 2011, SPRINGER, article "Automated formal verification of the ttethernet synchronization quality", pages: 375 - 390
W. STEINER; B. DUTERTRE: "NASA Formal Methods, ser. Lecture Notes in Computer Science", vol. 6617, part III April 2011, SPRINGER, ISBN: 978-3-642-20397-8, article "Automated formal verification of the ttethernet synchronization quality", pages: 375 - 390, XP002688664, DOI: 10.1007/978-3-642-20398-5_27 *
W. TORRES-POMALES; M. R. MALEKPOUR; P. MINER: "ROBUS-2: A Fault-Tolerant Broadcast Communication System", 2005, LANGLEY RESEARCH CENTER

Also Published As

Publication number Publication date
EP2761794A1 (en) 2014-08-06

Similar Documents

Publication Publication Date Title
WO2013044281A1 (en) Method for a clock-rate correction in a network consisting of nodes
Steiner et al. Automated formal verification of the TTEthernet synchronization quality
Arvind Probabilistic clock synchronization in distributed systems
EP3185481B1 (en) A host-to-host test scheme for periodic parameters transmission in synchronous ttp systems
EP1900127B1 (en) Safe start-up of a network
US7979730B2 (en) Method and device for synchronizing cycle time of a plurality of TTCAN buses based on determined global time deviations and a corresponding bus system
Rushby An overview of formal verification for the time-triggered architecture
US10025344B2 (en) Self-stabilizing distributed symmetric-fault tolerant synchronization protocol
Steiner et al. SMT-Based formal verification of a TTEthernet synchronization function
Steiner et al. The TTEthernet synchronisation protocols and their formal verification
Steiner et al. Layered diagnosis and clock-rate correction for the TTEthernet clock synchronization protocol
Johansson et al. Heartbeat bully: failure detection and redundancy role selection for network-centric controller
EP2761795B1 (en) Method for diagnosis of failures in a network
Pfluegl et al. A new and improved algorithm for fault-tolerant clock synchronization
Ammar et al. Formal verification of Time-Triggered Ethernet protocol using PRISM model checker
Sheena et al. A review on formal verification of basic algorithms in time triggered architecture
Pfeifer Formal methods in the automotive domain: The case of TTA
Malekpour A self-stabilizing hybrid fault-tolerant synchronization protocol
Barroso-Fernández et al. Optimizing Gossiping for Asynchronous Fault-Prone IoT Networks with Memory and Battery Constraints
Peng et al. A Distributed TSN Time Synchronization Algorithm with Increased Tolerance for Failure Scenarios
Tang et al. Safe Clock Synchronization Mechanism for Multi-Cluster TTEthernet Networks
Sinha et al. Modular composition of redundancy management protocols in distributed systems: An outlook on simplifying protocol level formal specification and verification
CN114978926B (en) Simulation method and equipment suitable for deterministic network protocol
Godary et al. Temporal bounds for TTA: Validation
Azim et al. Resolving state inconsistency in distributed fault-tolerant real-time dynamic tdma architectures

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 12765979

Country of ref document: EP

Kind code of ref document: A1

WWE WIPO information: entry into national phase

Ref document number: 2012765979

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE