Tuesday, June 26, 2007

Clock Initialization in HT Technology

The receive FIFO in each device must be able to absorb timing differences between the transmit and receive clocks. Data is written into the FIFO in the transmit clock domain and read in the receive clock domain.

The design and operation of this FIFO must account for the dynamic variations in phase between the transmit clock domain (Tx Clock Out) and the receive clock domain (Rx Clock). The FIFO depth must be large enough to store all transmitted data until it has been safely read into the receive clock domain. The separation between the write pointer, which marks the location where FIFO data is written, and the read pointer, which marks the location from which it is read (the write-to-read separation), must be large enough to ensure that each FIFO location can be read safely into the receive clock domain.

The deassertion of the incoming CTL/CAD signals across a rising CLK edge is used in the transmit clock domain within each receiver to initialize the write (load) pointer. The same deassertion of the CTL and CAD signals is read from the FIFO synchronously with the receive clock domain and used to initialize the read (unload) pointer. The separation between the write and read pointers is calculated based on the worst-case variation between the transmit and receive clocks.

Note also that CTL cannot be used to initialize the pointers for byte lanes other than 0 in a multi-byte link, because CTL only exists within the byte 0 transmit clock domain.



Synchronous Clock Mode

The specification requires that all HT devices support the synchronous clock mode. This mode is the least complicated method of transferring data from transmitter to receiver. Synchronous clock mode requires that the transmit clock and receive clock have the same source, and operate at the same frequency. If we were to assume that the transmit clock and the receive clock always remained synchronized, then a simple clocking interface could be used as described in the following example.

A Conceptual Example

In this synchronous example, the transmit clock (Tx Clock) and receive clock (Rx Clock) are presumed to be in synchronization. Note, however, that source synchronous clocking requires that Transmit Clock Out (Tx Clk Out) be 90° phase shifted from Tx Clock. In this example all other sources of transmit to receive clock variation are ignored, including the expected clock drift associated with PLLs.

The transmitter delivers data synchronously across the link using the transmit clock. Tx Clock Out is sourced later and lags the data by 90° (or one-half bit time), thereby centering the clock edge in the middle of the valid data interval. When the data arrives at the receiver it is clocked into the FIFO using Tx Clock Out. Note that the clocked FIFO has two entries, which provides a separation of 1 between Tx Clock Out and Rx Clock. Data written into the FIFO during clock 1 would not be read from the FIFO using Rx Clock until clock 2. This one entry separation (called write-to-read separation) permits time for the sample to be stored prior to being read (i.e. the FIFO entry is not being written to and read from in the same clock cycle). In short, two FIFO entries are sufficient to provide the separation needed to ensure that data is safely stored and transferred into the receive clock domain.
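The two-entry case can be sketched in a few lines of Python. This is only a conceptual model under the same idealized assumption as the example (no phase variation at all); the function and variable names are illustrative and do not come from the specification.

```python
# Conceptual model of the two-entry receive FIFO described above.
# All names here are illustrative assumptions, not specification terms.
DEPTH = 2        # two FIFO entries
SEPARATION = 1   # write-to-read separation of one entry


def run_two_entry_fifo(samples):
    """Write one sample per clock and read it back one clock later."""
    fifo = [None] * DEPTH
    received = []
    # One extra clock is needed to drain the last entry written.
    for clock in range(len(samples) + SEPARATION):
        write_ptr = clock % DEPTH
        read_ptr = (clock - SEPARATION) % DEPTH
        if clock >= SEPARATION:
            # Rx Clock side: the entry being read is never the one being written.
            assert read_ptr != write_ptr
            received.append(fifo[read_ptr])
        if clock < len(samples):
            # Tx Clock Out side: store the sample that arrived on this clock.
            fifo[write_ptr] = samples[clock]
    return received


print(run_two_entry_fifo(["d0", "d1", "d2", "d3"]))
```

Data written during one clock is read during the next, so the samples come out in order and no entry is ever written and read in the same clock.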

However, in the real world many factors contribute to timing differences between the transmit and receive clock that are potentially significant, even though the clocks originate from the same source. These real world perturbations result in somewhat more complicated implementations that must account for and manage the worst case variation between the transmit and receive clocks. Specifically, the specification describes the receive FIFO implementation for handling the variation between the transmit and receive clocks.

Sources of Transmit and Receive Clock Variance

The specification defines and details the sources of transmit and receive clock variation that can exist. These clock differences can create FIFO overflow or underflow if not identified and taken into account. The clock differences can be attributed to two different categories or sources:

  • Invariant sources — components that represent a constant phase shift between the transmit and receive clock domain.

  • Variant sources — dynamic variations in the transmit and receive time domain (these phase variations can occur even though both transmit and receive clock are running at the same frequency).

In some cases the sources of clock variation can accumulate, causing the variation to increase over time. However, every source of clock variation is naturally limited in the maximum amount of change it can produce. For example, a PLL is designed to produce an output clock that is synchronized with the input source clock, but with certain limitations: the variation of the output clock is specified not to exceed a certain phase shift. The time over which the clock phase may change can be relatively short or much longer, depending upon conditions. The sources of clock variance are assessed in order to determine a FIFO size that can absorb the worst-case clock variation. That worst case would occur only if all sources of clock variation reached their extremes simultaneously, a very unlikely circumstance.

This chapter discusses the variant and invariant sources of transmit clock to receive clock variance. It also provides an example timing budget for each source.

Invariant Sources

The time-invariant factors contribute a small proportion of the overall clock variance. The invariant factors include:

  1. Cross-byte skew in multi-byte link implementations

  2. Sampling Error

Cross-byte skew in multi-byte link implementations

Differences in the arrival of Tx Clock Out at the receiver (CLKIN) between byte lanes are caused by path length mismatch. This constant skew is termed Tbytelaneconst in the specification. The specification allows up to 1000ps for this skew. Consequently, when multiple bytes are clocked into the FIFO, the maximum skew could result in one of the bytes being clocked into the FIFO 1000ps later than the associated bytes. Thus, when the associated bytes are clocked out of the FIFO by Rx Clock, one byte that arrived late may be left behind. This problem is solved by adding entries to the FIFOs to handle the maximum lane-to-lane skew, ensuring that all associated bytes are clocked out at the same time. Note that lane-to-lane skew may also change due to the effects of temperature, voltage change, etc. This changing component, called Tbytelanevar, is included in the variant source list.

Sampling Error

The read pointer is subject to uncertainty due to CTL sampling error in the receive clock domain (one device-specific Rx Clock bit time). The specification does not specifically define the source of this sampling error, but it is likely caused by phase variations between Tx Clock Out and Rx Clock that could cause a sample to be missed. Adding an additional bit time solves this problem.

Variant Sources

The phase difference between the transmit and receive clock may change significantly due to dynamic factors such as:

  1. Reference Clock Distribution Skew

  2. PLL Variation in Transmitter and Receiver

  3. Transmitter and Link Transfer Variation

  4. Receiver Transfer Variation

  5. Dynamic Cross Byte Lane Variation

All time variant parameters must be considered in terms of their worst-case variance. The total dynamic phase variation due to these factors is called Tvariant. Additionally, the transmit clock could either LEAD the receive clock by Tvariant or it could LAG the receive clock by Tvariant. Consequently, the receive FIFO must be sized to accommodate both phase variations.
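As a rough sketch of how such a budget might be totalled, the Python below sums a set of per-source contributions into Tvariant and converts the result into bit times. The 3500ps PLL figure is quoted in this article; every other number is a made-up placeholder, and the conversion to bit times by rounding up is an assumption made for illustration, not the specification's own sizing rule.

```python
import math

# Placeholder budget in picoseconds. Only the PLL figure (3500 ps) is quoted
# in the article; all other values are invented for illustration.
budget_ps = {
    "reference_clock_distribution_skew": 500,  # placeholder
    "pll_variation": 3500,                     # from the sample timing budget
    "transmitter_and_link_transfer": 200,      # placeholder
    "receiver_transfer": 200,                  # placeholder
    "dynamic_cross_byte_lane": 300,            # placeholder
}


def t_variant_ps(budget):
    """Worst case: every variant source at its extreme at the same time."""
    return sum(budget.values())


def bit_times_each_direction(t_variant, bit_time):
    """Bit times needed to cover t_variant of lead, and again of lag (assumed rule)."""
    return math.ceil(t_variant / bit_time)


total = t_variant_ps(budget_ps)
print(total, bit_times_each_direction(total, 625))  # 625 ps bit time at 1600 MT/s
```

Because the transmit clock may lead or lag by Tvariant, the same number of bit times has to be available on both sides of the read pointer.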

Reference Clock Distribution Skew

Synchronous clock mode requires that the input reference clocks to the transmitter and receiver be derived from the same time base. The distribution of the reference clock to the transmitter and the receiver results in skew between the two reference clocks. This is due to:

  • differences in the output skew of the clock source, including phase error associated with Spread Spectrum Clocking in the reference clock generator, and the skew associated with the mismatch in the distribution path.

  • differences in the distribution of the clocks to their PLLs due primarily to temperature and voltage changes.

This skew results in phase difference between the Transmit and Receive Clocks and must be included in the Tvariant calculation.

PLL Variation in Transmitter and Receiver

The largest contribution to the overall Tx Clock to Rx Clock variance comes from the PLLs. The PLL constantly adjusts its output frequency as a result of its feedback loop. In addition, voltage and temperature changes add to the possible output clock variation. The sample timing budget included within the specification allows a maximum PLL output phase variation of 3500ps. This represents 1.4 bit times at the 400 MT/s rate (2500ps per bit) and 5.6 bit times at the 1600 MT/s rate (625ps per bit).
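The conversion from picoseconds to bit times is simple arithmetic and can be checked with a short Python sketch. The helper names are illustrative only.

```python
def bit_time_ps(rate_mt_per_s):
    """Bit time in picoseconds: 1e12 ps per second / (rate * 1e6 transfers per second)."""
    return 1_000_000 / rate_mt_per_s


def variation_in_bit_times(variation_ps, rate_mt_per_s):
    """Express a phase variation in units of bit times at the given transfer rate."""
    return variation_ps / bit_time_ps(rate_mt_per_s)


PLL_VARIATION_PS = 3500  # figure from the sample timing budget

for rate in (400, 1600):
    print(rate, "MT/s:", bit_time_ps(rate), "ps per bit,",
          variation_in_bit_times(PLL_VARIATION_PS, rate), "bit times")
```

The same fixed phase variation costs four times as many bit times at 1600 MT/s as at 400 MT/s, which is why faster links need more FIFO separation for the same budget.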

Transmitter and Link Transfer Variation

The transmitter clock error (accumulated over a single bit time), the transmitter PHY, and the interconnect contribute small amounts of phase error into the link transfer clock domain through all of the parameters included in the link transfer timing. This includes noise on the PCB that affects both the clock and data in the same way, causing a minor shift in the frequency or phase of both. (Note that if the noise affected the clock and data differently, it would affect the maximum bit transfer rate due to potential violations of TSU and THD.)

Receiver Transfer Variation

The receiver contributes small amounts of phase error in the received CLKIN due to distribution effects.

Write-to-Read and Read-to-Write Separation

Recall that the FIFO depth must be large enough to store all transmitted data until it has been safely read into the receive clock domain. The separation from the write pointer location where data is written and the read pointer location from which data is read must be large enough to ensure the FIFO location can be read safely into the receive clock domain.

To accommodate this clock variance in this example, the read pointer within the FIFO would need to be separated from the write pointer by 8 entries (or bit times). The following three scenarios are provided to explain the operation of the FIFO and its pointers.

Stage A — the write pointer has progressed from entry 0 to entry 8. Because the separation between the write and read pointer is 8, Rx Clock is prevented from clocking data from the FIFO until the separation reaches 8. At this stage, the separation has just been reached, so Rx Clock clocks data from entry 0, while the Tx Clock Out clocks data into entry 8.

Stage B — the write pointer has progressed to entry 15 and because there is still no phase difference between Tx Clock Out and Rx Clock the separation between the pointers remains at 8. Rx Clock is clocking data from entry 7 as Tx Clock Out is clocking data into entry 15.

Stage C — the write pointer has rolled from entry 15 back to entry 0 while the read pointer has advanced to entry 8. This simply illustrates that the separation is still maintained when the write pointer reaches the end of the FIFO and wraps back to entry 0.
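The three stages can be reproduced with a small Python model. It assumes a 16-entry FIFO (entries 0 through 15, as in the stages above), a write-to-read separation of 8, and no phase difference between the clocks; the function name is illustrative.

```python
DEPTH = 16       # entries 0..15, as in the stages above
SEPARATION = 8   # write-to-read separation used in the example


def pointers_at(tick):
    """Return (write_ptr, read_ptr) at a given tick, counting ticks from 0.

    The read pointer is None until the write pointer has opened up the
    required separation.
    """
    write_ptr = tick % DEPTH
    read_ptr = (tick - SEPARATION) % DEPTH if tick >= SEPARATION else None
    return write_ptr, read_ptr


for label, tick in (("A", 8), ("B", 15), ("C", 16)):
    write_ptr, read_ptr = pointers_at(tick)
    separation = (write_ptr - read_ptr) % DEPTH
    print(f"Stage {label}: write={write_ptr} read={read_ptr} separation={separation}")
```

Taking the difference modulo the FIFO depth is what keeps the separation at 8 when the write pointer wraps from entry 15 back to entry 0.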

Scenario 3: Rx Clock Lags Tx Clock Out

This scenario presents the opposite of the condition illustrated in scenario 2. In this example, the receive clock lags the transmit clock. As in the previous example, the phase difference between the clocks would not be likely to accumulate so quickly.

Stage A — the write pointer has previously traversed all of the entries and is back at entry 0 again, while the read pointer is at entry 8. This scenario focuses on the possibility that Rx Clock lags Tx Clock Out. In this case, the read-to-write separation becomes critical. In stage A this separation is 8.

Stage B — the write pointer has advanced to entry 13, while the read pointer has only advanced to entry 15. The write pointer has moved ahead by 13 entries and the read pointer by only 7 entries, leaving a read-to-write separation of only 2.

Once again, so large a change in clock variance over such a short period of time as illustrated in stage B would not occur. But the example does serve to illustrate that clock variance can accumulate over time and that an appropriately sized FIFO will be able to absorb it without overflow.
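The separations in this scenario can be checked with modular arithmetic. The Python sketch below assumes the 16-entry FIFO of the example and reads "read-to-write separation" as the number of entries the write pointer can still advance before it reaches the entry the read pointer is on; that reading is an interpretation of the text, not a definition taken from the specification.

```python
DEPTH = 16  # entries 0..15, as in the example


def write_to_read(write_ptr, read_ptr):
    """Entries by which the write pointer is ahead of the read pointer."""
    return (write_ptr - read_ptr) % DEPTH


def read_to_write(write_ptr, read_ptr):
    """Entries the write pointer can advance before catching the read pointer."""
    return (read_ptr - write_ptr) % DEPTH


# Stage A: write pointer at entry 0, read pointer at entry 8.
print(read_to_write(0, 8))
# Stage B: write pointer at entry 13, read pointer at entry 15.
print(read_to_write(13, 15))
```

In stage B the two separations are 14 and 2. They always sum to the FIFO depth, so every entry gained on one side is lost on the other.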

Pseudo-Synchronous Clock Mode

In pseudo-synchronous mode, both Rx Clk in the receiver device and Tx Clk in the transmitter device are generated from the same time base clock just as in the synchronous mode case. During initialization, software configures each link to the maximum common frequency based on the values reported in each device's frequency capability register. The highest frequency supported by both devices is loaded into the Link Frequency register of each device. This value defines the highest frequency that both devices can use when sending packets over the link. In synchronous implementations this would be the exact frequency used by both devices. However, a device implementing pseudo-synchronous mode may arbitrarily lower the transmit clock frequency (Tx Clk or Tx Clock Out) below that specified by the Link Frequency register. Note that the receiver clock (Rx Clk) still runs at the frequency specified by the Link Frequency register.
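The frequency selection described above can be sketched as follows. The sketch models each device's frequency capability as a plain set of frequencies in MHz, which is a simplification chosen for illustration; the real register encoding, the example frequency values, and the function names are all assumptions.

```python
def max_common_frequency(caps_a, caps_b):
    """Highest frequency (MHz) that both devices report as supported."""
    common = set(caps_a) & set(caps_b)
    if not common:
        raise ValueError("devices share no common link frequency")
    return max(common)


def tx_frequency_allowed(tx_mhz, link_frequency_mhz):
    """A pseudo-synchronous transmitter may run at or below the Link Frequency value."""
    return tx_mhz <= link_frequency_mhz


# Hypothetical capability sets for two devices on one link.
device_a = {200, 400, 600, 800}
device_b = {200, 400, 500, 600}

link_frequency = max_common_frequency(device_a, device_b)
print(link_frequency)                             # value loaded into both Link Frequency registers
print(tx_frequency_allowed(400, link_frequency))  # a lowered transmit clock
```

In this model the receiver keeps running at the Link Frequency value while the transmitter is free to choose anything at or below it.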
