Tuesday, June 26, 2007

Research on HyperTransport Technology

HT Signals

The HT signals can be grouped into two broad categories:

  • The link signal group — used to transfer packets in both directions (High-Speed Signals).

  • The support signal group — provides required resources such as power and reset, as well as other signals that support optional features such as power management (Low-Speed Signals).


Link Packet Transfer Signals

The high-speed signals used for packet transfer in both directions across an HT link include:

  • CAD (command, address, data). Multiplexed signals that carry control packets (request, response, information) and data packets. Note that the width of the CAD bus is scalable from 2 bits to 32 bits.

  • CLK (clock). Source-synchronous clock for CAD and CTL signals. A separate clock signal is required for each byte lane supported by the link. Thus, the number of CLK signals required is directly proportional to the number of bytes that can be transferred across the link at one time.

  • CTL (control). Indicates whether a control packet or data packet is currently being delivered via the CAD signals.
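
As a rough illustration of how the high-speed signal count grows with link width, the sketch below tallies the signals for one direction of a link. It assumes one CTL signal per direction, one CLK per byte lane (with links narrower than 8 bits still needing one CLK), widths of 2, 4, 8, 16, or 32 bits, and two pins per signal because HT signaling is differential. The function name `link_signals` is made up for this example.

```python
import math

def link_signals(cad_width_bits):
    """Count high-speed signals for ONE direction of an HT link.

    Assumptions for this sketch: one CTL per direction, one CLK per
    byte lane (minimum one), and two pins per differential signal.
    """
    if cad_width_bits not in (2, 4, 8, 16, 32):
        raise ValueError("CAD width must be 2, 4, 8, 16, or 32 bits")
    clk = max(1, math.ceil(cad_width_bits / 8))  # one clock per byte lane
    ctl = 1
    signals = cad_width_bits + ctl + clk
    return {"CAD": cad_width_bits, "CTL": ctl, "CLK": clk,
            "signals": signals, "pins": 2 * signals}

for width in (2, 8, 32):
    print(width, link_signals(width))
```

Under these assumptions, an 8-bit link needs 10 signals (20 pins) per direction, and a 32-bit link needs 37 signals (74 pins) per direction.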


Link Support Signals

The low-speed link support signals consist of power- and initialization-related signals and power management signals. Power- and initialization-related signals include:

  • VLDT & Ground — The 1.2-volt supply that powers HT drivers and receivers.

  • PWROK — Indicates to devices residing in the HT fabric that power and clock are stable.

  • RESET# — Used to reset and initialize the HT interface within devices and perhaps their internal logic (device specific).

  • Power management signals:

    • LDTREQ# — Requests re-enabling links for normal operation.

    • LDTSTOP# — Enables and disables links during system state transitions.


Scalable Performance

The transmit and receive portions of a link (the CAD signals) may have different widths. For example, devices that typically send most of their data to main memory (upstream) and receive limited data from the host can implement a wide path in the high-performance direction and a narrow path in the less-used direction, thereby reducing cost.

The HyperTransport link combines the advantages of both serial and parallel bus architectures. HT provides options for the number of data paths implemented and for the clock rate at which data is transferred, thus providing scalable link performance ranging from 0.2GB/s to 12.8GB/s. This scalability is helpful to system designers. For example:

  • An implementation that needs all the available bandwidth (e.g. a system chipset) can use wide links (up to 32 bits) running at the highest clock frequencies (up to 800MHz now and 1GHz in the future).

  • Implementations that don't require high bandwidth but do require low power may use narrow links (as few as 2 bits) and lower frequencies (down to 200MHz).


HyperTransport lends itself to scaling well because:

  • The high frequency bus translates to fewer pins required to transfer a specific amount of data. The same protocol is used regardless of link width.

  • Differential signaling results in a very low current path to ground, thereby reducing the number of power and ground pins required for devices.

  • Each additional byte lane has its own source-synchronous clock.

  • HT's implementation of ACPI-compliant power management and interrupt signaling is message based, reducing pin count. Note that only two additional signals, LDTSTOP# and LDTREQ#, are required for managing power.


Clock Speeds

HyperTransport clock speeds currently supported are 200MHz, 300MHz, 400MHz, 500MHz, 600MHz, and 800MHz. Note that 700MHz is not supported. Signals are clocked on both the rising and falling edges of the clock, a mechanism referred to as double data rate (DDR) clocking. DDR clocking yields an effective transfer rate that is double the actual clock frequency. In addition, because each link is dual simplex (it carries traffic in both directions at the same time), the aggregate transfer rate of the link is four times the clock rate. At the top supported clock rate:

  • 800MHz clock with DDR = effective clock of 1,600MHz (1.6GTransfers/s)

  • 1.6GTransfers/s x 4 bytes (32-bit link) = 6.4GB/s per direction

  • 6.4GB/s in both directions = 12.8GB/s.
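
The same arithmetic can be written as a small function, which makes it easy to check other width and clock combinations. This is only a sketch: the names `SUPPORTED_CLOCKS_MHZ` and `link_bandwidth_gb_s` are invented for the example, and it assumes both directions have the same width and that 1GB = 10^9 bytes.

```python
# Clock rates listed above as currently supported (700MHz is not).
SUPPORTED_CLOCKS_MHZ = (200, 300, 400, 500, 600, 800)

def link_bandwidth_gb_s(clock_mhz, width_bits, both_directions=True):
    """Peak link bandwidth in GB/s (1GB = 1e9 bytes in this sketch)."""
    if clock_mhz not in SUPPORTED_CLOCKS_MHZ:
        raise ValueError(f"unsupported clock: {clock_mhz}MHz")
    transfers_per_s = clock_mhz * 1e6 * 2   # DDR: two transfers per clock
    bytes_per_transfer = width_bits / 8     # CAD width in bytes
    one_way = transfers_per_s * bytes_per_transfer / 1e9
    # Dual simplex: both directions can be active at the same time.
    return 2 * one_way if both_directions else one_way

print(link_bandwidth_gb_s(800, 32))   # widest link, fastest clock
print(link_bandwidth_gb_s(200, 2))    # narrowest link, slowest clock
```

The two calls correspond to the ends of the 0.2GB/s to 12.8GB/s range quoted earlier. For a link with different widths in each direction, calling the function once per direction with `both_directions=False` and adding the results gives the aggregate figure.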


Based on point-to-point links, a HyperTransport chain may be extended into a fabric, using single and multi-link devices together. Devices defined for HT include:

  • Single HT link "cave" devices used to implement a peripheral function

  • Single or multi-link Bridges (HT-to-HT, or HT to one or more other protocols such as PCI, PCI-X, AGP, or InfiniBand)

  • Multi-link Tunnel devices used to implement a function and extend a link to a neighboring device downstream, thus creating a chain
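
One way to picture how caves and tunnels form a chain is a toy model, sketched below. It is a deliberate simplification: bridges are left out, a tunnel is assumed to have exactly two links, and the names `LINKS` and `validate_chain` are hypothetical. The only rule it encodes is that a single-link cave cannot pass traffic downstream, so it can only sit at the end of a chain.

```python
# Number of HT links per device type in this simplified model (assumption).
LINKS = {"cave": 1, "tunnel": 2}

def validate_chain(devices):
    """Check a list of device kinds, ordered from the host outward.

    Every device except the last must have a second link to extend
    the chain, which in this model means it must be a tunnel.
    """
    for position, kind in enumerate(devices):
        if kind not in LINKS:
            raise ValueError(f"unknown device kind: {kind}")
        is_last = position == len(devices) - 1
        if not is_last and LINKS[kind] < 2:
            return False  # a cave cannot extend the chain downstream
    return True

print(validate_chain(["tunnel", "tunnel", "cave"]))  # cave ends the chain
print(validate_chain(["cave", "tunnel"]))            # cave blocks the tunnel
```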
