The RoX® (Ring Of Switches) Bus Architecture

A high performance and seamless expansion pathway
from Fast Ethernet to Gigabit Ethernet systems

RoX Architectural Objectives:

When Allayer decided to create an entirely new architecture for the Allayer Ethernet switch set, the engineering and marketing team set the following objectives:

  • A scalable inter-chip architecture that could economically address the high performance Ethernet switch market, especially the "pizza box" type of application. A typical configuration of this switch type would be 24 Fast Ethernet ports with 2 Gigabit Ethernet ports.
  • The architecture must address the low cost unmanaged applications as well as managed applications.
  • Support for WAN access
  • Gigabit switch application support
  • Future support for Layer3 switching upgrades
  • Cost effective at all configuration and port sizes
  • Supportive of stackable system designs
  • Technology friendly (in printed circuit design and semiconductor process)

    When the first RoX bus devices were introduced in early 1998, the concept of the RoX bus was quickly proven in the field with numerous systems using the AL100 as the basis of an expandable Fast Ethernet switching family. With the recent introduction of the RoX-compatible AL1000 Gigabit Switching IC, the architecture has now been shown to be up to the rest of the task: it is now a high performance, scalable inter-chip architecture that can support Ethernet, Fast Ethernet and Gigabit Ethernet switching systems in all configurations.

    Architectural Analysis

    Why RoX? A simple analysis of real-world issues in the design of high performance and scalable switching systems shows some of the pluses and minuses of the various commercially available (and soon to be available) approaches.

    Shared Memory

    Shared memory approaches depend upon a central switch engine to provide high-speed interconnection to all ports. The engine must examine each packet to determine its routing. Unfortunately, this approach requires very high memory bandwidth and carries a fairly high overhead even for small systems. Worst of all, as the number of ports increases, the cost of the central memory becomes prohibitive because the memory must get larger and much faster. The disadvantages include:

  • High fixed cost on switching core

    Shared memory switch designs require a large amount of memory. In addition, the switch core must be fast enough to handle additional port traffic as the number of ports increases. Even the initial low-port-count configuration of an expandable system has to include the expensive high-bandwidth memory system to accommodate possible future port increases. This makes the switch core a fairly high fixed cost, so it is not cost effective at a low port count.

  • Memory bandwidth bottlenecks

    The memory bandwidth requirement grows rapidly as the number of ports increases. At a high port count, the bandwidth requirement becomes unreasonably high.

    Thus, because of economic considerations and performance issues, Allayer eliminated the shared memory architecture.
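
    The bandwidth pressure described above can be illustrated with a rough calculation. This is a minimal sketch, not a figure from Allayer: it assumes every frame is written into the shared memory once and read out once, so the memory must sustain twice the sum of the port rates. The function name and the 8-port comparison case are hypothetical.

```python
def shared_memory_bandwidth_mbps(port_rates_mbps):
    """Memory bandwidth needed when every port runs at line rate.

    Assumption (not from the white paper): each frame is written to the
    shared memory once and read once, so the memory must sustain twice
    the sum of the port rates.
    """
    return 2 * sum(port_rates_mbps)


# Hypothetical small system: 8 Fast Ethernet ports
small_box = [100] * 8
# The "pizza box" configuration named in the objectives:
# 24 Fast Ethernet ports plus 2 Gigabit Ethernet ports
pizza_box = [100] * 24 + [1000] * 2

print(shared_memory_bandwidth_mbps(small_box))  # 1600
print(shared_memory_bandwidth_mbps(pizza_box))  # 8800
```

    Under this simplified model, an expandable design has to provision the 8.8 Gbit/s memory system from the start, even if it initially ships with only a few ports.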

    Cross Bar

    Crossbar switching has been a high performance methodology for switching that is widely used in telecom applications. It has very good performance for point-to-point switching (traditional telecom functionality). However, it is not well suited for the more flexible Ethernet switching that includes point-to-point, broadcast, and multicast switching. While it is possible to design specific cross bar ICs that may perform well in a low-port-count switch configuration (8-port switches for example), the drawback is that cross-point is simply not a scalable architecture. When a hybrid cross bar is created using additional cross bar elements to link cross bar switches, this adds cost and complexity and often a new bottleneck at the cross bar link.

    The cross bar makes a direct, point-to-point connection between each connected port. In straightforward port to port unicast transmission, this provides a high performance link. However, in real-life network applications, broadcast or multicast transmissions must be made fairly often, for example when the exact address of the remote location is not known. This fairly typical situation causes severe problems for a pure cross bar design.

    For example, in a four-port system, when Port A is transmitting to Port D, Ports B and C cannot transmit to D. They must wait. If Port A needs to multicast to all ports, it will cause all the other ports to queue up and wait. This situation can seriously compromise potential bandwidth.
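
    The four-port example can be sketched as a toy time-slot model. Everything here is an illustrative assumption rather than a description of any real cross bar IC: each input sends at most one frame per slot, each output accepts at most one frame per slot, a multicast frame needs all of its outputs free in the same slot, and inputs are served in a fixed order. The function name `slots_to_drain` is hypothetical.

```python
from collections import deque


def slots_to_drain(queues):
    """Count the time slots needed to send every queued frame.

    `queues` maps an input port to a list of frames; each frame is the
    set of output ports it must reach. Toy model (an assumption): one
    frame per input per slot, one frame per output per slot, and a
    multicast frame needs all of its outputs free at once.
    """
    pending = {port: deque(frames) for port, frames in queues.items()}
    slots = 0
    while any(pending.values()):
        busy_outputs = set()
        for port in sorted(pending):  # fixed service order
            q = pending[port]
            # Send the head-of-line frame only if none of its outputs is taken
            if q and not (q[0] & busy_outputs):
                busy_outputs |= q.popleft()
        slots += 1
    return slots


# Ports A, B and C all want Port D: B and C must wait their turn
print(slots_to_drain({"A": [{"D"}], "B": [{"D"}], "C": [{"D"}]}))  # 3

# Disjoint unicast pairs go through together
print(slots_to_drain({"A": [{"B"}], "C": [{"D"}]}))  # 1

# A multicasts to all other ports, so B and C are held up for a slot
print(slots_to_drain({"A": [{"B", "C", "D"}], "B": [{"C"}], "C": [{"D"}]}))  # 2
```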

    The disadvantages of cross bar designs can be summarized as follows:

  • Good performance on unicast but high complexity for multicast/broadcast
  • Complex implementation for asymmetric speed links
  • A cross bar is a fixed cost and is not easily scalable

    Hybrid Cross Bar

    To overcome the disadvantages of the cross bar, some hybrid cross bar strategies have been proposed. Such hybrid implementations use one link of the cross bar to connect it to another cross bar. However, just like using a switch port to link to another switch, the bottleneck becomes the interconnect link. This type of stacking actually eliminates the performance advantages of cross bar switching and, in fact, cannot really be called true switch stacking.

    When a fast 1 Gbps cross bar is connected to another fast cross bar, the bottleneck becomes the link between cross bars. If there is multicast or broadcast traffic from Switch A, for example, the link will have to carry traffic from A to all of the other connected switches. Traffic will quickly back up at the inter-cross-bar link.
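
    A back-of-the-envelope calculation shows how quickly the link saturates. The numbers are purely illustrative assumptions: an 8-port, 1 Gbps cross bar that gives up one port for the link (leaving 7 user ports), with half of the traffic destined for the other cross bar. The function name is hypothetical.

```python
def link_utilization(user_ports, port_rate_gbps, remote_fraction, link_rate_gbps):
    """Offered load on the inter-cross-bar link divided by its capacity.

    Hypothetical model: every user port sends at line rate and
    `remote_fraction` of that traffic is destined for the other cross bar.
    A result above 1.0 means the link is oversubscribed.
    """
    offered_gbps = user_ports * port_rate_gbps * remote_fraction
    return offered_gbps / link_rate_gbps


# 7 user ports at 1 Gbps, half of the traffic crossing a single 1 Gbps link
print(link_utilization(7, 1.0, 0.5, 1.0))  # 3.5
```

    In this scenario the link is offered 3.5 times the traffic it can carry, which is the backup the paragraph above describes.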

    Because of the high cost, degraded performance in supporting typical Ethernet broadcast/multicast traffic, and its limitations in stacking applications, Allayer chose not to use cross bar or hybrid cross bar switching.

    The Allayer Approach

    Instead of cross bar or hybrid cross bar, taking a page from high performance processor architectures and various high-end ring networking architectures, Allayer decided to create a unique architecture that incorporates the benefits of device standardization, easy scalability, and high performance in the real-world application.

    It is called the RoX (Ring of Switches) bus architecture. It supports up to four switch engines per ring and allows the use of mixed speed switches in the same ring. Expansion beyond four engines is made possible by using the 1.2 Gigabit AL1000 Switching IC as a link between multiple rings.

    A ring architecture aggregates bandwidth by nature. As ports are added, bandwidth is also added. For example, two RoX switch chips have a 4.8 Gbps bandwidth while four RoX switch chips have a 9.6 Gbps bandwidth.

    With the RoX bus as a switching fabric, the Allayer Switch ICs (including the AL100, AL101, AL300 and AL1000) can be used to configure a wide range of high performance Fast Ethernet, Gigabit Ethernet and hybrid switches in both managed and unmanaged configurations.

    RoX Bus Overview

    The RoX bus is a 32-bit parallel (64-bit in and out per chip) interface with a 75 MHz clock. The data rate is 2.4 Gbit/s in each direction (in and out of each switch IC) and 4.8 Gbit/s in both directions. With four chips, the total aggregated bandwidth is 9.6 Gbit/s. Each of the devices on the ring has its own switch engine. A separate control bus handles status, routing, and flow control information, freeing up the data bus for data transmission.
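
    The figures above follow directly from the bus width and clock rate. The short calculation below reproduces them; the constant and function names are hypothetical, and the aggregate figure assumes that each chip on the ring contributes one 2.4 Gbit/s segment.

```python
BUS_WIDTH_BITS = 32      # data path width in each direction (from the text)
CLOCK_HZ = 75_000_000    # 75 MHz bus clock (from the text)


def per_direction_mbps(width_bits=BUS_WIDTH_BITS, clock_hz=CLOCK_HZ):
    """Raw data rate of one direction of the bus, in Mbit/s."""
    return width_bits * clock_hz // 1_000_000


def per_chip_both_directions_mbps():
    """Data rate in plus data rate out for a single switch IC."""
    return 2 * per_direction_mbps()


def ring_aggregate_mbps(num_chips):
    """Aggregate ring bandwidth, assuming one 2.4 Gbit/s segment per chip."""
    return num_chips * per_direction_mbps()


print(per_direction_mbps())             # 2400
print(per_chip_both_directions_mbps())  # 4800
print(ring_aggregate_mbps(4))           # 9600
```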

    One major advantage of the ring architecture is economic. A bus architecture is much less expensive than separate cross bar elements. Buses are easy to implement and inexpensive to expand.

    Bandwidth Aggregates Resulting in Higher Performance

    Just as important is the performance gained by aggregating bandwidth through the ring. As a system is expanded, the ring bandwidth actually aggregates. With one AL100 switch IC, the available bandwidth is two times 2.4 Gbit/s, or 4.8 Gbit/s. The overall aggregated bandwidth of a four-chip ring is 9.6 Gbit/s. Note that not all traffic goes to the ring. Most traffic will use only one or two nodes of the ring bandwidth, which gives the ring a higher effective traffic bandwidth. Another important benefit of the ring is its inherent ability to handle broadcast traffic well. Broadcast traffic goes on the ring once without being duplicated for each port as in a crossbar architecture. This makes the ring very efficient at handling broadcast-intensive LAN traffic.
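
    Both points can be made concrete with a small sketch. It is a simplified model under stated assumptions, not a description of the actual RoX protocol: the ring is assumed to carry data in one direction only, node numbering is arbitrary, and the crossbar is assumed to send one copy of a broadcast frame per destination port. The function names are hypothetical.

```python
def ring_segments_used(src, dst, num_nodes):
    """Ring segments a unicast frame crosses on a one-way ring.

    Assumption: data flows in a single direction, from node i to node
    (i + 1) % num_nodes. Segments the frame does not cross remain free
    for other traffic during that transfer.
    """
    return (dst - src) % num_nodes


def source_transmissions(num_destinations, fabric):
    """How many times the source must send one broadcast frame.

    Simplified model following the text: the ring carries the frame
    once, while a pure crossbar duplicates it for each destination port.
    """
    if fabric == "ring":
        return 1
    if fabric == "crossbar":
        return num_destinations
    raise ValueError(f"unknown fabric: {fabric}")


# Four-node ring: a transfer to the next node uses one of the four segments
print(ring_segments_used(0, 1, 4))  # 1
print(ring_segments_used(3, 0, 4))  # 1
print(ring_segments_used(0, 3, 4))  # 3

# Broadcast from one port of a 26-port switch to the other 25 ports
print(source_transmissions(25, "ring"))      # 1
print(source_transmissions(25, "crossbar"))  # 25
```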

    Conclusion

    The Allayer RoX bus is a very high performance Ethernet switch architecture. It is the most flexible and most economical switching architecture available and it provides many system level benefits without sacrificing scalability. Among its benefits are:

  • It provides a field-proven architecture for implementing a wide variety of managed and unmanaged switches.
  • It supports low-cost stackable switches with no performance degradation.
  • It permits the upgrade of existing systems to Gigabit Ethernet systems.
  • It includes options for interfacing to proprietary ASIC solutions.
  • It supports future Layer 3 switching capabilities.
  • It supports ATM/ADSL links.
  • It is available today!


    Allayer Communications, 107 Bonaventura Drive, San Jose, CA 95134
    408-570-0888  Email webmaster at:
    webmaster@allayer.com
    Trademarks/Copyrights © 2000 Allayer Communications