PCI Express Reflective Memory
Multicast made easy
Dolphin's PCI Express reflective / reflected memory / multicast solution creates a reflective memory address space between nodes interconnected with PCI Express over cable or backplanes. This solution offers significantly higher performance at a much lower cost than other reflective memory solutions.
Broadcast implemented in hardware
Reflective memory systems (in computer literature also referred to as mirror memory systems, replicated shared memory, multicast or replicated memory systems) implement transparent and automatic updates of remote shared memory areas. Reflective memory is typically mapped into an embedded system application and enables similar applications on other nodes to share updated data without involving any traditional networking protocol and overhead. Data of any size is transmitted to all nodes directly by functionality implemented in hardware.
Typical applications range from a two-node failover pair to large DSM applications such as aircraft, ship and submarine simulators, automated testing systems, industrial automation and high-speed data acquisition. Because of their inherent replication, reflective memory systems are especially well suited to fault-tolerant designs.
Dolphin PCI Express hardware
The multicast functionality was introduced by the PCI-SIG PCI Express Base Specification 2.1 and is available with most modern PCI Express chipsets. It is available with all Dolphin PCI Express NTB products built around a central PCI Express switch. You can also build a two-node system without a switch and use regular unicast (data is written only to a single remote memory location, with no local update). Every host needs a Dolphin Express NTB-enabled adapter card installed. All nodes can write to the replicated memory simultaneously. Scalability depends on the PCIe technology used:
MXS824 PCIe switch
PXH830 or MXH830 adapter cards can be used with MXS824 switches to build large scalable reflective memory systems. Please contact Dolphin for the latest scalability and configuration information.
IXS600 PCIe Switch
Using the IXH610 and IXH620 adapter cards with up to 9 IXS600 switches, you can scale to 56 nodes. Using the PXH810 card and two IXS600 switches, you can scale to 14 nodes.
Benefits provided by PCI Express multicast
The PCI Express based reflective memory solutions provide significant benefits over alternative solutions:
- Data in main memory – the PCI Express reflective memory solutions utilize main memory to store data. This has several significant benefits:
  - Reading data in main memory is significantly faster than solutions storing data in specialized PCIe device memory located in the computer IO system.
  - Main memory is cached: This means that the solution will benefit from the standard CPU cache when reading data. Reflective memory updates from remote will automatically invalidate the CPU cache and ensure full data consistency.
  - Specialized device memory is normally very expensive vs. main memory modules.
  - There's no need to specify the reflective memory size when buying hardware. The size of PCI Express reflective memory is user configurable – a property set by the application during initialization of the system.
- Data is multicast by a centralized switch fabric.
  - Each PCI Express switch will send data out on all connected ports simultaneously. This means that all nodes will receive data virtually simultaneously when connected to a single switch. When multiple switches are used, each switch hop will add less than 200 nanoseconds delay to the distribution of the data.
  - Alternative solutions using a ring topology to distribute data have significant delays between when the first and the last node in the network receives the data. Each node will typically introduce a fixed delay; the total delay in the network varies depending on the number of nodes.
  - The minimal delay introduced by PCI Express reflective memory enables real-time applications to benefit from a significantly reduced total communication time – allowing the application to run at a faster simulation frequency or spend more time on computation.
  - Dead nodes or unplugging cables will not stop the entire network; all nodes that remain connected to the network will be able to communicate without interruption.
- Hardware based CRC and retransmission – PCI Express implements reliable data transmission by calculating a CRC for every data packet. Correctable link errors automatically trigger a hardware retransmit.
- Fair arbitration and sharing of bandwidth – Hard real-time systems should normally be configured to avoid narrow bottlenecks in the network. PCI Express uses a fair, round robin allocation of resources and provides a very deterministic data transmission even under maximum load.
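The switch-versus-ring delay argument in the list above can be put into numbers with a small model. This is only a sketch: the 200 ns per-hop bound is the figure quoted above, while the function names and the 500 ns per-node ring delay are hypothetical values chosen for illustration, not measurements of any product.

```python
# Toy delay model. Only SWITCH_HOP_NS comes from the text above;
# everything else is an illustrative assumption.
SWITCH_HOP_NS = 200  # stated upper bound added per extra switch hop


def switch_added_delay_ns(extra_switch_hops: int,
                          hop_ns: float = SWITCH_HOP_NS) -> float:
    """Upper bound on the extra delay for a node that sits behind
    `extra_switch_hops` additional switches (0 = single switch)."""
    if extra_switch_hops < 0:
        raise ValueError("extra_switch_hops must be >= 0")
    return extra_switch_hops * hop_ns


def ring_last_node_delay_ns(nodes: int, per_node_ns: float) -> float:
    """Delay until the last node in a ring has the data: the update
    is forwarded node by node, so it crosses (nodes - 1) links."""
    if nodes < 2:
        raise ValueError("a ring needs at least 2 nodes")
    return (nodes - 1) * per_node_ns


if __name__ == "__main__":
    HYPOTHETICAL_RING_NODE_NS = 500  # assumption, not a vendor figure
    for n in (4, 8, 16):
        ring = ring_last_node_delay_ns(n, HYPOTHETICAL_RING_NODE_NS)
        fabric = switch_added_delay_ns(1)  # one extra switch hop
        print(f"{n} nodes: ring {ring} ns, switched fabric <= {fabric} ns")
```

The point of the model is the shape of the two curves: the ring delay grows with the node count, while the switched delay depends only on the number of switch hops.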
Transmitting data to Dolphin Reflective Memory
Data can be transmitted to the reflective memory buffers using different techniques:
PIO data transmission
Data can be sent to reflective memory using one or more CPU posted write instructions. Using SISCI, applications can call the standard memcpy() with the reflective memory as the target, or use a regular pointer assignment, to transmit data. The fully hardware based memory mapped data transmission does not rely on any operating system service or kernel driver functionality and provides the best possible deterministic data transmission latency and jitter.
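To make the programming model concrete, here is a minimal in-process sketch of what a PIO write to a multicast segment looks like from the application's point of view. It does not use the SISCI API: `ReflectiveSegment`, `store_u32` and `load_u32` are hypothetical names, and the replication that real hardware performs in the PCI Express switch is modelled here by a plain loop over per-node buffers.

```python
import struct


class ReflectiveSegment:
    """Toy stand-in for a multicast segment: one buffer per node.
    Real hardware replicates the write in the switch; this loop only
    models the visible result."""

    def __init__(self, node_count: int, size: int) -> None:
        self.size = size
        self.copies = [bytearray(size) for _ in range(node_count)]

    def write(self, offset: int, data: bytes) -> None:
        """Model a posted write (memcpy-style): the same bytes land
        at the same offset in every node's copy."""
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside segment")
        for copy in self.copies:
            copy[offset:offset + len(data)] = data

    def read(self, node: int, offset: int, length: int) -> bytes:
        """Read from one node's local copy (a plain memory read)."""
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside segment")
        return bytes(self.copies[node][offset:offset + length])


def store_u32(seg: ReflectiveSegment, offset: int, value: int) -> None:
    """Model a single pointer assignment of a 32-bit value."""
    seg.write(offset, struct.pack("<I", value))


def load_u32(seg: ReflectiveSegment, node: int, offset: int) -> int:
    return struct.unpack("<I", seg.read(node, offset, 4))[0]


if __name__ == "__main__":
    seg = ReflectiveSegment(node_count=3, size=64)
    store_u32(seg, 0, 42)                 # "pointer assignment"
    seg.write(16, b"sensor frame")        # "memcpy"
    print([load_u32(seg, n, 0) for n in range(3)])
```

In a real application the writer would hold a mapped pointer obtained through SISCI during setup, and the store itself would involve no library call at all.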
DMA data transmission
The Dolphin Express IX and PX adapter cards include an efficient scatter / gather DMA engine that can be used to send small or large amounts of data to reflective memory. This functionality is available with eXpressWare 4.4.4 or newer.
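For readers unfamiliar with the term, the gather half of scatter / gather can be illustrated with a few lines of code. This is a generic conceptual sketch, not a description of the Dolphin DMA engine or its descriptor format: the `(offset, length)` descriptor layout and the `gather` function are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical descriptor: (offset into source memory, length in bytes).
Descriptor = Tuple[int, int]


def gather(source: bytes, descriptors: List[Descriptor]) -> bytes:
    """Collect several non-contiguous source regions into one
    contiguous transfer, in descriptor order."""
    out = bytearray()
    for offset, length in descriptors:
        if offset < 0 or length < 0 or offset + length > len(source):
            raise ValueError(f"descriptor {(offset, length)} out of range")
        out += source[offset:offset + length]
    return bytes(out)


if __name__ == "__main__":
    local_memory = b"0123456789"
    # Two scattered regions sent as a single contiguous block.
    print(gather(local_memory, [(0, 2), (5, 3)]))
```

A hardware engine walks such a descriptor list on its own, which is why the CPU is free to do other work while the transfer runs.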
Directly from PCIe device
Application programmers can use the SISCI API to configure and enable GPUs, FPGAs and other PCIe master devices to send data directly to reflective memory, avoiding the need to first store the data in local memory.
Deterministic data communication
The fully hardware based memory mapped data transmission does not rely on any operating system service or kernel driver functionality. The solution comes with an extensive software library that makes configuration and setup easy but this software is not in active use during application runtime.
For hard real-time applications built on Dolphin Express reflective memory, Dolphin recommends a real-time platform such as RTX (Windows based) from IntervalZero, RedHawk Linux from Concurrent Real-Time, or another real-time operating system or CPU shielding technique.
The interconnect is highly reliable and based on the standard PCI Express protocols, including a hardware based 32-bit CRC check. The software library includes functionality to detect PCI Express protocol failures, remote power failures, disconnected cables and similar faults, which is needed to implement a 100% reliable system for rugged, hostile environments.
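The CRC-and-retransmit principle can be shown in a short sketch. In PCI Express this happens entirely in link hardware; the code below only models the idea in software. `zlib.crc32` is used as a convenient 32-bit CRC and is not claimed to be bit-exact with the PCI Express link CRC. The framing layout, `send_with_retry` and `make_flaky_channel` are hypothetical names for illustration.

```python
import zlib
from typing import Callable, Tuple


def frame(payload: bytes) -> bytes:
    """Append a 32-bit CRC to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def check(framed: bytes) -> bool:
    """Recompute the CRC on the receiving side and compare."""
    if len(framed) < 4:
        return False
    payload, received_crc = framed[:-4], framed[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == received_crc


def send_with_retry(payload: bytes,
                    channel: Callable[[bytes], bytes],
                    max_attempts: int = 3) -> Tuple[bytes, int]:
    """Send until the receiver's CRC check passes.
    Returns (delivered payload, number of attempts used)."""
    for attempt in range(1, max_attempts + 1):
        received = channel(frame(payload))
        if check(received):
            return received[:-4], attempt
    raise IOError("link error persisted after retransmits")


def make_flaky_channel(corrupt_first_n: int) -> Callable[[bytes], bytes]:
    """Test channel that flips one bit in the first N transmissions."""
    state = {"sent": 0}

    def channel(data: bytes) -> bytes:
        state["sent"] += 1
        if state["sent"] <= corrupt_first_n:
            return bytes([data[0] ^ 0x01]) + data[1:]
        return data

    return channel


if __name__ == "__main__":
    data, attempts = send_with_retry(b"hello", make_flaky_channel(1))
    print(data, attempts)
```

The application never sees the corrupted attempt: it either receives intact data or, after the retry budget is exhausted, an explicit error it can act on.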
The Dolphin Reflective Memory solution is included in the SISCI Developers Kit. The Reflective Memory functionality is available for Linux, VxWorks, RTX and Windows and included in the standard eXpressWare software suite.
Please e-mail email@example.com for more information.
Additional information can be found in the Dolphin Express Reflective Memory white paper.
PCI Express Multicast throughput
| Configuration | Throughput | Latency | Source |
|---|---|---|---|
| PXH810 - 8 nodes | 5314 MBytes/s | 0.99 us | Dolphin Lab |
| IXH610 - 3 nodes | 2650 MBytes/s | 0.99 us | Dolphin Lab |
| IXH610 - 8 nodes | 2650 MBytes/s | 1.27 us | Dolphin Lab |
| GE Fanuc PCI-5579 | 13.4 MBytes/s | NA | www.gefanuc.com |
| GE Fanuc PCIE-5565RC | 170 MBytes/s | NA | www.gefanuc.com |
The latency above is the actual half round trip time measured by two reflective memory applications running a ping pong test.
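The half round trip calculation used in a ping pong test can be sketched as follows. This is not the benchmark that produced the figures above: two Python threads and two in-process flags stand in for two nodes and their reflective memory, so the numbers it prints reflect interpreter and scheduler overhead, not PCI Express latency. All function names are hypothetical.

```python
import threading
import time


def half_round_trip_ns(elapsed_ns: float, round_trips: int) -> float:
    """One-way latency estimate: total time / (2 * round trips)."""
    if round_trips <= 0:
        raise ValueError("round_trips must be positive")
    return elapsed_ns / (2 * round_trips)


def ping_pong_half_rtt_ns(iterations: int = 200) -> float:
    """Run a ping pong between two threads by polling shared flags,
    the way two nodes would poll their reflective memory copies."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    ping = [0]  # written by the initiator, polled by the responder
    pong = [0]  # written by the responder, polled by the initiator

    def responder() -> None:
        for seq in range(1, iterations + 1):
            while ping[0] != seq:
                time.sleep(0)  # yield so the other thread can run
            pong[0] = seq

    worker = threading.Thread(target=responder)
    worker.start()
    start = time.perf_counter_ns()
    for seq in range(1, iterations + 1):
        ping[0] = seq
        while pong[0] != seq:
            time.sleep(0)
    elapsed = time.perf_counter_ns() - start
    worker.join()
    return half_round_trip_ns(elapsed, iterations)


if __name__ == "__main__":
    print(f"half round trip: {ping_pong_half_rtt_ns():.0f} ns")
```

Dividing by two assumes both directions take equal time, which is the usual convention when the two endpoints cannot share a clock.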