PCI Express reflective memory
Multicast made easy
Dolphin's PCI Express reflective memory (also called reflected memory or multicast) solution creates a reflective memory address space between nodes interconnected with PCI Express over cables or backplanes. The solution offers significantly higher performance at a much lower cost than other reflective memory solutions and is available with both low-cost copper and long-distance fiber interconnects. Mixing fiber and copper is fully supported.
Broadcast implemented in hardware
Reflective memory systems (in computer literature also referred to as mirror memory systems, replicated shared memory, multicast or replicated memory systems) implement transparent and automatic updates of remote shared memory areas. Reflective memory is typically mapped into an embedded system application and enables similar applications on other nodes to share updated data without involving any traditional networking protocol or its overhead. Data of any size is transmitted to all nodes directly by functionality implemented in hardware.
Typical applications range from a two-node failover pair to large distributed shared memory (DSM) applications such as aircraft, ship and submarine simulators, automated testing systems, industrial automation and high-speed data acquisition. Because of their inherent replication, reflective memory systems are especially well suited to fault-tolerant designs.
Dolphin PCI Express hardware
The reflective memory functionality is implemented in the Dolphin PCI Express IX switches. Switches can be cascaded to build larger systems. A two-node system can be built without a switch by using regular unicast instead (data is written only to a single remote memory location, with no local update). Every host needs a Dolphin Express IX adapter card installed. All nodes can write to the replicated memory simultaneously.
Benefits provided by PCI Express multicast
The PCI Express based reflective memory solution provides significant benefits over alternative solutions:
- Data in main memory: The Dolphin Express IX reflective memory solution uses main memory to store data. This has several significant benefits:
  - Reading data from main memory is significantly faster than reading from specialized PCIe device memory located in the computer's I/O system.
  - Main memory is cached: the solution benefits from the standard CPU cache when reading data. Reflective memory updates from remote nodes automatically invalidate the CPU cache and ensure full data consistency.
  - Specialized device memory is normally very expensive compared with main memory modules.
  - You don't need to specify the reflective memory size when buying hardware. The size of Dolphin Express IX reflective memory is user configurable: it is a property set by the application during initialization of the system.
- Data is multicast by a centralized switch:
  - Each IXS600 switch sends data out on all connected ports simultaneously. This means that all nodes connected to a single switch receive the data virtually simultaneously. When multiple switches are used, each switch hop adds less than 200 nanoseconds of delay to the distribution of the data.
  - Alternative solutions that use a ring topology to distribute data have significant delays between the time the first node and the last node in the network receive the data. Each node typically introduces a fixed delay, so the total delay in the network grows with the number of nodes.
  - The minimal delay introduced by Dolphin Express IX reflective memory gives real-time applications a significantly reduced total communication time, allowing the application to run at a faster simulation frequency or spend more time on computation.
- Dead nodes or unplugged cables will not stop the entire network; all nodes that remain connected to the network can continue to communicate without interruption.
- Hardware based CRC and retransmission. PCI Express implements reliable data transmission by calculating a CRC for every data packet. Correctable link errors automatically cause a hardware retransmit.
- Fair arbitration and sharing of bandwidth. Hard real-time systems should normally be configured to avoid narrow bottlenecks in the network. PCI Express uses a fair, round-robin allocation of resources and provides very deterministic data transmission even under maximum load.
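The difference between switch-based and ring-based distribution described in the list above can be put into numbers with a small model. The following is only a sketch: the 200 ns figure is the per-hop upper bound quoted above, while the function names and the per-node ring delay are hypothetical parameters that would have to come from the datasheet of the ring product being compared.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound per cascaded switch hop, taken from the text above. */
#define SWITCH_HOP_BOUND_NS 200u

/* Upper bound on the delay added by cascading switches.
 * extra_hops = number of additional switch hops beyond the first switch. */
uint64_t switch_added_delay_bound_ns(uint32_t extra_hops)
{
    return (uint64_t)extra_hops * SWITCH_HOP_BOUND_NS;
}

/* Ring model (assumption): the data passes the receivers one after the
 * other, and every receiver adds a fixed forwarding delay. The time between
 * the first and the last receiver seeing the data is then
 * (receivers - 1) * per_node_delay_ns. */
uint64_t ring_first_to_last_delay_ns(uint32_t receivers,
                                     uint64_t per_node_delay_ns)
{
    assert(receivers >= 1);
    return (uint64_t)(receivers - 1) * per_node_delay_ns;
}
```

With an assumed forwarding delay of 500 ns per node, `ring_first_to_last_delay_ns(7, 500)` gives 3000 ns for seven receivers, whereas one additional switch hop is bounded by `switch_added_delay_bound_ns(1)`, that is 200 ns, regardless of the number of nodes attached to each switch.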
Significantly faster than alternative solutions
The Dolphin IXH adapter comes with an x8 PCI Express link, enabling customer applications to take advantage of the exceptional 40 Gb/s link bandwidth.
Figure: Dolphin PCI Express IX Multicast throughput.
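The 40 Gb/s figure is the raw signalling rate of an x8 Gen2 link. A rough sketch of how that relates to a data rate in MBytes/s is shown below. It assumes the standard PCI Express Gen2 parameters (5 GT/s per lane, 8b/10b line coding) and that one MByte is 10^6 bytes. The function names are made up for this illustration, and packet header and flow control overhead are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* PCI Express Gen2 signals at 5 GT/s (5000 MT/s) per lane. */
#define GEN2_MTRANSFERS_PER_LANE 5000u

/* Raw link rate in Mbit/s: one bit per transfer per lane. */
uint32_t gen2_raw_mbit_per_s(uint32_t lanes)
{
    assert(lanes > 0 && lanes <= 32);
    return lanes * GEN2_MTRANSFERS_PER_LANE;
}

/* Ceiling for data in MBytes/s after 8b/10b coding (8 data bits are sent
 * as 10 line bits), before any packet header or flow control overhead. */
uint32_t gen2_data_ceiling_mbyte_per_s(uint32_t lanes)
{
    uint32_t data_mbit = gen2_raw_mbit_per_s(lanes) / 10u * 8u;
    return data_mbit / 8u;
}

/* Measured throughput as a whole-number percentage of that ceiling. */
uint32_t link_efficiency_percent(uint32_t measured_mbyte_per_s, uint32_t lanes)
{
    return measured_mbyte_per_s * 100u / gen2_data_ceiling_mbyte_per_s(lanes);
}
```

For an x8 link this gives 40000 Mbit/s raw and a ceiling of 4000 MBytes/s. A measured throughput of 2650 MBytes/s, as reported for the IXH610 in the table on this page, corresponds to roughly 66% of that ceiling under these assumptions.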
The measured performance of Dolphin PCI Express reflective memory and of alternative products is listed in the table below.
|Adapter|Bandwidth|One-way latency|Reference|
|Dolphin IXH610 x8 - Gen2 - 3 nodes|2650 MBytes/s|0.99 µs|Dolphin Lab|
|Dolphin IXH610 x8 - Gen2 - 8 nodes|2650 MBytes/s|1.27 µs|Dolphin Lab|
|GE Fanuc PCI-5579|13.4 MBytes/s|N/A|www.gefanuc.com|
|GE Fanuc PCIE-5565RC|170 MBytes/s|N/A|www.gefanuc.com|
The latency above is half the round-trip time measured by two reflective memory applications running a ping-pong test.
Transmitting data to Dolphin Reflective Memory
Data can be transmitted to the reflective memory buffers using different techniques:
PIO data transmission
Data can be sent to reflective memory using one or more CPU posted write instructions. Using SISCI, applications can call the standard memcpy() with the reflective memory as the target, or do a regular pointer assignment, to transmit data. The fully hardware-based, memory-mapped data transmission does not rely on any operating system service or kernel driver functionality and provides the best possible deterministic data transmission latency and jitter.
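The two PIO styles mentioned above, memcpy() and plain pointer assignment, can be sketched as follows. This is an illustration only: the region layout, the names `rm_region_t`, `rm_map_for_demo` and `rm_publish`, and the sequence counter convention are hypothetical, and an ordinary static buffer stands in for the mapped reflective memory so that the sketch compiles without Dolphin hardware. In a real application the pointer would come from the SISCI segment mapping calls, which are not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of a small shared region. */
typedef struct {
    uint8_t payload[64];          /* application data                  */
    volatile uint32_t sequence;   /* bumped last, after the payload    */
} rm_region_t;

/* Stand-in for mapped reflective memory (assumption for this sketch). */
static rm_region_t demo_region;

rm_region_t *rm_map_for_demo(void)
{
    return &demo_region;
}

/* Publish one update: bulk data via memcpy(), then a single word via
 * pointer assignment so readers can tell that a new update is complete. */
void rm_publish(rm_region_t *rm, const void *data, size_t len)
{
    assert(rm != NULL && data != NULL);
    assert(len <= sizeof rm->payload);

    memcpy(rm->payload, data, len);             /* PIO style 1: memcpy   */
    atomic_thread_fence(memory_order_release);  /* keep payload first    */
    rm->sequence = rm->sequence + 1;            /* PIO style 2: assign   */
}
```

A reader on another node would poll `sequence` and read `payload` once the counter changes. Whether an additional store fence or flush is required on a given platform depends on how the memory is mapped, so the fence above should be treated as a conservative placeholder.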
DMA data transmission
The Dolphin Express IX adapter card includes an efficient scatter/gather DMA engine that can be used to send small or large amounts of data to reflective memory. This functionality is available with the DIS 4.4.4 or newer software release from Dolphin.
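To illustrate what scatter/gather means in this context, the sketch below emulates in software what such an engine does in hardware: it walks a list of separate source buffers and writes them back to back into one destination. The descriptor type `sg_entry_t` and the function `sg_gather` are invented for this illustration and are not part of the Dolphin software; the real engine is programmed through the SISCI DMA functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: one contiguous piece of source data. */
typedef struct {
    const void *src;
    size_t len;
} sg_entry_t;

/* Copy all listed pieces back to back into dst.
 * Returns the number of bytes written, or 0 if the pieces do not fit
 * into dst_cap bytes (in which case nothing is written). */
size_t sg_gather(void *dst, size_t dst_cap,
                 const sg_entry_t *list, size_t count)
{
    assert(dst != NULL);
    assert(list != NULL || count == 0);

    /* First pass: check that everything fits, without overflowing. */
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (list[i].len > dst_cap - total)
            return 0;
        total += list[i].len;
    }

    /* Second pass: copy each piece to its place in the destination. */
    uint8_t *out = dst;
    for (size_t i = 0; i < count; i++) {
        memcpy(out, list[i].src, list[i].len);
        out += list[i].len;
    }
    return total;
}
```

The benefit of doing this in a DMA engine rather than on the CPU is that the processor only builds the descriptor list and is then free for computation while the data is moved.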
Directly from PCIe device
Application programmers can use the SISCI API to configure and enable GPUs, FPGAs and other PCIe master devices to send data directly to reflective memory, avoiding the need to first store the data in local memory.
Deterministic data communication
The fully hardware-based, memory-mapped data transmission does not rely on any operating system service or kernel driver functionality. The solution comes with an extensive software library that makes configuration and setup easy, but this software is not in active use during application runtime.
For hard real-time applications using Dolphin Express reflective memory, Dolphin recommends a real-time platform such as RTX (Windows based) from IntervalZero or RedHawk Linux from Concurrent, or another real-time operating system or CPU shielding technique.
The interconnect is highly reliable and is based on the standard PCI Express protocols, including a hardware-based 32-bit CRC check. The software library includes functionality to detect PCI Express protocol failures, remote power failures, disconnected cables and similar faults, which is needed to implement a 100% reliable system for rugged, unfriendly environments.
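For readers unfamiliar with how a 32-bit CRC detects corrupted packets, here is a minimal software sketch. It uses the widely deployed CRC-32 parameters from IEEE 802.3 (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). PCI Express performs its link CRC entirely in hardware with its own bit ordering and framing, so this code illustrates the principle only and is not a model of the adapter. The function names are chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3 parameters, reflected form). */
uint32_t crc32_ieee(const void *data, size_t len)
{
    assert(data != NULL || len == 0);

    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Receiver-side check: recompute the CRC and compare with the one sent.
 * A mismatch is what would trigger a retransmission. */
int packet_is_intact(const void *data, size_t len, uint32_t sent_crc)
{
    return crc32_ieee(data, len) == sent_crc;
}
```

The standard check value for these parameters is 0xCBF43926 for the nine ASCII characters "123456789". Flipping any single bit of the data changes the CRC, so `packet_is_intact` returns 0 for the damaged copy.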
The Dolphin Reflective Memory solution is included in the SISCI Developers Kit. The reflective memory functionality is available for Linux, RTX and Windows. VxWorks support is optional; please contact Dolphin if your project needs it.
Please e-mail firstname.lastname@example.org for more information and to get the reflective memory enabled drivers.
Additional information can be found in the Dolphin Express Reflective Memory whitepaper.