
It is well established that reliable multicast enables consistency protocols, including Byzantine Fault Tolerant protocols, for distributed systems. However, no transport-layer reliable multicast is in use today because of limitations in existing switch fabrics and transport-layer protocols. In this paper, we introduce a layer-4 (L4) transport based on remote direct memory access (RDMA) datagrams to achieve reliable multicast over a shared optical medium. A cluster of networking nodes is connected by a passive optical cross-connect fabric enhanced with wavelength division multiplexing, so that every message is broadcast to all nodes. This mechanism allows consistency in a distributed system to be maintained at a low latency cost. By further using RDMA datagrams as the L4 protocol, we have achieved a message loss ratio low enough (better than one in 68 billion) to make a simple Negative Acknowledgement (NACK)-based L4 multicast practical to deploy. To our knowledge, this is the first multicast architecture to demonstrate such a low message loss ratio. Furthermore, with this reliable multicast transport, end-to-end latencies of eight microseconds or less (< 8 µs) have been routinely achieved using an enhanced software RDMA implementation on a variety of commodity 10G Ethernet network adapters.
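To illustrate why a very low loss ratio makes a NACK-based scheme attractive, the following is a minimal sketch of receiver-side gap detection, assuming each sender stamps its messages with a consecutive sequence number. The class `NackReceiver` and its method `on_message` are hypothetical names chosen for illustration; they do not come from the paper, whose actual wire format and recovery logic are not described in this abstract.

```python
from dataclasses import dataclass, field


@dataclass
class NackReceiver:
    """Tracks one sender's sequence numbers and reports gaps (hypothetical helper)."""
    next_expected: int = 0
    missing: set = field(default_factory=set)

    def on_message(self, seq: int) -> list:
        """Process a received sequence number; return the sequence numbers to NACK."""
        if seq < self.next_expected:
            # A retransmission of a message that was missing, or a duplicate.
            self.missing.discard(seq)
            return []
        # Every number between next_expected and seq was skipped: request those again.
        gap = list(range(self.next_expected, seq))
        self.missing.update(gap)
        self.next_expected = seq + 1
        return gap


# Messages 2 and 3 are lost at first, then arrive as retransmissions.
receiver = NackReceiver()
nacks = [receiver.on_message(s) for s in (0, 1, 4, 2, 3, 5)]
```

In this sketch the receiver stays silent while messages arrive in order and sends feedback only when a gap appears, so the control traffic scales with the loss ratio rather than with the message rate.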
