software . 2020

JAMPI: Efficient matrix multiplication in Spark using Barrier Execution Mode

Földi Tamás; Nicolas A Perez; Chris von Csefalvay;
Open Source
  • Published: 25 Jun 2020
  • Publisher: Zenodo
Abstract
The new barrier mode in Apache Spark allows embedding distributed deep learning training as a Spark stage to simplify the distributed training workflow. In Spark, a task in a stage doesn’t depend on any other task in the same stage, and hence it can be scheduled independently. However, several algorithms require more sophisticated inter-task communication, similar to the MPI paradigm. By combining distributed message passing (using asynchronous network IO), OpenJDK's new auto-vectorization and Spark's barrier execution mode, we can add non-map/reduce based algorithms, such as Cannon's distributed matrix multiplication, to Spark. We document an...
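
For readers unfamiliar with Cannon's algorithm mentioned in the abstract, the following is a minimal single-process sketch in Python. It is not the JAMPI implementation and uses neither Spark nor message passing: the q x q grid of "tasks" is simulated with nested lists, and the block shifts that a distributed version would perform as messages between tasks are plain list rotations here. All function names (`split`, `join`, `cannon`, and so on) are hypothetical and chosen for illustration only.

```python
def matmul(x, y):
    """Dense product of two small matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def split(m, q):
    """Cut an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    s = len(m) // q
    return [[[row[j * s:(j + 1) * s] for row in m[i * s:(i + 1) * s]]
             for j in range(q)] for i in range(q)]


def join(blocks):
    """Reassemble a grid of blocks into one matrix."""
    out = []
    for block_row in blocks:
        for r in range(len(block_row[0])):
            out.append([v for blk in block_row for v in blk[r]])
    return out


def cannon(a, b, q):
    """Multiply square matrices a and b on a simulated q x q task grid."""
    n = len(a)
    assert n % q == 0, "matrix size must be divisible by the grid size"
    s = n // q
    A, B = split(a, q), split(b, q)

    # Initial skew: row i of A moves left by i, column j of B moves up by j.
    A = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    B = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]
    C = [[[[0] * s for _ in range(s)] for _ in range(q)] for _ in range(q)]

    for _ in range(q):
        # Each simulated task multiplies the two blocks it currently holds.
        C = [[add(C[i][j], matmul(A[i][j], B[i][j])) for j in range(q)]
             for i in range(q)]
        # Shift A one step left and B one step up. In a distributed setting
        # this is the inter-task communication step, and every task must
        # finish it before the next round starts.
        A = [[A[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        B = [[B[(i + 1) % q][j] for j in range(q)] for i in range(q)]

    return join(C)
```

The shift step is the part that does not fit the independent-task model described above: each task needs blocks held by its neighbours in every round, which is why all tasks have to be running at the same time.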