
arXiv: 1812.00904
This paper presents an efficient technique for matrix-vector and vector-transpose-matrix multiplication in distributed-memory parallel computing environments, where the matrices are unstructured, sparse, and have a substantially larger number of columns than rows or vice versa. Our method allows for parallel I/O, does not require extensive preprocessing, and has the same communication complexity as matrix-vector multiplies with column or row partitioning. Our implementation of the method uses MPI. We partition the matrix by individual nonzero elements, rather than by row or column, and use an "overlapped" vector representation that is matched to the matrix. The transpose multiplies use matrix-specific MPI communicators and reductions that we show can be set up in an efficient manner. The proposed technique achieves a good work per processor balance even if some of the columns are dense, while keeping communication costs relatively low.
8 pages, IEEE format
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Mathematical Software, Distributed, Parallel, and Cluster Computing (cs.DC), Mathematical Software (cs.MS)
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Mathematical Software, Distributed, Parallel, and Cluster Computing (cs.DC), Mathematical Software (cs.MS)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
