Name: Efficient Task Graph Scheduling for Parallel QR Factorization in SLSQP
Keywords: FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)

descriptionPublicationkeyboard_double_arrow_right Part of book or chapter of book , Article , Preprint 22 Aug 2025Embargo end date: 01 Jan 2025 English Publisher:Springer Nature Switzerland

Authors: Chatterjee, Soumyajit; Utkoor, Rahul; Eshwar, Uppu; Peri, Sathya; Nandivada, V. Krishna;

doi: 10.1007/978-3-031-99872-0_15 , 10.48550/arxiv.2506.09463

arXiv: 2506.09463

Efficient Task Graph Scheduling for Parallel QR Factorization in SLSQP

- Summary
- Subjects
- Metrics

Abstract

Efficient task scheduling is paramount in parallel programming on multi-core architectures, where tasks are fundamental computational units. QR factorization is a critical sub-routine in Sequential Least Squares Quadratic Programming (SLSQP) for solving non-linear programming (NLP) problems. QR factorization decomposes a matrix into an orthogonal matrix Q and an upper triangular matrix R, which are essential for solving systems of linear equations arising from optimization problems. SLSQP uses an in-place version of QR factorization, which requires storing intermediate results for the next steps of the algorithm. Although DAG-based approaches for QR factorization are prevalent in the literature, they often lack control over the intermediate kernel results, providing only the final output matrices Q and R. This limitation is particularly challenging in SLSQP, where intermediate results of QR factorization are crucial for back-substitution logic at each iteration. Our work introduces novel scheduling techniques using a two-queue approach to execute the QR factorization kernel effectively. This approach, implemented in high-level C++ programming language, facilitates compiler optimizations and allows storing intermediate results required by back-substitution logic. Empirical evaluations demonstrate substantial performance gains, including a 10x improvement over the sequential QR version of the SLSQP algorithm.

Related Organizations

Indian Institute of Technology Madras
India

Keywords

FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	0
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

Average

Green