
Katajainen and Pasanen (1992) proved that stable 0-1 sorting can be done in O(n) time using O(1) extra space. However, the inter-block sorting procedure they rely on (Algorithm B of Munro et al., 1990) is notoriously complex, and the space analysis of Algorithm C leaves the counter packing argument implicit. I present a new algorithm that replaces the complex block-level structure with a simple iterative scheme: block size grows geometrically across two phases (first using a single word for counter packing, then using the extracted buffer), each iteration homogenizing blocks into pure-0 or pure-1 blocks, sorting them by type, and merging. The control flow is straightforward, and all space accounting is explicit and implementable.
