A Performance Study of Two-Phase I/O

Phillip M. Dickens and Rajeev Thakur

Abstract
Massively parallel computers are increasingly being used to run large, I/O-intensive applications in many different fields. For such applications, the I/O subsystem represents a significant obstacle in the way of achieving good performance. While massively parallel architectures do, in general, provide a parallel I/O subsystem, this is not sufficient to guarantee good performance. The problem stems from the fact that in many applications each processor initiates many small I/O requests rather than fewer, larger requests, resulting in significant performance penalties due to the high latency associated with I/O. However, it is often the case that, in the aggregate, the I/O requests are significantly fewer and larger. Two-phase I/O is a technique that captures and exploits this aggregate information to recombine I/O requests such that fewer and larger requests are generated, reducing latency and improving performance. While many studies have reported excellent results using two-phase I/O, there has been much less discussion of the implementation issues and decisions that may affect the performance of this algorithm. In this paper, we describe our efforts to obtain excellent performance using two-phase I/O. In particular, we describe our first implementation, which produced a sustained bandwidth of 78 MBytes per second, and discuss the steps required to increase this bandwidth to 420 MBytes per second. Further, we investigate the additional improvement in performance that can be obtained using threads to overlap computation and the two-phase I/O operation.
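
The request-recombination idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the implementation studied in the paper: it assumes a cyclic (round-robin) distribution of file blocks across processes, models the file as an in-memory list, and replaces interprocess communication with plain list operations. The function names `direct_requests` and `two_phase_read` are hypothetical.

```python
def direct_requests(num_procs, num_blocks):
    """Requests issued when every process reads its own blocks directly.

    Assumes a cyclic distribution: process p owns blocks p, p+P, p+2P, ...
    Each owned block costs one small (offset, length) request.
    """
    return {
        p: [(block, 1) for block in range(p, num_blocks, num_procs)]
        for p in range(num_procs)
    }


def two_phase_read(file_blocks, num_procs):
    """Simulate a two-phase read of a cyclically distributed file.

    Phase 1: each process issues one large contiguous read.
    Phase 2: blocks are redistributed in memory to their owners
             (standing in for interprocess communication).
    Returns the list of I/O requests and the per-process data.
    """
    n = len(file_blocks)
    chunk = (n + num_procs - 1) // num_procs  # ceiling division

    # Phase 1: one contiguous request per process.
    io_requests = []
    staged = {}
    for p in range(num_procs):
        start = min(p * chunk, n)
        end = min(start + chunk, n)
        if end > start:
            io_requests.append((start, end - start))
        staged[p] = (start, file_blocks[start:end])

    # Phase 2: hand each block to the process that owns it.
    result = {p: [] for p in range(num_procs)}
    for src in range(num_procs):
        start, data = staged[src]
        for offset, value in enumerate(data):
            owner = (start + offset) % num_procs
            result[owner].append(value)
    return io_requests, result


if __name__ == "__main__":
    blocks = list(range(16))
    small = direct_requests(num_procs=4, num_blocks=len(blocks))
    large, data = two_phase_read(blocks, num_procs=4)
    print("direct requests:   ", sum(len(r) for r in small.values()))
    print("two-phase requests:", len(large))
    print("process 1 receives:", data[1])
```

In this toy setting the direct approach issues one request per block, while the two-phase approach issues one request per process and pays for the rearrangement in memory rather than in I/O latency.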
Contact
Phillip M. Dickens, Department of Computer Science, Illinois Institute of Technology, Stuart Building, Room 235C, 10 West 31st Street, Chicago, Illinois 60616, dickens@homer.mcs.anl.gov