Millions of Computer Hours Gained Through 'Rolling Upgrades'
09.16.09
A new approach to scheduled downtime for the Pleiades supercomputer environment adds 2.5 million production computing hours per year for NAS users.
The NAS facility's Supercomputing Systems Team (SST) recently completed the first "rolling upgrade" to the Pleiades supercomputer's operating systems, InfiniBand network software, and filesystem. The rolling upgrade approach, developed by SST, involves first validating new software versions on a small test cluster and then running a variety of tests and batch user applications.
Because the new approach is integrated into the Pleiades environment's batch processing software, each computational node can be upgraded as soon as its batch job finishes and can start running new batch jobs within a few minutes, eliminating the need to take the entire Pleiades system down.
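The per-node scheduling idea described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SST's implementation: the `Node` class, the function names, the three sample nodes, and the 5-minute upgrade time are all hypothetical, chosen to match the article's "within a few minutes." The 8-hour outage used for comparison is the average dedicated time quoted in this article.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    job_ends_min: int      # minutes until the node's current batch job finishes
    version: str = "old"


def rolling_upgrade(nodes, new_version, upgrade_min):
    """Upgrade each node as soon as its own job ends.

    Returns (schedule, idle_node_minutes), where schedule lists
    (node name, minute the node is back in service) in completion order.
    """
    schedule = []
    idle_node_minutes = 0
    for node in sorted(nodes, key=lambda n: n.job_ends_min):
        node.version = new_version
        schedule.append((node.name, node.job_ends_min + upgrade_min))
        idle_node_minutes += upgrade_min  # only this node is idle, and only briefly
    return schedule, idle_node_minutes


def full_outage_idle_minutes(nodes, outage_min):
    """Idle node-minutes when every node is down for the whole outage."""
    return len(nodes) * outage_min


# Hypothetical three-node example; the 5-minute upgrade time is an assumption.
nodes = [Node("n1", 120), Node("n2", 30), Node("n3", 60)]
schedule, rolling_idle = rolling_upgrade(nodes, "new", upgrade_min=5)
outage_idle = full_outage_idle_minutes(nodes, outage_min=8 * 60)

print(schedule)
print(rolling_idle, outage_idle)
```

In this toy model the rolling approach costs each node only its own short upgrade window, while a system-wide outage idles every node for the full eight hours.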
This new approach means NASA's scientific and engineering users can continue using Pleiades without interruption. More importantly, it gives them an added 2.5 million hours of production computing time per year (assuming four upgrades per year, each of which previously required an average of 8 hours of dedicated time).
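The arithmetic behind these figures can be checked with a short calculation. This is a rough sketch that uses only numbers quoted in this article. It assumes the "hours" are processor-hours, which the article does not state, and the variable names are illustrative. The implied capacity is simply what the quoted numbers work out to, not an official processor count, and the real accounting may also include time spent draining and restarting jobs.

```python
# Figures quoted in the article
UPGRADES_PER_YEAR = 4
HOURS_PER_UPGRADE = 8            # average dedicated time per upgrade
HOURS_GAINED = 2_500_000         # added production computing hours per year
HOURS_DELIVERED = 300_000_000    # article says "over 300 million" per year

# Wall-clock hours of system-wide downtime avoided each year
outage_hours = UPGRADES_PER_YEAR * HOURS_PER_UPGRADE

# Capacity implied if every gained hour comes from those avoided outage hours
# (an inference from the quoted numbers, not a published figure)
implied_capacity = HOURS_GAINED / outage_hours

# Gain as a share of yearly delivered hours; an upper bound, because the
# article gives the delivered total only as "over 300 million"
share = HOURS_GAINED / HOURS_DELIVERED

print(outage_hours)
print(implied_capacity)
print(f"{share:.2%}")
```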
The time gained is equivalent to adding a month of computing for Space Operations Mission Directorate users, or more than a week for Science Mission Directorate users.
The biggest benefit of the rolling upgrade approach is that users will experience no interruption in service during system upgrades—moving Pleiades closer to the goal of providing continuous computing.
Pleiades, currently the fourth most powerful general-purpose supercomputer in the world, delivers over 300 million hours of computational time per year to users across all four NASA mission directorates.
For more information about this activity, please contact:
Bob Ciotti, Robert.B.Ciotti@nasa.gov
For information about NASA and agency programs on the Web, visit http://www.nasa.gov/home/