NAS Parallel Benchmarks Changes
Modifications are listed with the most recent version first.
NPB 3.3.1
This is a bug-fix release of NPB 3.3.
- Bug fixes
- Fixed a non-portable way of broadcasting input parameters in MPI/FT
- Fixed access to out-of-bound array elements in OMP/DC and SER/DC
- Fixed use of uninitialized array in OMP/UA and SER/UA
- Other changes
- Code cleanup in MPI/LU: avoid using MPI_ANY_SOURCE and delete unused code
- Additional timers are included for most benchmarks in the MPI versions and for MG and UA in the OMP and SER versions
- Executables produced for OMP and SER now use ".x" as an extension
NPB 3.3.1-MZ
- Fixed a missing argument in calling MPI_Abort() in the MPI versions
- Implemented a simple block distribution of zones to MPI processes for MPI/SP-MZ and MPI/LU-MZ. This might improve performance of the two benchmarks with equal-size zones. The old bin-packing scheme can still be selected by setting the environment variable NPB_MZ_BLOAD to 2.
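The difference between the two zone-assignment schemes mentioned above can be illustrated with a minimal Python sketch. It is not the NPB-MZ source, which is written in Fortran. The function names, the greedy largest-first heuristic used for bin-packing, and the choice to treat every value other than 2 as "block" are assumptions made for illustration. Only the variable name NPB_MZ_BLOAD and the value 2 come from the notes above.

```python
import os


def block_distribution(num_zones, num_procs):
    """Assign contiguous blocks of zone indices to processes.

    Illustrative only: the first (num_zones % num_procs) processes
    receive one extra zone, so block sizes differ by at most one.
    """
    base, extra = divmod(num_zones, num_procs)
    owner = []
    for proc in range(num_procs):
        count = base + (1 if proc < extra else 0)
        owner.extend([proc] * count)
    return owner


def bin_packing_distribution(zone_sizes, num_procs):
    """Greedy bin-packing: largest zone first, onto the least-loaded process.

    This heuristic is an assumption; the benchmark's scheme may differ.
    """
    load = [0] * num_procs
    owner = [0] * len(zone_sizes)
    order = sorted(range(len(zone_sizes)),
                   key=lambda z: zone_sizes[z], reverse=True)
    for zone in order:
        proc = min(range(num_procs), key=lambda p: load[p])
        owner[zone] = proc
        load[proc] += zone_sizes[zone]
    return owner


def distribute_zones(zone_sizes, num_procs):
    """Pick a scheme from NPB_MZ_BLOAD (2 selects bin-packing, per the notes)."""
    if os.environ.get("NPB_MZ_BLOAD") == "2":
        return bin_packing_distribution(zone_sizes, num_procs)
    return block_distribution(len(zone_sizes), num_procs)
```

With equal-size zones the block scheme already balances the load and keeps neighbouring zones on the same process, which is one plausible reason it might perform better for SP-MZ and LU-MZ. With uneven zone sizes, bin-packing balances the work instead of the zone count.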
NPB 3.3
- New Features and Improvements
- The Class E problem has been introduced in seven of the benchmarks (BT, SP, LU, CG, MG, FT, and EP) to stress larger size parallel computers.
- The Class D problem has been added to the IS benchmark in all three implementations. It requires compiler support for a 64-bit "long" type in C. The MPI version of IS now allows runs of up to 1024 processes.
- The Bucket Sort option (USE_BUCKETS) has been added to the OpenMP version of IS and made the default.
- Introduced the "twiddle" array in the OpenMP FT benchmark to improve performance.
- Merged the vector codes for the BT and LU benchmarks into this release.
- Updates to BTIO (MPI/BT with IO subtypes):
- Added I/O stats (I/O timing, data size written, I/O data rate).
- Added an option for interleaving reads between writes through the inputbt.data file. Although the data file size would be smaller as a result, the total amount of data written is still the same.
- Bug Fixes
- Fixed a verification failure in MPI/FT for cases where NX/=NY and the 2D decomposition is used.
- Fixed an output formatting problem in MPI/FT that occurred when the number of processes was >= 1000.
- Fixed a performance regression in MPI/SP due to improper padding of array dimensions.
- Fixed invalid access of zero-length array elements in DC, and made the benchmark output consistent with other NPBs.
- Fixed a data race in OMP/UA.
- Fixed missing variable declarations for the Bucket sort option in SER/IS.
- Fixed use of an uninitialized variable in MPI/sys/setparams.c.
- Others
- The default value for collbuf_nodes in the BT I/O benchmark is now set to 0, indicating no file hints will be used. The setting can be changed by using the "inputbt.data" file.
- The hyperplane version of LU (LU-HP) is no longer included in the distribution. Download NPB3.2.1 if needed.
NPB 3.3-MZ
- Inclusion of the nested OpenMP implementation (NPB3.3-MZ-OMP) to replace the SMP+OpenMP implementation.
- Merged vector codes into the standard distribution. The vector codes can be selected with the VERSION=VEC option during compilation.
- Performance improvement to SP-MZ.
NPB 3.2.1
This is a bug-fix release of NPB 3.2.
- MPI Version Changes
- Removed a duplicated statement for writing FT parameters.
- Included the I/O verification for BT SUBTYPE=EPIO.
- Fixed wrong data type used for communicating integers in LU.
- Fixed an incorrect calculation of parameter "nr" for MG, which caused run-time failure for NPROCS >= 512.
- Fixed a problem of mismatching message tags in MG.
- OpenMP Version Changes
- Use THREADPRIVATE for working array storage in EP.
- Fixed a data flush bug for pipeline in LU.
- Reduced stack space usage in IS.
- Use locks for atomic updates in UA.
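The idea behind using locks for atomic updates can be sketched as follows. This is a Python threading analogue, not the UA source (which uses OpenMP). The function name, thread count, and update count are hypothetical values chosen for the illustration.

```python
import threading


def accumulate(num_threads=4, updates_per_thread=10000):
    """Have several threads add into one shared total under a lock."""
    total = [0]  # shared accumulator, boxed so the closure can mutate it
    lock = threading.Lock()

    def worker():
        for _ in range(updates_per_thread):
            # Holding the lock makes the read-modify-write one indivisible step.
            with lock:
                total[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Because every update happens while the lock is held, the result is always num_threads * updates_per_thread, regardless of how the threads interleave.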
NPB 3.2-MZ
- Introduced two new classes of problem: E and F.
- Added "SCHEDULE(STATIC)" to the OpenMP layer of hybrid versions.
- Minor optimization to the communication buffers in MPI versions.
- Fixed a data flush bug for pipeline in LU-MZ.
NPB 3.2
- The DC version in NPB 3.2-SER was converted from C++ to C (Classes S, W, A, B). The sys/setparams.c file was changed accordingly.
- OpenMP version of DC was added to NPB3.2-OMP.
- Data Traffic benchmark DT was added to NPB3.2-MPI.
GridNPB 3.1
- Introducing Globus version of computational-grid benchmarks
- Grid NPB 3.0 release (11.19.02)
GridNPB 3.0
- Introducing computational-grid benchmarks
NPB 3.1
General:
- Used relative instead of absolute error for MG and CG verification.
- Made improvements and fixed bug in MG:
- Redefined Class W: {(64x64x64),40 iters} -> {(128x128x128), 4 iters}.
- Fixed incorrect verification values for Classes A and C.
- Added dummy iteration before time step loop in LU for consistency with other benchmarks.
- Fixed race in 'make suite'.
- Fixed dependence on make.def for files in subdirectory 'common'.
- Optimized BT memory usage.
- Merged NPB2.4-MPI into NPB3.1.
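The switch from absolute to relative error in the MG and CG verification noted above can be illustrated with a short Python sketch. The function names and the tolerance value are assumptions made for this example, not the values used by the benchmarks.

```python
def verify_relative(computed, reference, epsilon=1.0e-8):
    """Return True when the relative error is within epsilon.

    epsilon is a hypothetical tolerance; reference is assumed nonzero.
    """
    relative_error = abs((computed - reference) / reference)
    return relative_error <= epsilon


def verify_absolute(computed, reference, epsilon=1.0e-8):
    """The absolute-error form of the same check, shown for contrast."""
    return abs(computed - reference) <= epsilon
```

A relative check scales with the magnitude of the reference value. For a reference of 1.0e10 and a computed value of 1.0e10 + 1.0, the relative check passes while the absolute check fails. For a reference of 1.0e-12 and a computed value of 2.0e-12, the absolute check passes even though the result is off by a factor of two, and the relative check rejects it.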
NPB 3.1-MPI
- Fixed bug in CG for running on large numbers of processes.
NPB 3.1-SER and NPB 3.1-OpenMP
- Added Unstructured Adaptive (UA) benchmark, measuring the effect of irregular, continually changing memory accesses.
- Added serial version of Data Cube (DC) benchmark, based on a data mining application.
- Added Class D.
- Improved LU and LU-HP: reduced memory use for 'tv' variable, improved memory access for variables 'a,b,c,d' in LU-HP.
- Cleaned up matrix initialization (makea) and reduced memory use.
- Added 'LU-HP' as a valid benchmark option in makefiles.
NPB 3.1-OpenMP Only
- Included hyper-plane version of LU benchmark: LU-HP.
- Removed need for dummy 'omp_lib_dum' library for compilation without an OpenMP compiler.
- Parallelized initialization part of MG.
- Parallelized matrix initialization (makea) in CG.
- Cleaned up SP to make structure more consistent with serial version.
NPB 3.1-MZ
- Allow variable number of threads for individual processes and define processor group for each process. Versions affected: all parallel versions
- Report total number of threads (instead of threads-per-process). Versions affected: all parallel versions
- Use accurate surface term in MFLOPS calculation to take into account non-square zone faces. Versions affected: all
- Improvements for LU-MZ:
- Added one SSOR iteration before timing loop to touch all data and code.
- Improved the memory usage for array 'tv'.
- Made rhs and erhs more cache-friendly.
- Versions affected: all
- Print built-in timers for all processes. Versions affected: all parallel versions
NPB 2.4
- Added BT I/O benchmark.
- Added Class D for all benchmarks, except IS.
- Changed initialization in FT to avoid integer overflow in Class D.
- Reduced FT memory size by removing the "ex" table used in the "evolve" routine.
NPB 2.3
- Added CG.
- Defined Class W for small memory systems (under 32MB).
- Released serial versions consistent with parallel code (08/18/97).
NPB 2.2
- Added IS in C.
- Modified FT so that it can be run on a number of processors larger than the last array dimension.
- Added EP.
- Encoded CLASS C size and verification numbers.
- Removed include file mpifrag.f containing executable statements.
- Added different versions of random number generators.
Open Source Software
Open Source for NASA means enhanced software quality through community review and development, enhanced collaboration through sharing of NASA-originated software, and more efficient and effective dissemination of research products (such as software) to the public.
As part of the effort to create an Open Source option, NASA formed a cross-agency legal team - this team created the NASA Open Source Agreement (NOSA) for Open Source releases.
NOSA is endorsed by the Open Source Initiative, and is the chief overseer of NASA's Open Source definitions and usage agreements.
Open Source Resources
The following is a list of relevant resources on NASA Open Source:
Developing An Open Source Option for NASA Software (PDF version 209KB)
This NAS technical report provides background material on why an Open Source option is appropriate for NASA.
NASA Space Act (NASA Charter)
The NASA charter: the agency shall "provide for the widest practicable and appropriate dissemination of information concerning its activities and the results thereof."