NAS PARALLEL BENCHMARK CHANGES
Modifications are listed with the most recent versions first.
NPB 3.3
- New Features and Improvements
  - The Class E problem has been introduced in seven of the benchmarks
    (BT, SP, LU, CG, MG, FT, and EP) to stress larger parallel
    computers.
  - The Class D problem has been added to the IS benchmark in all
    three implementations. It requires compiler support for a
    64-bit "long" type in C. The MPI version of IS now allows runs
    of up to 1024 processes.
  - The Bucket Sort option (USE_BUCKETS) has been added to
    the OpenMP version of IS and made the default.
  - Introduced the "twiddle" array in the OpenMP FT benchmark
    to improve performance.
  - Merged the vector codes for the BT and LU benchmarks into this
    release.
  - Updates to BTIO (MPI/BT with IO subtypes):
    - Added I/O stats (I/O timing, data size written, I/O data rate).
    - Added an option for interleaving reads between writes through
      the inputbt.data file. Although the data file size would be
      smaller as a result, the total amount of data written is still
      the same.
- Bug Fixes
  - Fixed a verification failure in MPI/FT for cases where NX/=NY
    and the 2D decomposition is used.
  - Fixed an output format problem in MPI/FT that occurred
    when the number of processes is >= 1000.
  - Fixed a performance regression in MPI/SP due to improper
    padding of array dimensions.
  - Fixed invalid access of zero-length array elements in DC and
    made the benchmark output consistent with other NPBs.
  - Fixed a data race in OMP/UA.
  - Fixed missing variable declarations for the Bucket Sort option
    in SER/IS.
  - Fixed use of an uninitialized variable in MPI/sys/setparams.c.
- Others
  - The default value for collbuf_nodes in the BT I/O benchmark
    is now set to 0, indicating that no file hints will be used.
    The setting can be changed through the "inputbt.data" file.
  - The hyperplane version of LU (LU-HP) is no longer included
    in the distribution. Download NPB3.2.1 if needed.
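
The Bucket Sort option (USE_BUCKETS) listed above for IS can be illustrated with a small sketch. This is not the NPB source: it is written in Python for brevity, and the function and parameter names (bucket_sort_keys, max_key_log2, num_buckets_log2) are hypothetical. It assumes non-negative integer keys below 2**max_key_log2 and shows the general idea of distributing keys into buckets by their high-order bits, then counting-sorting each bucket over its own small key range.

```python
def bucket_sort_keys(keys, max_key_log2, num_buckets_log2):
    """Sort integer keys in [0, 2**max_key_log2) using buckets (illustrative only)."""
    shift = max_key_log2 - num_buckets_log2   # high-order bits select the bucket
    num_buckets = 1 << num_buckets_log2

    # Pass 1: count how many keys fall into each bucket.
    bucket_size = [0] * num_buckets
    for k in keys:
        bucket_size[k >> shift] += 1

    # A prefix sum over the counts gives each bucket's start offset.
    bucket_start = [0] * num_buckets
    for b in range(1, num_buckets):
        bucket_start[b] = bucket_start[b - 1] + bucket_size[b - 1]

    # Pass 2: scatter the keys so that each bucket is contiguous.
    scattered = [0] * len(keys)
    next_slot = list(bucket_start)
    for k in keys:
        b = k >> shift
        scattered[next_slot[b]] = k
        next_slot[b] += 1

    # Pass 3: counting sort inside each bucket over its own key range.
    keys_per_bucket = 1 << shift
    result = []
    for b in range(num_buckets):
        low = b * keys_per_bucket
        counts = [0] * keys_per_bucket
        for k in scattered[bucket_start[b]:bucket_start[b] + bucket_size[b]]:
            counts[k - low] += 1
        for offset, count in enumerate(counts):
            result.extend([low + offset] * count)
    return result
```

In a threaded setting, the usual motivation for this layout is that each bucket covers a disjoint key range, so buckets can be processed independently. Whether the OpenMP IS code follows exactly these passes is not stated in this change list.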
NPB 3.3-MZ
- Inclusion of the nested OpenMP implementation (NPB3.3-MZ-OMP) to replace the SMP+OpenMP implementation.
- Merged the vector codes into the standard distribution. The vector codes can be selected with the VERSION=VEC option during compilation.
- Performance improvement to SP-MZ.
NPB 3.2.1
This is a bug-fix release of NPB 3.2.
- MPI Version Changes
  - Removed a duplicated statement for writing FT parameters.
  - Included the I/O verification for BT SUBTYPE=EPIO.
  - Fixed wrong data type used for communicating integers in LU.
  - Fixed an incorrect calculation of parameter "nr" for MG, which
    caused run-time failure for NPROCS >= 512.
  - Fixed a problem of mismatching message tags in MG.
- OpenMP Version Changes
  - Used THREADPRIVATE for working array storage in EP.
  - Fixed a data flush bug for pipeline in LU.
  - Reduced stack space usage in IS.
  - Used locks for atomic updates in UA.
NPB 3.2-MZ
- Introduced two new classes of problem: E and F.
- Added "SCHEDULE(STATIC)" to the OpenMP layer of hybrid versions.
- Minor optimization to the communication buffers in MPI versions.
- Fixed a data flush bug for pipeline in LU-MZ.
NPB 3.2
- The DC version in NPB 3.2-SER was converted from C++ to C (Classes S, W, A, B). The sys/setparams.c file was changed accordingly.
- OpenMP version of DC was added to NPB3.2-OMP.
- Data Traffic benchmark DT was added to NPB3.2-MPI.
GridNPB 3.1
- Introducing Globus version of computational-grid benchmarks
- Grid NPB 3.0 release (11.19.02)
GridNPB 3.0
- Introducing computational-grid benchmarks
NPB 3.1
General:
- Used relative instead of absolute error for MG and CG verification.
- Made improvements and fixed a bug in MG:
  - Redefined Class W: {(64x64x64), 40 iters} -> {(128x128x128), 4 iters}.
  - Fixed incorrect verification values for Classes A and C.
- Added dummy iteration before time step loop in LU for consistency with other benchmarks.
- Fixed race in 'make suite'.
- Fixed dependence on make.def for files in subdirectory 'common'.
- Optimized BT memory usage.
- Merged NPB2.4-MPI into NPB3.1.
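
The switch from absolute to relative error for MG and CG verification, noted in the list above, can be illustrated with a short sketch. The tolerance value and the function names below are assumptions made for this example and are not taken from the NPB sources. The point is that a fixed absolute tolerance is too strict for large reference values and too lenient for small ones, while a relative check scales with the magnitude of the reference.

```python
EPSILON = 1.0e-8  # assumed tolerance, for illustration only


def passes_absolute(computed, reference, epsilon=EPSILON):
    """Absolute-error check: |computed - reference| <= epsilon."""
    return abs(computed - reference) <= epsilon


def passes_relative(computed, reference, epsilon=EPSILON):
    """Relative-error check: |computed - reference| / |reference| <= epsilon."""
    return abs(computed - reference) / abs(reference) <= epsilon


# Large reference value: the result is good to 12 digits, yet the
# absolute check rejects it because the raw difference is 1.0.
large_ok_abs = passes_absolute(1.0e12 + 1.0, 1.0e12)
large_ok_rel = passes_relative(1.0e12 + 1.0, 1.0e12)

# Small reference value: the result is off by a factor of two, yet the
# absolute check accepts it because the raw difference is tiny.
small_ok_abs = passes_absolute(2.0e-12, 1.0e-12)
small_ok_rel = passes_relative(2.0e-12, 1.0e-12)
```

This sketch assumes a nonzero reference value; a real verification routine would need to handle a zero reference separately.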
NPB 3.1-MPI
- Fixed bug in CG for running on large numbers of processes.
NPB 3.1-SER and NPB 3.1-OpenMP
- Added Unstructured Adaptive (UA) benchmark, measuring the effect of irregular, continually changing memory accesses.
- Added serial version of Data Cube (DC) benchmark, based on a data mining application.
- Added Class D.
- Improved LU and LU-HP: reduced memory use for 'tv' variable, improved memory access for variables 'a,b,c,d' in LU-HP.
- Cleaned up matrix initialization (makea) and reduced memory use.
- Added 'LU-HP' as a valid benchmark option in makefiles.
NPB 3.1-OpenMP Only
- Included hyper-plane version of LU benchmark: LU-HP.
- Removed need for dummy 'omp_lib_dum' library for compilation without an OpenMP compiler.
- Parallelized initialization part of MG.
- Parallelized matrix initialization (makea) in CG.
- Cleaned up SP to make structure more consistent with serial version.
NPB 3.1-MZ
- Allow variable number of threads for individual processes and define processor group for each process. Versions affected: all parallel versions
- Report total number of threads (instead of threads-per-process). Versions affected: all parallel versions
- Use accurate surface term in MFLOPS calculation to take into account non-square zone faces. Versions affected: all
- Improvements for LU-MZ:
  - Added one SSOR iteration before timing loop to touch all data and code.
  - Improved the memory usage for array 'tv'.
  - Made rhs and erhs more cache-friendly.
  Versions affected: all
- Print built-in timers for all processes. Versions affected: all parallel versions
NPB 2.4
- Added BT I/O benchmark.
- Added Class D for all benchmarks, except IS.
- Changed initialization in FT to avoid integer overflow in Class D.
- Reduced FT memory size by removing the "ex" table used in the "evolve" routine.
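
The integer overflow mentioned above for FT Class D can be illustrated with a small sketch. The change list does not say which quantity overflowed, so this example only shows how a total grid-point count can exceed a signed 32-bit integer. The grid dimensions used here (2048 x 1024 x 1024) and the helper name wrap_int32 are assumptions made for illustration.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Emulate two's-complement wraparound of a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


# Assumed Class D grid dimensions, for illustration only.
nx, ny, nz = 2048, 1024, 1024

total_points = nx * ny * nz               # exact value as a Python integer
wrapped_points = wrap_int32(total_points)  # value a 32-bit counter would hold

exceeds_int32 = total_points > INT32_MAX
```

Under these assumptions the exact count is one past the largest signed 32-bit value, so a 32-bit counter wraps to a negative number. A wider integer type or a floating-point accumulator avoids the problem.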
NPB 2.3
- Added CG.
- Defined Class W for small memory systems (under 32MB).
- Released serial versions consistent with parallel code (08/18/97).
NPB 2.2
- Added IS in C.
- Modified FT so that it can be run on a number of processors larger than the last array dimension.
- Added EP.
- Encoded CLASS C size and verification numbers.
- Removed include file mpifrag.f containing executable statements.
- Added different versions of random number generators.