NBCBench - benchmarking Nonblocking MPI Collective Operations Performance

Description

NBCBench is a benchmark that measures overlap and asynchronous progression of nonblocking collective operations implemented in LibNBC. NBCBench is distributed under the BSD license.

Download NBCBench

NBCBench 1.0 - (158.55 kb)
NBCBench 1.1 - (282.71 kb) [supports FFTW in the compute loop, but has not been tested]

Performance Results for different MPI Implementations

We present performance results of LibNBC for different MPI implementations. LibNBC issues MPI_Isend() and MPI_Irecv() calls, so its performance and the achievable overlap depend on how the underlying MPI implementation handles these calls. We also compare the collective operations implemented in LibNBC with the corresponding MPI operations. Results are available for the following MPI implementations:

Open MPI/MVAPI
MVAPICH

Please keep in mind that not all collective algorithms in LibNBC are optimized!

Benchmark Methodology

We used the overlap benchmark, which was designed to assess the maximum possible overlap and the minimum latencies. The benchmark will be described here later; for now, details can be found in "Accurately Measuring Collective Operations at Massive Scale" [1] and "Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI" [2].
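
As a rough illustration of the two quantities named above, the sketch below derives a minimum latency and an overlap fraction from measured times. The exact metrics are defined in [1] and [2]; the formula here is one common way to express overlap and is an assumption of this example, as are the function names min_sample() and overlap_fraction() and their parameters.

```c
#include <stddef.h>

/* Smallest value among n timing samples; returns -1.0 if n == 0. */
double min_sample(const double *samples, size_t n)
{
    if (n == 0)
        return -1.0;
    double best = samples[0];
    for (size_t i = 1; i < n; i++)
        if (samples[i] < best)
            best = samples[i];
    return best;
}

/* Assumed metric, not necessarily the one NBCBench reports:
 *   t_comm  time of the communication alone (start, then wait immediately)
 *   t_comp  time of the computation alone
 *   t_ovl   time of start + computation + wait
 * The CPU-visible communication overhead is t_ovl - t_comp; the share of
 * t_comm that was hidden behind computation is returned, clamped to [0, 1]. */
double overlap_fraction(double t_comm, double t_comp, double t_ovl)
{
    if (t_comm <= 0.0)
        return 0.0;
    double overhead = t_ovl - t_comp;
    double frac = 1.0 - overhead / t_comm;
    if (frac < 0.0)
        frac = 0.0;
    if (frac > 1.0)
        frac = 1.0;
    return frac;
}
```

For example, with t_comm = 10, t_comp = 100 and t_ovl = 105 (all in the same unit), half of the communication time was hidden and overlap_fraction() returns 0.5.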

References

[1] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine: Accurately Measuring Collective Operations at Massive Scale. In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, PMEO'08 Workshop, presented in Miami, FL, ISSN: 1530-2075, ISBN: 978-1-4244-1694-3, Apr. 2008. Invited to a journal special issue on top picks from PMEO'08.

[2] Torsten Hoefler, Andrew Lumsdaine and Wolfgang Rehm: Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI. In Proceedings of the 2007 International Conference on High Performance Computing, Networking, Storage and Analysis, SC07, presented in Reno, USA, IEEE Computer Society/ACM, Nov. 2007 (acceptance rate 20%, 54/268).


© Torsten Hoefler