Running AMUSE scripts on the LGM cluster

AMUSE can be installed and run on the LGM cluster at Leiden University (LIACS).

The prerequisite software is not available on the cluster, so it needs to be installed first:

  1. Check out AMUSE from svn:
    > svn co http://www.amusecode.org/svn/trunk amuse-svn
    > cd amuse-svn/doc/install
    
  2. Set up the environment variables PREFIX, PATH and LD_LIBRARY_PATH as described in "Install AMUSE prerequisites".
  3. Install Python following the instructions in the same document.
  4. Install the other prerequisites:
    > export FC=gfortran
    > export F77=gfortran
    > ./install.py install --hydra
    ...
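
The PREFIX/PATH/LD_LIBRARY_PATH setup in step 2 is easy to get subtly wrong, and a mismatch makes the later builds pick up the wrong Python or libraries. The sanity check below is a hypothetical helper (not part of AMUSE); the directory names are only examples:

```python
import os


def check_prefix_env(environ):
    """Return a list of problems with the prerequisite environment:
    PREFIX must be set, and PATH / LD_LIBRARY_PATH should contain the
    matching bin/ and lib/ directories under it."""
    problems = []
    prefix = environ.get("PREFIX")
    if not prefix:
        return ["PREFIX is not set"]
    if os.path.join(prefix, "bin") not in environ.get("PATH", "").split(":"):
        problems.append("$PREFIX/bin is not on PATH")
    if os.path.join(prefix, "lib") not in environ.get("LD_LIBRARY_PATH", "").split(":"):
        problems.append("$PREFIX/lib is not on LD_LIBRARY_PATH")
    return problems


# Example environment; an empty list means everything looks consistent.
env = {"PREFIX": "/home/user/amuse-prereqs",
       "PATH": "/home/user/amuse-prereqs/bin:/usr/bin",
       "LD_LIBRARY_PATH": "/home/user/amuse-prereqs/lib"}
print(check_prefix_env(env))
```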
    

After the installation of the prerequisites finishes, you can configure and build AMUSE:

> cd amuse-svn
> ./configure
> make

When you build the codes with "make", you may get a wget error when it tries to download setuptools. You can solve this by first executing:

> python ez_setup.py --insecure

You can run the tests with:

> mpiexec nosetests -v

Configure a cluster

To run on multiple nodes, you first need to set up mpdboot:

  1. create a file listing the hosts you want to run on:
    > cat hosts
    node08  ifhn=ibnode08
    node09  ifhn=ibnode09
    node10  ifhn=ibnode10
    node11  ifhn=ibnode11
    
    Every node has two names: nodeXX and ibnodeXX. The nodeXX address uses the ethernet interface, while the ibnodeXX address uses the IP-over-InfiniBand interface, which is much faster. You also need to add the node you are running from to the list of hosts.
  2. start the mpds on each node:
    > mpdboot --totalnum=3 --file=hosts
  3. start your script:
    > mpiexec amuse.sh yourscript.py
    (this will probably run on ibnode08, ibnode09 and ibnode10, using the InfiniBand interface)
  4. stop your cluster:
    > mpdallexit
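
Since the hosts file follows a fixed pattern (nodeXX plus ifhn=ibnodeXX), it can be generated rather than typed by hand. `hosts_file_lines` below is a hypothetical convenience helper, not an AMUSE tool; the node numbering is taken from the example above:

```python
def hosts_file_lines(node_numbers):
    """Build mpdboot hosts-file lines for the LGM naming scheme:
    nodeXX is the ethernet name, ibnodeXX the IP-over-InfiniBand name
    that MPI traffic is steered to via the ifhn= option."""
    return ["node%02d  ifhn=ibnode%02d" % (n, n) for n in node_numbers]


# Reproduce the example hosts file for node08 through node11.
print("\n".join(hosts_file_lines(range(8, 12))))
```

Redirect the output to a file named `hosts` (and remember to add the node you launch from).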

Check cluster speed

To check if you are using the infiniband ports, you can run this test:

cat > ./test.c <<DELIM
#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char ** argv)
{
        double * buffer = 0;
        double t0, t1;
        int number_of_doubles = 20000;
        MPI_Init(&argc, &argv);
        if(argc > 1)
        {
                number_of_doubles = atoi(argv[1]);
        }
        printf("broadcasting %d doubles\n", number_of_doubles);
        buffer = (double *) malloc(sizeof(double) * number_of_doubles);
        t0 = MPI_Wtime();
        MPI_Bcast(buffer, number_of_doubles , MPI_DOUBLE, 0, MPI_COMM_WORLD);
        t1 = MPI_Wtime();
        printf("broadcast took %f seconds, %f bytes per second, %f MB per second\n",
                t1 - t0,
                (number_of_doubles * sizeof(double)) / (t1 - t0),
                (number_of_doubles * sizeof(double)) / (1024.0 * 1024.0 * (t1 - t0)));
        free(buffer);
        MPI_Finalize();
        return 0;
}
DELIM
mpicc test.c
mpiexec -n 2 ./a.out 10000000
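
To interpret the numbers the test prints, remember that a C double is 8 bytes, so N doubles amount to 8N bytes. `bcast_bandwidth` below is a hypothetical helper that redoes the same arithmetic; "MB" here means mebibytes, matching the 1024*1024 divisor in the C code:

```python
def bcast_bandwidth(number_of_doubles, elapsed_seconds):
    """Convert a timed MPI_Bcast into (bytes/s, MB/s).
    A C double is 8 bytes; MB here is 1024*1024 bytes."""
    bytes_sent = number_of_doubles * 8
    return (bytes_sent / elapsed_seconds,
            bytes_sent / (1024.0 * 1024.0 * elapsed_seconds))


# 10 million doubles (80 MB of payload) broadcast in 0.8 seconds:
bytes_per_s, mb_per_s = bcast_bandwidth(10000000, 0.8)
print(bytes_per_s, mb_per_s)
```

Gigabit ethernet tops out around 120 MB/s of raw bandwidth, so results well above that indicate the InfiniBand interface is being used.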

Todo

The LGM cluster has two MPI implementations already installed; however, these do not yet work with AMUSE.

  • mvapich is not MPI 2.2 compatible
  • openmpi does not have mpicc and the other compiler wrappers installed

Once these packages are compatible, you need to do the following (these instructions assume a version of AMUSE later than June 9, 2011):

  1. This extra step is needed if you want to use the system-installed MPI implementation on the LGM cluster: select the MPI implementation to use with mpi-selector. Only openmpi is MPI-2 compatible (needed by AMUSE):
       > mpi-selector --list 
       mvapich-1.2.0-gcc-x86_64
       openmpi-1.4-gcc-x86_64
       > mpi-selector --set openmpi-1.4-gcc-x86_64
       > mpi-selector --query
       > mpicc -show
      
    
  2. remove the installed prerequisites (make sure $PREFIX is set!)
    > rm -Rf $PREFIX
    
  3. Install the AMUSE prerequisites without mpich2:
    > export FC=gfortran
    > export F77=gfortran
    > ./install.py install no-mpich2
    ...
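
The mpi-selector choice in step 1 can be scripted when setting up several accounts. `pick_openmpi` is a hypothetical helper, not part of AMUSE; the listing strings come from the example output above:

```python
def pick_openmpi(selector_list_output):
    """Pick the Open MPI entry from `mpi-selector --list` output,
    since only openmpi on the LGM cluster is MPI-2 compatible."""
    for line in selector_list_output.splitlines():
        name = line.strip()
        if name.startswith("openmpi"):
            return name
    return None


listing = "mvapich-1.2.0-gcc-x86_64\nopenmpi-1.4-gcc-x86_64\n"
print(pick_openmpi(listing))
```

The returned name can then be passed to `mpi-selector --set`.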