Thread: Splitting the CPU on a Node using LSB_HOSTS and LSB_MCPU_HOSTS to run multiple MPI jo

  1. #1
    daved10928 is offline Junior Member

    Hi Folks,
    I am using a system that requires me to occupy a full IBM Power node that
    has 32 CPUs. Given the parallel scaling of my codes, the most efficient way
    for me to use this many CPUs is to run multiple, separate MPI jobs, each on a
    subset of the CPUs. In this particular case, it is my understanding that it is
    possible for the user to control which of the allocated CPUs are assigned to a
    particular MPI job. I have been told that this is done using the environment
    variables LSB_HOSTS and LSB_MCPU_HOSTS.
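
For reference, a minimal sketch of how the two variables relate, assuming the usual LSF layout: LSB_MCPU_HOSTS holds "host count" pairs, while LSB_HOSTS repeats each host name once per allocated slot. The host name node01 and the helper expand_mcpu_hosts are made up for illustration; inside a real job LSF sets the variables itself.

```shell
#!/bin/sh
# Hypothetical sample value, used only when the variable is not already set.
: "${LSB_MCPU_HOSTS:=node01 32}"

# Expand "host count host count ..." into one host name per slot,
# which is the layout LSB_HOSTS uses.
expand_mcpu_hosts() {
    set -- $1                 # split the pairs on whitespace
    out=""
    while [ $# -ge 2 ]; do
        i=0
        while [ "$i" -lt "$2" ]; do
            out="${out:+$out }$1"
            i=$((i + 1))
        done
        shift 2
    done
    echo "$out"
}

# Print the per-slot list for the current allocation.
expand_mcpu_hosts "$LSB_MCPU_HOSTS"
```

On a single node both variables name the same host throughout, so they describe how many slots are available rather than which physical cores they are.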


    A schematic example of what I want to do, for a simple case with 32 CPUs
    where each job takes a share of them, is as follows:

    Gather information about the MPI processors available

    Launch first MPI job putting it into the background:

    mpirun.lsf (needed code to get only some CPU) $executable >& Modeloutput &

    Launch second MPI job in foreground

    mpirun.lsf (needed code to get rest of CPU) $executable2 >& Modeloutput2

    wait

    The two executables are for different codes that communicate with each other infrequently via files.


    The wait command ensures that both jobs have finished before the script exits.
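
The schematic above could be fleshed out along these lines. This is only a sketch under a loud assumption: that mpirun.lsf sizes each launch from the LSB_HOSTS value it sees in its environment, which the thread does not confirm and which may depend on the site's LSF and MPI setup. LSB_MCPU_HOSTS may need the same treatment. The helpers take_slots and run_pair and the LAUNCHER override are hypothetical names introduced here.

```shell
#!/bin/sh
# Print COUNT entries of the host list, starting at zero-based position START.
# Usage: take_slots START COUNT host host host ...
take_slots() {
    start=$1
    end=$(( $1 + $2 ))
    shift 2
    i=0
    out=""
    for host in "$@"; do
        if [ "$i" -ge "$start" ] && [ "$i" -lt "$end" ]; then
            out="${out:+$out }$host"
        fi
        i=$((i + 1))
    done
    echo "$out"
}

# Give the first N_FIRST slots to $executable and the rest to $executable2,
# run both in the background, and wait for both to finish.
# ASSUMPTION: the launcher honours a per-command LSB_HOSTS override.
run_pair() {
    n_first=$1
    set -- $LSB_HOSTS         # one positional parameter per allocated slot
    total=$#
    first=$(take_slots 0 "$n_first" "$@")
    rest=$(take_slots "$n_first" "$((total - n_first))" "$@")

    LSB_HOSTS="$first" ${LAUNCHER:-mpirun.lsf} "$executable"  > Modeloutput  2>&1 &
    LSB_HOSTS="$rest"  ${LAUNCHER:-mpirun.lsf} "$executable2" > Modeloutput2 2>&1 &
    wait
}
```

Inside the job script one would then call, for example, `run_pair 16` after setting executable and executable2. Note that this only limits how many slots each launch sees; it does not by itself bind processes to particular cores.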


    Any help in constructing such a script would be greatly appreciated.

    Dave

  2. #2
    gthomas is offline Junior Member

    Hi,

    Will this work?

    bsub -n 1,X mpirun.lsf $executable >& Modeloutput &
    bsub -w 'started(Job ID of first job)' -n 1,32-X mpirun.lsf $executable2 >& Modeloutput2 &

    Where X is the number of CPUs you want to assign to the first job.

  3. #3
    daved10928 is offline Junior Member

    Hi,
    Thanks for the suggestion. Unfortunately that will not work, because of the
    way the queues on the machine are set up. I have to submit a single job script
    for the whole node (32 CPUs). What I need to do is divide up the CPUs
    once LSF has given me a node to work on.

    Thanks again for your suggestion.
