Thread: Removing Nodes from Queue

  1. #1
    admin_lsf is offline Member
    Join Date
    July 11th, 2008
    Posts
    57
    Downloads
    0
    Uploads
    0

    Removing Nodes from Queue

    Originally posted by: JPummill, Tue Mar 13, 2007 1:22 pm

    I have just removed a number of compute nodes from one of our clusters for an extended experiment, and now I need to remove those nodes from LSF on the master node so that they no longer show up as unavailable resources. We are running LSF 6.2. What file(s) do I need to edit?

    Thanks,

    Jeff

  2. #2

    Originally posted by: BillD, Tue Mar 13, 2007 6:36 pm

    Jeff,

    The "lsf.cluster.<clustername>" file should contain the list of all nodes in the cluster. You should remove the nodes from the list and make sure all LSF daemons are stopped on those nodes. You should then issue "lsadmin limrestart all" and "badmin reconfig". You should probably also make sure that LSF daemons are no longer started when these systems boot.
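    A minimal sketch of the file edit described above, assuming the host list sits between "Begin Host" and "End Host" lines with the hostname as the first column of each row. The function name, the sample layout, and the hostnames are hypothetical; check them against your own lsf.cluster.<clustername> file and keep a backup before writing anything back.

```python
def drop_hosts(cluster_text, removed):
    """Return cluster_text without Host-section rows for hosts in `removed`."""
    removed = set(removed)
    kept = []
    in_host_section = False
    for line in cluster_text.splitlines():
        tokens = line.split()
        lowered = [t.lower() for t in tokens[:2]]
        if lowered == ["begin", "host"]:
            in_host_section = True
        elif lowered == ["end", "host"]:
            in_host_section = False
        elif in_host_section and tokens and tokens[0] in removed:
            continue  # skip the row that belongs to a removed node
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical sample; a real file has more sections and columns.
SAMPLE = """\
Begin   Host
HOSTNAME    model   type    server  r1m  mem  swp  RESOURCES
master      !       !       1       3.5  ()   ()   ()
compute-0-1 !       !       1       3.5  ()   ()   ()
compute-0-2 !       !       1       3.5  ()   ()   ()
End     Host
"""

if __name__ == "__main__":
    print(drop_hosts(SAMPLE, ["compute-0-2"]), end="")
```

    Rows outside the Host section and commented-out rows are left untouched. After saving the edited file, the "lsadmin limrestart all" and "badmin reconfig" commands mentioned above still have to be run by hand.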

    If you have referenced any of these nodes in various configuration files, you'll get a lot of warning messages. You should remove these references.
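    One way to find those leftover references is to scan the configuration directory for the removed hostnames. The following is a sketch only: the function name is hypothetical, and it assumes the configuration files are plain text and that hostnames appear as whole words.

```python
import re
from pathlib import Path


def find_references(conf_dir, hosts):
    """Return (relative path, line number, line) for lines naming a removed host."""
    hosts = list(hosts)
    if not hosts:
        return []
    # Match whole hostnames only, so "node1" does not also match "node10".
    pattern = re.compile(
        r"(?<![\w.-])(?:" + "|".join(re.escape(h) for h in hosts) + r")(?![\w-])"
    )
    root = Path(conf_dir)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

    For the cluster in this thread, the call would look something like find_references("/share/apps/LSF/conf", ["compute-0-2"]), where the hostname is a placeholder. Each hit still needs to be reviewed and removed by hand.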

  3. #3

    Originally posted by: JPummill, Wed Mar 14, 2007 1:03 pm

    Morning Bill,

    I edited the lsf.cluster.prospero file to remove every entry for the compute nodes that I took out. After running "lsadmin limrestart all", the nodes still showed up in the list, and the same was true after running "badmin reconfig". Neither command reported any errors.

    In the /share/apps/LSF/conf directory, there are also the following files...

    hosts
    hostfile
    lsf.conf
    lsf.shared

    ...and a few others.

    Using lsload, all of the nodes that I removed from the lsf.cluster.prospero file are still shown.

    Where could it still be reading this info?

    -Jeff
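    The lsload check described in this post can be automated. This is a rough sketch that assumes the usual lsload layout: a header row beginning with HOST_NAME, then one row per host with the hostname in the first column. The function name and sample hostnames are hypothetical, and long hostnames may be truncated in the default output.

```python
def hosts_still_listed(lsload_output, removed):
    """Return the removed hosts that still appear in captured lsload output."""
    removed = set(removed)
    listed = []
    for line in lsload_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "HOST_NAME":
            continue  # skip blank lines and the header row
        if tokens[0] in removed:
            listed.append(tokens[0])
    return listed


# Hypothetical capture; removed nodes show up without load values.
SAMPLE_OUTPUT = """\
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
master              ok   0.0   0.0   0.0   1%   0.0   1     0   10G    2G    1G
compute-0-1         ok   0.0   0.0   0.0   0%   0.0   0    50   10G    2G    1G
compute-0-2    unavail
"""

if __name__ == "__main__":
    print(hosts_still_listed(SAMPLE_OUTPUT, ["compute-0-2", "compute-0-3"]))
```

    The text could be captured with subprocess.run(["lsload"], capture_output=True, text=True).stdout on a machine where lsload is on the PATH. An empty result means the removed nodes are no longer reported.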

  4. #4

    Originally posted by: JPummill, Wed Mar 14, 2007 5:48 pm

    Hey Bill,

    Running "service lsf stop" followed by "service lsf start" did the trick. The cluster was idle, so no jobs were at risk of being compromised or interrupted.

    Thanks,

    Jeff
