-
July 11th, 2008 02:59 PM #1
Rapid reconfiguration of Multi-cluster resources
Originally posted by: wrjohns, Sat Oct 21, 2006 2:30 am
We are planning to enable Multi-Cluster in our Universus environment. The purpose is to make one-of-a-kind resources and licensed codes that live in our production environment available in our development environment.
We want to be able to add or remove the visibility of resources in either direction (production to development, development to production) quickly and without interrupting access.
If anyone has a method for 'rapid' reconfiguration of Multi-cluster, we would like to hear it.
thanks
wrjohns
-
July 11th, 2008 03:33 PM #2
Originally posted by: csmith, Tue Nov 07, 2006 7:13 pm
The multi-cluster connections are set up at two levels (well, three, but I'll ignore resource leasing for now since I don't think this is what you'll want to use in your environment):
- the "cluster" level between the master lims
- the queue level when setting up forwarding for jobs
If all you want to do is to shut off the access from one to the other (say from dev to prod), you could just change the queue definitions on the prod side that allow jobs to enter the system, do badmin reconfig, and the connection is severed. You check this connection with 'bclusters'.
If you also want to stop showing the prod resources on the dev side (regardless of whether jobs can actually flow), you can modify the lsf.shared file (and any entries in lsf.cluster.* dealing with RemoteClusters) and do lsadmin reconfig. You check this connection with 'lsclusters'. If one side is configured for MC, but the other isn't, the connection will appear to be unavailable.
Both of these changes require reconfig of the cluster ... so that's how fast it can be done.
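As a rough sketch of the two reconfiguration paths described above, the commands could be wrapped as shown here. The function names and the DRY_RUN switch are made up for illustration, and the config files (queue definitions, lsf.shared, lsf.cluster.*) are assumed to have been edited by hand beforehand. The real commands may prompt for confirmation.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Queue level: run after changing the queue definitions on the receiving side.
sever_job_flow() {
    run badmin reconfig
    run bclusters      # check the job-forwarding connection
}

# Cluster level: run after changing lsf.shared / lsf.cluster.* entries.
hide_remote_resources() {
    run lsadmin reconfig
    run lsclusters     # check the cluster-level connection
}
```

Calling sever_job_flow or hide_remote_resources with DRY_RUN=1 only lists the commands. Setting DRY_RUN=0 would run them against the live cluster.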
-
July 11th, 2008 03:35 PM #3
Rapid reconfiguration of Multi-cluster resources
Originally posted by: wrjohns, Tue Nov 07, 2006 7:59 pm
Thanks for the reply.
I'm a bit surprised as the 'lease' model is precisely what we implemented. We did so because it allowed us to preserve our Universus queue structures (queue@host), which is required for the esub process to know we are using the jobstarter (thus passing LSFPARMSFILE to the remote host).
The one problem we are having now is that the qsub -m <host>, where <host> is not local but obtained from a lease, does not work. We get an unknown host error.
Suggestions?
wrjohns
-
July 11th, 2008 03:35 PM #4
Originally posted by: csmith, Tue Nov 07, 2006 8:08 pm
Well ... for changing the leasing config, it's like for queues. Just change lsb.resources and run badmin reconfig on the side exporting the nodes.
As for the qsub problem ... do you mean bsub? I don't know the answer to that one.
-
July 11th, 2008 03:36 PM #5
Originally posted by: wrjohns, Tue Nov 07, 2006 8:14 pm
Right, bsub.
Too many resource managers.
wrjohns
-
July 11th, 2008 03:38 PM #6
specifying a leased host with bsub -m
Originally posted by: ddunlap, Fri Nov 24, 2006 6:49 pm
In the MultiCluster "resource leasing" model, if you want to run a job
on a specific host, use:
bsub ... -m host@remotecluster ...
Note that if you do this, the job may pend forever if that host is
not leased.
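A minimal sketch of building the cluster-qualified host name for such a submission follows. The helper names, the DRY_RUN switch, and the example host and cluster names are hypothetical; only the bsub -m host@remotecluster form comes from the note above.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the bsub command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: join a host name and a cluster name as host@cluster.
qualified_host() {
    printf '%s@%s\n' "$1" "$2"
}

# Hypothetical helper: submit a command to one specific leased host.
# Usage: submit_to_leased_host HOST CLUSTER COMMAND [ARGS...]
submit_to_leased_host() {
    host="$1"
    cluster="$2"
    shift 2
    target="$(qualified_host "$host" "$cluster")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: bsub -m $target $*"
    else
        bsub -m "$target" "$@"
    fi
}
```

For example, submit_to_leased_host hostA prod ./myjob would request hostA@prod. As noted, the job may pend forever if that host is not actually leased.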
FYI,
Dale
_________________
Dale Dunlap
Technical Consultant
-
July 11th, 2008 03:38 PM #7
Rapid reconfiguration of Multi-cluster resources
Originally posted by: wrjohns, Sat Nov 25, 2006 11:23 am
Dale,
Thanks for the response. I'm not sure how Universus (EGO?) works today with respect to multi-cluster, but a uniform namespace is a requirement of our early version. That is, our Universus does not know about the 'cluster' part of host@cluster. We could modify it to track such information, but we would rather use native LSF features or configuration so that we don't have to maintain another list of hosts, regardless of which cluster you're running 'in' (uniform namespace).
Thanks
Wilbur