Originally posted by: garym, Wed Nov 29, 2006 8:01 pm
I have a test queue defined that is preemptive over all lower-priority queues. We have added limits so that each user can run only one test job at a time (where a job may use more than one CPU).
I have defined a resource testcount under lsf.cluster.name:
Code:
RESOURCENAME LOCATION
testcount (0@[all])
End ResourceMap
and placed a limit on that resource under lsb.resources:
Code:
Begin Limit
NAME = testcount_limit
PER_USER = all
PER_QUEUE=test
RESOURCE = [testcount,1]
End Limit
On the test queue, I have:
Code:
RES_REQ = rusage[testcount=1]
The problem: when a user submits more than one job to the test queue, other jobs in the cluster are suspended, but the test job still pends with the reason:
Code:Resource (testcount) limit defined on queue has been reached;
Jobs should not be suspended until the next test queue job is able to run. What am I missing?

