
Thread: Platform Lava 6.1 - lsb.events - Parameters Dont Match

  1. #1
    CCLS_Metadata is offline Junior Member
    Join Date
    September 25th, 2008
    Posts
    11
    Downloads
    0
    Uploads
    0

    Platform Lava 6.1 - lsb.events - Parameters Dont Match

    Hello,

    My university's research lab is running Platform Lava 6.1:

    -bash-3.00$ lsid
    Platform Lava 6.1, May 5 2005
    Copyright 1992-2004 Platform Computing Corporation

    My cluster name is lava
    My master name is ccls
    -bash-3.00$


    on a Dell HPC Cluster (Dell PowerEdge Cluster, Intel(R) Xeon(TM) CPU 3.00GHz).

    We need to analyze the log file "lsb.events" to get information about the status of submitted jobs. However, the number of parameters and their data types in the log file do not match what the "lsb.events" man page documents.

    For example:
    - JOB_NEW is documented to have 56 parameters, but the log contains JOB_NEW records with 59 and 60 parameters.
    - JOB_STATUS is documented to have 12 parameters, but the log contains JOB_STATUS records with 12 and 31 parameters.
    ...

    What could be causing this, and how can we fix it?

    Thanks in advance,
    Duke

  2. #2
    CCLS_Metadata is offline Junior Member
    Join Date
    September 25th, 2008
    Posts
    11
    Downloads
    0
    Uploads
    0

    AND:

    Our university does not give student researchers root access to Lava. Where can we obtain a copy of Platform Lava 6.1 to experiment with?

    Thanks,
    Duke

  3. #3
    _fmms_ is offline Member
    Join Date
    September 16th, 2008
    Location
    Germany
    Posts
    42
    Downloads
    7
    Uploads
    0

    If I parsed it correctly, a record in my Lava 1.0 file has 54 fields:
    Code:
         1  "1.0"
         2  1223460002
         3  18894
         4  500
         5  33564675
         6  1
         7  1223460002
         8  0
         9  0
        10  -65535
        11  0
        12  17157
        13  "lavaadmin"
        14  -1
        15  -1
        16  -1
        17  -1
        18  -1
        19  -1
        20  -1
        21  -1
        22  -1
        23  -1
        24  -1
        25  ""
        26  100.00
        27  2
        28  "normal"
        29  ""
        30  "master52"
        31  "/tmp"
        32  "/tmp//18893"
        33  ""
        34  ""
        35  ""
        36  "/home/lavaadmin"
        37  "1223459244.18893.18894"
        38  0
        39  ""
        40  ""
        41  "sleep 1000"
        42  "sleep 1000"
        43  0
        44  ""
        45  "default"
        46  1
        47  "LINUX86"
        48  ""
        49  16
        50  0
        51  ""
        52  ""
        53  ""
        54  -1
    
    The man page specifies (is this the same for 6.1?)
    Code:
         1  Version number (%s)
         2  Event time (%d)
         3  jobId (%d)
         4  userId (%d)
         5  options (%d)
         6  numProcessors (%d)
         7  submitTime (%d)
         8  beginTime (%d)
         9  termTime (%d)
        10  sigValue (%d)
        11  chkpntPeriod (%d)
        12  restartPid (%d)
        13  userName (%s)
        14  rLimits
        15  rLimits
        16  rLimits
        17  rLimits
        18  rLimits
        19  rLimits
        20  rLimits
        21  rLimits
        22  rLimits
        23  rLimits
        24  rLimits
        25  hostSpec (%s)
        26  hostFactor (%f)
        27  umask (%d)
        28  queue (%s)
        29  resReq (%s)
        30  fromHost (%s)
        31  cwd (%s)
        32  chkpntDir (%s)
        33  inFile (%s)
        34  outFile (%s)
        35  errFile (%s)
        36  subHomeDir (%s)
        37  jobFile (%s)
        38  numAskedHosts (%d)
        39  askedHosts (%s)
        40  dependCond (%s)
        41  preExecCmd (%s)
        42  jobName (%s)
        43  command (%s)
        44  nxf (%d)
        45  xf (%s)
        46  mailUser (%s)
        47  projectName (%s)
        48  niosPort (%d)
        49  maxNumProcessors (%d)
        50  schedHostType (%s)
        51  loginShell (%s)
        52  userGroup (%s)
        53  options2 (%d)
        54  idx (%d)
        55  inFileSpool (%s)
        56  commandSpool (%s)
        57  jobSpoolDir (%s)
        58  userPriority (%d)
    
    So yeah you found a bug, I guess.
    Last edited by _fmms_; October 2nd, 2008 at 03:06 PM.

  4. #4
    CCLS_Metadata is offline Junior Member
    Join Date
    September 25th, 2008
    Posts
    11
    Downloads
    0
    Uploads
    0

    Thank you, _fmms_!

    Here are the entries for JOB_NEW in my lsb.events man page for 6.1 (56 parameters):

    0. JOB_NEW
    1. Version number (%s)
    2. Event time (%d)
    3. jobId (%d)
    4. userId (%d)
    5. options (%d)
    6. numProcessors (%d)
    7. submitTime (%d)
    8. beginTime (%d)
    9. termTime (%d)
    10. sigValue (%d)
    11. chkpntPeriod (%d)
    12. restartPid (%d)
    13. userName (%s)
    14. rLimits
    15. rLimits
    16. rLimits
    17. rLimits
    18. rLimits
    19. rLimits
    20. rLimits
    21. rLimits
    22. rLimits
    23. rLimits
    24. rLimits
    25. hostSpec (%s)
    26. hostFactor (%f)
    27. umask (%d)
    28. queue (%s)
    29. resReq (%s)
    30. fromHost (%s)
    31. cwd (%s)
    32. chkpntDir (%s)
    33. inFile (%s)
    34. outFile (%s)
    35. errFile (%s)
    36. subHomeDir (%s)
    37. jobFile (%s)
    38. numAskedHosts (%d)
    39. askedHosts (%s)
    40. dependCond (%s)
    41. preExecCmd (%s)
    42. timeEvent (%d)
    43. jobName (%s)
    44. command (%s)
    45. nxf (%d)
    46. xf (%s)
    47. mailUser (%s)
    48. projectName (%s)
    49. niosPort (%d)
    50. maxNumProcessors (%d)
    51. schedHostType (%s)
    52. loginShell (%s)
    53. exceptList (%s)
    54. options2 (%d)
    55. userPriority (%d)
    56. extsched (%s)

    Regards,
    Duke

  5. #5
    qlnie is offline LSF Moderator
    Join Date
    June 24th, 2008
    Posts
    2
    Downloads
    7
    Uploads
    0

    Quote: Originally posted by _fmms_
    If I parsed it correctly my lava 1.0 file has 54
    The man page specifies (is this the same for 6.1?)
    Code:
         1  Version number (%s)
         2  Event time (%d)
         3  jobId (%d)
         4  userId (%d)
         5  options (%d)
         6  numProcessors (%d)
         7  submitTime (%d)
         8  beginTime (%d)
         9  termTime (%d)
        10  sigValue (%d)
        11  chkpntPeriod (%d)
        12  restartPid (%d)
        13  userName (%s)
        14  rLimits
        15  rLimits
        16  rLimits
        17  rLimits
        18  rLimits
        19  rLimits
        20  rLimits
        21  rLimits
        22  rLimits
        23  rLimits
        24  rLimits
        25  hostSpec (%s)
        26  hostFactor (%f)
        27  umask (%d)
        28  queue (%s)
        29  resReq (%s)
        30  fromHost (%s)
        31  cwd (%s)
        32  chkpntDir (%s)
        33  inFile (%s)
        34  outFile (%s)
        35  errFile (%s)
        36  subHomeDir (%s)
        37  jobFile (%s)
        38  numAskedHosts (%d)
        39  askedHosts (%s)
        40  dependCond (%s)
        41  preExecCmd (%s)
        42  jobName (%s)
        43  command (%s)
        44  nxf (%d)
        45  xf (%s)
        46  mailUser (%s)
        47  projectName (%s)
        48  niosPort (%d)
        49  maxNumProcessors (%d)
        50  schedHostType (%s)
        51  loginShell (%s)
        52  userGroup (%s)
        53  options2 (%d)
        54  idx (%d)
        55  inFileSpool (%s)
        56  commandSpool (%s)
        57  jobSpoolDir (%s)
        58  userPriority (%d)
    
    So yeah you found a bug, I guess.
    For Lava 1.0, every parameter documented in the lsb.events man page is logged to the file except "52 userGroup (%s)". This may be a problem.

    If you want to parse the event information, be aware that some parameters documented in the man page are not always present. They are logged only when certain conditions hold.
    For example:
    "39 askedHosts (%s)" is present only when "38 numAskedHosts (%d)" is greater than 0.
    The same applies to "xf (%s)": it is logged only when "nxf (%d)" is greater than 0.
    Another conditional field is "niosPort (%d)".
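
    The count-then-list pattern described above could be read roughly as follows. This is only a sketch, not code from lsb.log.c: the helper name read_counted is made up, the sample slices are hypothetical, and it assumes each listed item occupies exactly one field (the thread does not say how many fields one xf entry spans).

```python
def read_counted(fields, pos):
    """Read a count at fields[pos], then that many single-field items.

    Returns (items, next_pos). When the count is 0 the list fields are
    absent from the record, so next_pos is simply pos + 1.
    Assumption: every item occupies exactly one field.
    """
    count = int(fields[pos])
    start = pos + 1
    end = start + count
    return fields[start:end], end


# Hypothetical slices of a record: numAskedHosts, then askedHosts (if any),
# then the next field (an empty dependCond string).
no_hosts = ["0", ""]
two_hosts = ["2", "hostA", "hostB", ""]

print(read_counted(no_hosts, 0))
print(read_counted(two_hosts, 0))
```

    The point of returning next_pos is that every later field index shifts by the number of list items, so fixed column numbers from the man page cannot be used after the first conditional field.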

    You should also be aware that everything inside a pair of quotation marks (a string) counts as one item.
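
    To illustrate why quoted strings matter when counting fields, here is a minimal quote-aware splitter. It is a sketch under assumptions: the function name split_fields is made up, the sample line is a shortened, invented record built from values seen earlier in this thread, and it assumes a literal quote inside a string is written as two quotes.

```python
import re

# A field is either a double-quoted string or a run of non-space characters.
# Assumption: a literal quote inside a string is written as two quotes ("").
_FIELD = re.compile(r'"((?:[^"]|"")*)"|(\S+)')


def split_fields(line):
    """Split one record into a list of field values, quotes removed."""
    fields = []
    for m in _FIELD.finditer(line):
        if m.group(2) is not None:
            fields.append(m.group(2))                      # bare number/token
        else:
            fields.append(m.group(1).replace('""', '"'))   # quoted string
    return fields


# Shortened, invented sample record (not a complete JOB_NEW line).
sample = '"JOB_NEW" "1.0" 1223460002 18894 "sleep 1000" "" "/home/lavaadmin"'

print(len(sample.split()))        # naive whitespace split: 8
print(len(split_fields(sample)))  # quote-aware split: 7
```

    A plain whitespace split counts "sleep 1000" as two items, which is one way a record can appear to have more parameters than the man page lists.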

    BTW, the function that writes the JOB_NEW record to the events file is writeJobNew() in lsb.log.c, and the other functions that write to the events file are in lsb.log.c as well. I think the code will help you more.

    I think Lava 1.0 is similar to Platform Lava 6.1.

  6. #6
    _fmms_ is offline Member
    Join Date
    September 16th, 2008
    Location
    Germany
    Posts
    42
    Downloads
    7
    Uploads
    0

    Thanks a lot, qlnie. That is not much fun to parse... And as far as I can see, this piece of information is missing from the man page.

    Code:
           nxf (%d)
    
                  Number of files to transfer (%d)
    
           xf (%s)
    
                  List of file transfer specifications
    

  7. #7
    CCLS_Metadata is offline Junior Member
    Join Date
    September 25th, 2008
    Posts
    11
    Downloads
    0
    Uploads
    0

    Thank you very much, qlnie and _fmms_!

    We have tried padding the missing parameters. However, the strange thing is that the log entries have more parameters than documented, not fewer.
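
    One way to see how far the counts spread is to tally quote-aware field counts per event type. The sketch below is only illustrative: the function name field_count_histogram is made up, the demo lines are invented, and it assumes the event type is the first quoted token on each line.

```python
import re
from collections import Counter, defaultdict

# Quoted string (with "" as an assumed escape for a quote) or bare token.
_FIELD = re.compile(r'"(?:[^"]|"")*"|\S+')


def field_count_histogram(lines):
    """Map event type -> Counter of field counts (event type excluded)."""
    hist = defaultdict(Counter)
    for line in lines:
        tokens = _FIELD.findall(line)
        if not tokens:
            continue  # skip blank lines
        event = tokens[0].strip('"')
        hist[event][len(tokens) - 1] += 1
    return hist


# Invented demo lines, much shorter than real records.
demo = [
    '"JOB_STATUS" "1.0" 1223460002 18894 4 0',
    '"JOB_STATUS" "1.0" 1223460010 18894 32 0 "extra one" 7',
    '"JOB_NEW" "1.0" 1223460002 18894 "sleep 1000"',
]

for event, counts in field_count_histogram(demo).items():
    print(event, dict(counts))
```

    Passing an open file object for lsb.events instead of demo should work the same way, since the function only iterates over lines.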

    Here are two samples of JOB_NEW and JOB_STATUS:
    http://ducta.net/sfsu/csc899/doc/lsb...1_sample01.pdf

    We would appreciate it if you would take a look!

    And where is "lsb.log.c" located, please?

    Thanks,
    Duke

  8. #8
    _fmms_ is offline Member
    Join Date
    September 16th, 2008
    Location
    Germany
    Posts
    42
    Downloads
    7
    Uploads
    0

  9. #9
    CCLS_Metadata is offline Junior Member
    Join Date
    September 25th, 2008
    Posts
    11
    Downloads
    0
    Uploads
    0

    Thanks, _fmms_!

    And please excuse us for having so many questions. We are new to the application.

    1. Where can we find a demo of Platform Lava 6.1 GUI?

    - We found this site but had no access: http://teracluster.icss.neu.edu:8080/Platform/
    - They also have some good interfaces listed on: Teracluster Cluster

    2. Our group is investigating the logs to collect information. However, their format seems to be inconsistent, so we are looking for a different approach. Is it possible to make Platform Lava write its logs directly to a database so that the records are stored more reliably?

    3. If we alter programs like "lsb.log.c", will that affect parts of Platform Lava other than the "lsb.events" logs?

    Thanks,
    Duke

  10. #10
    _fmms_ is offline Member
    Join Date
    September 16th, 2008
    Location
    Germany
    Posts
    42
    Downloads
    7
    Uploads
    0

    1.
    I do not know where to find a demo but you may download it with http://my.platform.com/products/plat..._64.disk1.iso/

    2.
    I read that LSF can log to a database, but Lava cannot. I have not yet understood how to use the logs: some information goes into files, and some is printed to stderr when certain DEBUG variables are set in lsf.conf. There are other messages in the source that I could never trace to where they end up... (checkpointing/resume debug messages)

    3.
    no idea.
