Thread: Basic questions about install

  1. #1
    jeffj is offline Junior Member
    Join Date
    December 26th, 2009
    Posts
    5
    Downloads
    1
    Uploads
    0

    Basic questions about install

    Hi,

    I'm about to start an install on a small cluster. CentOS 5.4 based. Head node Tyan S2895 and compute nodes Tyan S2912 and S2895.

    1) Is the install process different for head node and compute node? If so, how?
    2) Is it OK to have different mainboards for head and compute?
    3) Is MPI handled through the provisioned NIC? Is it possible to have a separate MPI network and provision network instead of the provision and public topology?
    4) If I decided on diskless how would the process differ? Especially considering the head node would be different than the compute nodes. How do I specify an image for the diskless machines?

    I looked at the quick guide and the install manual listed. I didn't see clarification on these. If you know of where the answers are please let me know.

    jeff
    Last edited by jeffj; December 26th, 2009 at 03:20 PM. Reason: Added MPI question.

  2. #2
    Uncas is offline Junior Member
    Join Date
    April 18th, 2008
    Posts
    9
    Downloads
    2
    Uploads
    0

    Hi Jeffj,

    Welcome to hpccommunity! I think I can help with your questions about Kusu. Please report back once you've had a chance to install Kusu; I'd be curious to hear your feedback.

    You write:
    1) Is the install process different for head node and compute node? If so, how?
    Yes. Installing the head node (Installer Node) requires you to answer about 12 questions in the installation wizard. The questions are fairly straightforward, but you must know your public and private network/subnet information. Installing the compute nodes is completely automated. You simply run the *addhost* command on the Installer Node and then PXE boot your compute nodes. They will pull the OS and all additional packages (Kits) over the network and install them. You do not need to insert a DVD/CD in every compute node. Kusu does this work for you!

    2) Is it OK to have different mainboards for head and compute?
    Yes. Kusu doesn't care about this. As long as the operating system (CentOS 5.4) supports your hardware, Kusu will work.

    3) Is MPI handled through the provisioned NIC? Is it possible to have a separate MPI network and provision network instead of the provision and public topology?
    The default installation of Kusu sets up one public network and one provisioning network. However, you can set up as many additional networks in Kusu as you need. For example, use the *netedit* tool to create a new network for MPI communication, and then associate the new network with a node group using *ngedit*.

    4) If I decided on diskless how would the process differ? Especially considering the head node would be different than the compute nodes. How do I specify an image for the diskless machines?
    Diskless provisioning should work in the same way as package-based provisioning. The image for diskless nodes is first created during the installation of the Installer Node. Afterwards, the image is regenerated whenever you make changes to the diskless node group using *ngedit*. Be aware that diskless provisioning is not as robust as package-based provisioning. Due to size constraints on the image, NIC or disk drivers may be missing from the initrd for the diskless node group. If this happens, you can add the appropriate drivers using the *buildinitrd* tool.

    I hope this is helpful. Come back if you still have questions. Best of luck!

    Uncas

  3. #3
    jeffj is offline Junior Member
    Join Date
    December 26th, 2009
    Posts
    5
    Downloads
    1
    Uploads
    0

    Hi Uncas,

    Thanks for the response. I got it to work, I believe. I haven't run an app on it yet. I have to figure out something in Perl or Matlab to try it out.

    My install took some time because I was trying to set up multi-boot alongside an OS on NTFS. Once I decided to use a clean drive, the install went as planned.

    I tried diskless compute nodes, which did nothing at first. Addhost just sat there. My node's NIC boot agent was running, but the node was never seen; DHCP just kept waiting. I remembered seeing that dhcpd had failed when the install node booted. This seemed to be caused by making the public NIC static, but I'm not sure. Once I reinstalled with the public NIC set to DHCP, the compute node was recognized and I was able to log in to it. The hardware is similar, so that may have helped.
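
For anyone hitting the same symptom, one quick check on the install node is whether anything is bound to the DHCP server port (UDP 67) and the TFTP port (UDP 69) that PXE booting generally relies on. The sketch below is a generic Linux check, not a Kusu tool: it parses the text of `/proc/net/udp`, where the local address appears as hexadecimal `IP:PORT`. The function name is made up for illustration, and the script assumes a Python 3 interpreter.

```python
def udp_listening_ports(proc_net_udp_text):
    """Return the set of local UDP ports listed in /proc/net/udp text."""
    ports = set()
    for line in proc_net_udp_text.splitlines()[1:]:  # first line is a header
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] is the local address, e.g. "00000000:0043" (port in hex).
        port_hex = fields[1].rpartition(":")[2]
        ports.add(int(port_hex, 16))
    return ports


if __name__ == "__main__":
    with open("/proc/net/udp") as handle:
        ports = udp_listening_ports(handle.read())
    for name, port in (("DHCP server", 67), ("TFTP", 69)):
        state = "bound" if port in ports else "NOT bound"
        print("%s (UDP %d): %s" % (name, port, state))
```

A port reported as not bound suggests the corresponding service did not start, which would match a node that PXE boots but never gets an answer.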

    So I have a cluster. One install and one compute node. Now I have to install kits and try to figure out how to submit jobs.

    Thanks again, Uncas! Any next steps? What are the best kits? I would like something that will list the percent of workload on each processor. Maybe a basic app for a trial run, even a Perl script.

  4. #4
    Uncas is offline Junior Member
    Join Date
    April 18th, 2008
    Posts
    9
    Downloads
    2
    Uploads
    0

    Next Steps

    Hey Jeffj,

    Try the Lava kit for a basic job scheduler and workload management/monitoring.
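
As a first trial job, something trivial and CPU-bound is usually enough to confirm that work lands on a compute node. The script below is only an illustrative sketch (Python rather than Perl, with a made-up function name and loop size); it prints the host it ran on and how long the loop took, and says nothing about how a particular scheduler expects jobs to be submitted.

```python
import socket
import time


def busy_work(n):
    """CPU-bound loop: return the sum of squares of 0..n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    start = time.time()
    result = busy_work(5000000)  # arbitrary size; raise it for a longer run
    elapsed = time.time() - start
    print("%s: result=%d in %.2fs" % (socket.gethostname(), result, elapsed))
```

Running it once by hand on a compute node and once through the scheduler should print the same result, with the hostname showing where the job actually ran.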

    Nagios and Ganglia are great applications for cluster monitoring and reporting.

    Ntop is a great network monitoring tool if you're into that.
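
For a quick look at per-processor load without installing anything, the counters in `/proc/stat` can be sampled twice and compared. This is a minimal sketch of that idea, not a replacement for the monitoring tools above; the function names are made up, idle time is taken as the idle plus iowait columns, and a Python 3 interpreter is assumed.

```python
import time


def parse_proc_stat(text):
    """Map each per-core 'cpuN' line to (busy_jiffies, total_jiffies)."""
    samples = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and the rest.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user nice system idle iowait irq softirq steal (guest columns ignored)
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        samples[fields[0]] = (total - idle, total)
    return samples


def utilization(before, after):
    """Percent busy per core between two parse_proc_stat() samples."""
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / delta if delta > 0 else 0.0
    return usage


def read_sample(path="/proc/stat"):
    with open(path) as handle:
        return parse_proc_stat(handle.read())


if __name__ == "__main__":
    first = read_sample()
    time.sleep(1.0)  # sampling interval
    second = read_sample()
    for cpu, percent in sorted(utilization(first, second).items()):
        print("%s: %.1f%% busy" % (cpu, percent))
```

Run on a node while a job is active, it prints one line per core with the share of the sampling interval that core spent busy.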

    Cheers,
    Uncas
