1.5 What can I run on a Beowulf Cluster?
People use Beowulf Clusters to solve computational intensive tasks. If your application is now taking 1 minute to complete a computation and you want to make it complete in 30 seconds – Beowulf Clusters are unlikely to be what you are looking for. You would be better off upgrading to the latest and fastest CPU, add more RAM, use a faster hard-disk.
Beowulf Clusters are used typically used to solve tasks that take several hours to days and even weeks to complete. These include :
- Computational fluid dynamics for aerospace
- Financial market modeling (eg. with Monte Carlo simulations)
- Genomics Sequencing
- Computational Chemistry
- Seismic Analysis
- Digital Media Rendering
- Electronic & Design Applications
Today, 70% – 80% of supercomputing tasks are run on Beowulf Clusters. Good problems to run on Beowulf Clusters falls into the so called coarse and medium grain problem definition.
In
coarse grain problems, the problem can be split up such that the computation time is very much greater than the communications time, in such instances, the individual pieces of code running on each node seldom communicates with each other to exchange or synchronize data during the run.
In
medium grain problems, the communications overheads begins to become noticeable.
In
fine grain problems the communications overheads is nearly equivalent to the computation time, and hence it would not be generally suitable for Beowulf Clusters. The fine grain problems are reserved for the traditional large scale SMP/NUMA type supercomputers with its higher bandwidth and lower latencies. However, many researchers are today trying to run fine grain problems on Beowulf Clusters with networks such as Quadrics, Myrinet and Infiniband which provides the required high-speed, high-bandwidth and low-latency required for such jobs.
Increasing more and more commercial applications are supporting the use of beowulf parallel/clustering technologies so that their customers can use lower costs x86/x86-64 hardware instead of expensive SMP systems. This is especially true in the engineering market where ISV softwares such as Synopsys, Cadence, Fluent, AMI etc are now Beowulf Cluster enabled.
What about my non-parallelized serial program? Can it run on my Beowulf Cluster? The answer is "It depends."
Many users are using commercial software such as Matlab, to run non-parallel (Beowulf style) large Monte Carlo type simulations on Beowulf Clusters. In this scenario, the application is design to read off a data input file or a random number to generate the required simulations. The jobs generated are submitted to the cluster through the cluster scheduler for running over all the cluster resources.
In Lifescience, BLAST – a popular sequencing program is non-parallelized (there exist a MPI-BLAST however), and researchers are BLASTing thousands of sequences on a Beowulf Cluster. In this case, thousands of sequences are sent to a Beowulf Cluster running BLAST through the cluster scheduler, and the sequence is matched by individual nodes running BLAST with a local database.
In the digital media market, the use of Beowulf Clusters as render-farms are well-documented. A one-hour animated film will have around 6.5 million frames (assuming 30 fps). In this setup, the frames of a scene are sent to the Beowulf Cluster to be rendered, usually one to two frame per node depending on the performance of the node. There is no parallelization involved most of the time.
The above scenarios are known as High Throughput Computing, where non-parallelized jobs are running concurrently at the same time in a Beowulf Cluster.
1.6 Growth of Beowulf Clusters
Several factors are leading to the exponential use of Beowulf Clusters in many organizations:
- Increasing performance of commodity processors from INTEL and AMD
- Affordable high-speed networking such as Gigabit ethernet
- Linux, a stable, robust, open source and low-cost operating system
- Increasing use of GNU and open source softwares which allows for increased collaboration amongst researchers
- Easy availability of beowulf cluster softwares
- Large community of technical users and support groups
Figure 1-4 shows the growth of x86/x86-64 based Linux clusters on Top500 (
www.top500.org), a listing of the top 500 fastest supercomputers in the world.
1.7 History of Building Beowulf Clusters
When the very first Beowulf Clusters were built in the late 1990s, it was mostly by hand, where each node needed to be individually installed and configured. Experienced Linux systems administrators often had to code their own shell scripts to make the installation and configuration of the cluster easier.
When Red Hat released their kickstart mechanism where after installing a node, its configuration could be saved to a floppy and then used to kick start the installation of a similar node, it made building clusters even easier. Cluster administrators however still needed scripts to make specific cluster configuration changes and settings. These scripts often made use of rsh or ssh to propagate the changes.
VA Linux (at that time) also released their system-imager software which allows the administrator to easily clone a compute node and easily create similar cloned compute nodes over a network.
These scripts and programs were tunned and improved over the years and became quite powerful. Some of the better knowns ones include FAI, System Imager, Clustermatic, OSCAR, C3 Tools, etc.
However, the community still lacked a robust and scalable way to easily create and manage a large cluster. Some of the disadvantage of the above tools were:
- Depends on a highly skilled Linux system administrator (a cluster builder or user often is NOT a Linux guru – he only wants to use the cluster for this research)
- Error prone as configuration changes were difficult to track
- Version skew issues like different glibc and even kernel versions as the cluster grows and new machines gets added into the cluster or software gets installed and patched, leading to inconsistent cluster image
- Difficult to integrate heterogeneous hardware as methods such as system imaging requires a golden client, and multiple golden clients are hard to keep track and keep updated.
Next generation cluster tools providing out of the box simplified configuration, reasonable defaults, complete HPC stack includes the popular San Diego Supercomputer Center (SDSC) Rocks (TM) and Kusu.
Key ideas and features of these second generation cluster tools are:
- Pseudo single image cluster system management
- Tightly integrated cluster management and 3rd party application stack
- Automated cluster installation with full OS on each node to avoid version skew
- Highly scalable and consistent methodology to build-out a cluster correctly every time
- Support for heterogenous hardware due to the use of underlying OS installation mechanism rather than golden-images
Third generation tools are being developed now which will bring about further enhancements such as:
- Integrated robust parallel cluster file system
- Userland checkpoint and restart mechanism
- Integration and use of virtual machines for certain HPC jobs
- True single image system