1.1 History
The first Beowulf Cluster was built in 1994 by NASA HPCC Program Earth and Space Sciences Project at the Goddard Space Flight Center. The team was lead by Dr Thomas Sterling and Donald Becker, and they assembled together a 16-processor system with 66Mhz Intel 486 CPUs running Linux over an 10Mbps Ethernet network. This was quickly followed by 100Mhz DX4 systems and Pentium class machines. By end of 1996, the target of 1Gflops sustained performance was achieved and won the team the Gordon Bell Prize in 1997 for price/performance.
The incredible price/performance achieve with Beowulf Clusters and Linux quickly caught the eye of many researchers and scientist and soon became a necessary tool for high performance computing (HPC) simulations. On the Top 500 list of worlds fastest supercomputers (
www.top500.org), Beowulf Clusters is growing exponentially and today occupies more than 70% of the list.
Today, Beowulf Clusters are reaching into the enterprise and use extensively by firms in various industries such as : oil and gas, digital media, life-sciences, and automotive, aerospace, engineering and finance.
1.2 Parallel Computing Architectures
Parallel computer architectures are typically classified based on the following :
• Central or distributed memory
• Shared or individual address space
Table 1-1: Categorization of parallel computing architecture
Beowulf Clusters falls into the category of Massive Parallel Processors (MPP) class of parallel computing. Systems in this category consists of not only Beowulf Clusters but also clusters built from IBM pServers (formerly RS/6000) and SUN smp-clusters. Typically they are:
• Systems of nodes connected by a network
• Each node has its own operating system, memory, cache and hard-disks
• Local memory and cache are not coherent across the system
Shared Address Space supercomputers refers to Symmetric MultiProcessor (SMP) systems (eg. SUN E15K) and Non-Uniform Memory Access (NUMA) systems (eg. SGI Origin, SGI Altix). Typically they are:
• Single operating system running across all the nodes (or system boards)
• The system of nodes that share common resources such as the operating system, memory, cache and hard-disks
• Local memory cache is coherent

Figure 1-1: Beowulf Clusters vs SMP
These SMP and NUMA supercomputers are typically an order more expensive than Beowulf Clusters, and serve a specific niche area of supercomputing where a Beowulf Cluster may be unsuitable. This is discussed in the next section.