Research Topics in HPC
Posted on May 21st, 2008 at 09:46 PM
The High Performance Computing (HPC) area is undergoing a number of changes that may affect how HPC applications are developed and operated over the next few years. We in the research group at Platform Computing are tracking several technology trends and trying to get a handle on how they will affect the HPC community. To kick things off, here is an initial list of possible areas:
1. Cloud Computing. The advent of services like Amazon EC2 has made it possible to access compute capacity on demand without having to set up physical infrastructure. While Cloud Computing represents more of a change in business models, there are technology implications for enterprises that seek to build out their own internal clouds as well as leverage external clouds. Clouds present several challenges for resource allocation and management systems, which must take into account price, network latency, and data transfer costs. The ability to provision nodes dynamically without the IT overhead can also affect how resources are allocated for services whose SLAs are outside the control of a single organization.
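As a rough illustration of the allocation problem in this item, the sketch below ranks candidate resource pools by a single score that combines price, network latency, and data transfer cost. The `Pool` fields, the `latency_weight` knob, and every figure are hypothetical and chosen only for illustration; a real scheduler would use measured values and its own policy.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A candidate place to run a job (all values are illustrative assumptions)."""
    name: str
    price_per_node_hour: float   # currency units per node-hour
    latency_ms: float            # round-trip latency to the job's data
    transfer_cost_per_gb: float  # cost to move input data into the pool


def job_cost(pool, nodes, hours, data_gb, latency_weight=0.01):
    """Combine compute price, data transfer cost, and a latency penalty.

    latency_weight converts milliseconds into cost units; it is an
    assumed tuning knob, not a measured quantity.
    """
    compute = pool.price_per_node_hour * nodes * hours
    transfer = pool.transfer_cost_per_gb * data_gb
    penalty = latency_weight * pool.latency_ms
    return compute + transfer + penalty


def cheapest_pool(pools, nodes, hours, data_gb):
    """Return the pool with the lowest combined cost for this job."""
    return min(pools, key=lambda p: job_cost(p, nodes, hours, data_gb))


pools = [
    Pool("internal", price_per_node_hour=0.25, latency_ms=1.0, transfer_cost_per_gb=0.0),
    Pool("external", price_per_node_hour=0.10, latency_ms=80.0, transfer_cost_per_gb=0.10),
]

# Same job shape, different input sizes.
small_data = cheapest_pool(pools, nodes=10, hours=2.0, data_gb=5.0)
large_data = cheapest_pool(pools, nodes=10, hours=2.0, data_gb=500.0)
print(small_data.name, large_data.name)
```

With these made-up numbers the external pool wins for the small input and the internal pool wins once the input grows to 500 GB, because the transfer cost then outweighs the cheaper node-hours.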
2. Multi-Core, GPUs, FPGAs. With chip clock speeds no longer rising, vendors have added large amounts of parallel computing capacity within a single box in the form of multiple processing cores. GPUs offer deep parallel processing pipelines that effectively create a vector processor on a card. While there is a lot of potential hardware horsepower for HPC calculations, exploiting this low-level parallelism effectively requires vendor-specific low-level tools. How the models for distributing workloads across boxes and for optimizing within one box can converge is an interesting area for research.
3. Parallel Programming Languages and Models. Ever heard of Haskell, Erlang, or Hadoop? These are not widely used in traditional HPC today, where C/C++ and Fortran are dominant. But the programming models behind them may offer alternative approaches to expressing parallelism and distributed processing. Is there any scope to apply these models to HPC applications, or is it a better approach to use language extensions and libraries like OpenMP or MPI?
4. Data-Centric Computing. High-performance computing is not always just about compute-intensive number crunching. Frequently, large amounts of data need to be processed as part of the computational workflow, and fast access to that data improves application performance. There is a range of data types and requirements that have led to the creation of different techniques, such as parallel filesystems (PVFS, Lustre), in-memory data grids (Oracle Coherence and Gemstone, as examples), basic data caches (memcached), and other types of data partitioning and distribution methods optimized for wide-area networks. What are the implications for how we write applications, and for the run-time scheduling and resource management infrastructure, when data rather than compute cycles is the primary resource?
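One way to picture the scheduling side of this question is a placement policy that treats data location as the first constraint and free compute slots as the second. The sketch below is a minimal greedy version under assumed inputs; the function name `place_tasks`, the dictionary layouts, and the node and block names are all hypothetical and do not describe any particular scheduler.

```python
def place_tasks(tasks, replicas, slots):
    """Greedy data-aware placement (illustrative sketch).

    tasks:    dict of task name -> name of the data block it reads
    replicas: dict of data block -> set of nodes holding a copy
    slots:    dict of node -> number of free task slots
    Returns a dict of task -> (node, is_local).
    """
    free = dict(slots)  # copy so the caller's dict is left untouched
    placement = {}
    for task, block in tasks.items():
        # Prefer a node that already holds the data and has a free slot.
        local = [n for n in sorted(replicas.get(block, ())) if free.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # Otherwise fall back to the node with the most free slots;
            # the data would then have to be moved to that node.
            candidates = sorted(n for n in free if free[n] > 0)
            if not candidates:
                raise RuntimeError("no free slots left")
            node, is_local = max(candidates, key=lambda n: free[n]), False
        free[node] -= 1
        placement[task] = (node, is_local)
    return placement


# Hypothetical cluster: two blocks, three nodes.
tasks = {"t1": "blockA", "t2": "blockA", "t3": "blockB"}
replicas = {"blockA": {"node1"}, "blockB": {"node2"}}
slots = {"node1": 1, "node2": 1, "node3": 2}

placement = place_tasks(tasks, replicas, slots)
print(placement)
```

Here `t1` and `t3` run where their data already lives, while `t2` spills to `node3` because the only node holding `blockA` is full. A compute-first scheduler would make the opposite trade-off, which is the kind of policy question the item raises.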
5. Virtualization in HPC. Server virtualization technologies are becoming part of the mainstream in enterprise data centers. However, the use of these tools in HPC has been limited due to performance overheads as well as the costs associated with them. The open-source Xen hypervisor may change things by giving HPC environments the flexibility needed to implement rapid, workload-centric provisioning of machines: starting them up, shutting them down, or migrating them in response to the needs of various applications. With the OS decoupled from the hardware, a greater level of customization of the OS stack for individual job types becomes possible. For example, what are the possibilities for creating job-specific or application-specific appliances that have just enough OS to suit the needs of that application, and that are perhaps optimized for it?
6. Complex Event Processing (CEP). Traditionally, HPC environments have focused on batch processing of large, complex jobs that take minutes, hours, or longer to run. CEP technologies deal with low-latency processing of large event streams associated with, for example, algorithmic trading in stock markets. These environments can arguably be considered “Edge HPC” as opposed to traditional HPC. However, they share some of the same stringent requirements for high performance, although expressed in different terms, such as latency and throughput in events per second. Some basic questions need to be answered. Should CEP be considered part of a broader definition of HPC? Can the technologies being developed in these areas be leveraged in wider HPC? Or can the techniques developed in traditional HPC be applied in CEP environments?
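To make the contrast with batch jobs concrete, a typical CEP building block is a sliding time window that is updated as each event arrives, rather than a job that runs over a complete data set. The sketch below is a minimal, single-threaded version that assumes events arrive in timestamp order; the `SlidingWindow` class and the trade ticks are hypothetical and not taken from any CEP product.

```python
from collections import deque


class SlidingWindow:
    """Keep events from the last `span` seconds and report simple aggregates."""

    def __init__(self, span):
        self.span = span
        self.events = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        """Add an event and evict anything that has fallen out of the window."""
        self.events.append((timestamp, value))
        while self.events and self.events[0][0] <= timestamp - self.span:
            self.events.popleft()

    def rate(self):
        """Events per second over the window span."""
        return len(self.events) / self.span

    def mean(self):
        """Average value of the events currently in the window."""
        return sum(v for _, v in self.events) / len(self.events)


window = SlidingWindow(span=1.0)
# Hypothetical trade ticks: (time in seconds, price).
ticks = [(0.0, 100.0), (0.2, 101.0), (0.4, 102.0), (1.3, 104.0), (1.5, 106.0)]
for t, price in ticks:
    window.add(t, price)

print(window.rate(), window.mean())
```

Each `add` call does a small, bounded amount of work, which is why these systems are measured in per-event latency and events per second rather than in job run time.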
This isn’t meant to be an exhaustive list or even a full explanation of all the research questions to be explored in these areas. We will expand further on some of our thinking in future blog posts. We’d like to use this opportunity to stimulate further discussion and get feedback from users and other researchers. Let’s use this forum to discuss possibilities for collaboration and to get a better handle on the implications of these trends in HPC.




