
Bearcat

Exploring HPC Programming: Where to start

by Bearcat on August 11th, 2008 at 12:40 PM
One of the topics I want to cover here is HPC programming. That covers a lot of ground, so I want to look at threading, toolkits such as OpenMP, graphics processor (GPU) toolkits, and cluster kits such as MPI, as well as others that crop up from time to time. Learning how to use these toolkits can range from simple to complex, and getting the most out of each one is an exercise for the reader. I will cover some basics about using each one, and run some performance tests to compare the different toolkits. This will more or less be an introduction, but I'm hoping it will fuel some ideas in whoever reads this, and that you will be able to apply some of the simple concepts to your own programs.

Where to start is always the hard question. I'm going to start with a compute-intensive program that does option pricing. That means lots of floating point calculations and lots of experiments to see what the best price might be over the years, and these calculations need to be run over many option positions. There are differences between this program and real option pricing programs. This program processes a number of experiments, and the experiments are set up using random number generation, with some approximation calculations. Real option pricing programs, I believe, will use a stochastic distribution calculation to seed the experiments, so my program is not going to be accurate. That means you should not use this program to try to price any options (that's the warning). This program is for testing and performance testing only.

The program performs 1 million experiments (option price calculations) per option, and repeats this 1024 times to simulate a portfolio of that many options that need to be priced. As I am going to explore parallelism in future posts, I decided to arrange the data into arrays. Thinking about parallelism has everything to do with the data and how you handle it. I hope the arrays I've used will be adaptable to all the toolkits that will be explored, but time will tell. The program also prints only minimal data at the end. I only print the last experiment, just to show that the calculations are actually done. A real pricing program would probably spit out all calculations for all experiments, or analyze and print out the optimal prices and maturity, and it would do this for all option positions in the portfolio. But I'm more interested in execution time.

The baseline program was run on an Intel Core 2 Quad (Q6600) processor running at 2.4 GHz. The machine has 4 GB of RAM, a 250 GB hard drive, and a graphics processor. The operating system is CentOS 5.2 with all the updates applied. Not a bad little machine, lots of horsepower, you say! But as the baseline program is only a single thread of execution, it takes a long time to run on this machine, as the performance run will show. One core of the quad processor is pegged at 100%, while the other 3 sit idle, doing a little housekeeping here and there. The program gets the job done, but not very efficiently.

So let's begin. Attached is a zip file containing the program and the Makefile.

Attachment 7


The test run was timed with the Linux "time" command. I'm more interested in the overall program execution time, as opposed to some benchmarks I've seen that only time the inner calculations. I mean, when you're waiting for results, it's the whole program that counts. Here is the timed execution on the Core 2 Quad 2.4 GHz processor.

[leo@compute70 Baseline]$ time ./optionprice
=== Option Portfolio Calculations (Basline Test) ==========
Portfolio size : 1024
Experiments run per item : 1000000
Average Call Price : 36.868640
Average Put Price : 2.147528

real 18m49.352s
user 18m49.075s
sys 0m0.090s
[leo@compute70 Baseline]$


Here is a picture of "top" running, to show that the program only exercises one core of the CPU.

[Image: baseline.jpg]


Well, there you go: almost 19 minutes to perform all the calculations. You might say that's not so bad, I can go have a coffee while that's going on. But suppose you are the person responsible for managing this portfolio, and you have to make pricing decisions quickly. I doubt that saying "come back in 20, while I figure this out" is acceptable. And this is only a small portfolio, with minimal experiments to determine optimal pricing.

Over the next few posts, I'll look at how to improve this execution time. Sure, I can optimize the program and use compiler optimizations to hopefully speed it up, but I want bigger speedups than that.

Talk to you soon.


Leo Stutzmann

Updated August 11th, 2008 at 04:18 PM by Bearcat

Categories
General

Comments

  1. mehdi -
    Hi Leo,

    Can't get the attachments (Invalid Attachment specified).

    Mehdi
  2. Bearcat -
    Updated the attachments again. Hopefully they will stick this time.

    Leo
