<?xml version="1.0" encoding="ISO-8859-1"?>

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
	<channel>
		<title>HPC Community - High Performance Computing (HPC) Community - Blogs</title>
		<link>http://www.hpccommunity.org/blogs/</link>
		<description><![CDATA[HPCCommunity.org is a technical discussion HPC community portal for the High Performance Computing (HPC) community. The community includes Platform Computing R&D team members, architects and developers, external collaborators and a growing community of users and developers in the HPC world.]]></description>
		<language>en</language>
		<lastBuildDate>Thu, 02 Sep 2010 20:55:20 GMT</lastBuildDate>
		<generator>vBulletin</generator>
		<ttl>60</ttl>
		<image>
			<url>http://www.hpccommunity.org/images/misc/rss.jpg</url>
			<title>HPC Community - High Performance Computing (HPC) Community - Blogs</title>
			<link>http://www.hpccommunity.org/blogs/</link>
		</image>
		<item>
			<title>What is Platform Cluster Manager?</title>
			<link>http://www.hpccommunity.org/blogs/beowulf/what-platform-cluster-manager-104/</link>
			<pubDate>Wed, 06 May 2009 04:57:37 GMT</pubDate>
			<description>So Platform announced Platform Cluster Manager (PCM) this week. 
 
It is basically OCS rebranded. Why? 
 
1. better name to tell people what the...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">So Platform announced Platform Cluster Manager (PCM) this week.<br />
<br />
It is basically OCS rebranded. Why?<br />
<br />
1. better name to tell people what the software is for (PCM is a lot clearer than OCS)<br />
<br />
2. Align with rest of Platform's naming convention<br />
<br />
3. More catchy :-)<br />
<br />
We are now planning new features and capabilities for Kusu2 and PCM2, and I believe they will give cluster admins a much more usable tool to address the whole lifecycle of the cluster.<br />
<br />
So while PCM currently has no significant new features, that doesn't mean we are done... there are a lot of ideas for Kusu2/PCM2.<br />
<br />
For current users of Kusu, OCS and PCM - do give us your feedback on what you would like to see in Kusu 2!<br />
<br />
Cheers!</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/what-platform-cluster-manager-104/</guid>
		</item>
		<item>
			<title>Busy week! - Launch of Cloud Innovation Centre!</title>
			<link>http://www.hpccommunity.org/blogs/beowulf/busy-week-launch-cloud-innovation-centre-101/</link>
			<pubDate>Tue, 14 Apr 2009 04:19:20 GMT</pubDate>
			<description>So we launched the Cloud Innovation Center in Singapore during GridAsia 2009... see press release. 
 
The CIC will make use of Project Kusu/OCS as...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">So we launched the Cloud Innovation Center in Singapore during <a href="http://gridasia.ngp.org.sg/2009/main.php" target="_self">GridAsia 2009</a>... see <a href="http://www.platform.com/press-releases/2009/platform-computing-to-open-cloud-computing-innovation-centre-in-singapore" target="_self">press release</a>.<br />
<br />
The CIC will make use of Project Kusu/OCS as one of the fundamental building blocks...<br />
<br />
We see Project Kusu evolving over time to address more than HPC... as a provisioning tool, I see it becoming useful to people building clouds.</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/busy-week-launch-cloud-innovation-centre-101/</guid>
		</item>
		<item>
			<title>So what was the hottest topic at SC08?</title>
			<link>http://www.hpccommunity.org/blogs/beowulf/so-what-hottest-topic-sc08-100/</link>
			<pubDate>Tue, 06 Jan 2009 03:39:40 GMT</pubDate>
			<description><![CDATA[For those lucky enough to be at SC08 who picked up Platform's *_bjob_* T-shirt - good for you. 
 
Those not there and missed out on the buzz at...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">For those lucky enough to be at SC08 who picked up Platform's <b><u>bjob</u></b> T-shirt - good for you.<br />
<br />
Those who were not there and missed out on the buzz at SC08... check out the awesome T-shirt.<br />
<br />
<img src="http://www.hpccommunity.org/members/beowulf/albums/misc/102-front.jpg" border="0" alt="" /><br />
<br />
<img src="http://www.hpccommunity.org/members/beowulf/albums/misc/103-back.jpg" border="0" alt="" /><br />
<br />
Check back here regularly.. and you may just get yourself one of these T-Shirts... stay tuned...<br />
<br />
Cheers!</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/so-what-hottest-topic-sc08-100/</guid>
		</item>
		<item>
			<title><![CDATA[Innovative Mac 'Cluster' usage]]></title>
			<link>http://www.hpccommunity.org/blogs/beowulf/innovative-mac-cluster-usage-99/</link>
			<pubDate>Tue, 23 Dec 2008 03:20:04 GMT</pubDate>
			<description><![CDATA[Came across this innovative way to use a Mac 'cluster'... 
 
YouTube - Mac vs PC 
 
Happy Holidays!]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Came across this innovative way to use a Mac 'cluster'...<br />
<br />
<a href="http://www.youtube.com/watch?v=uLbJ8YPHwXM" target="_self">YouTube - Mac vs PC</a><br />
<br />
Happy Holidays!</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/innovative-mac-cluster-usage-99/</guid>
		</item>
		<item>
			<title>Exploring HPC Programming: OpenMP</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-openmp-98/</link>
			<pubDate>Thu, 20 Nov 2008 12:55:54 GMT</pubDate>
			<description><![CDATA[Today, we'll have a quick look at OpenMP. OpenMP is a set of programming APIs, and compiler pragmas that support multi-platform, shared memory...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Today, we'll have a quick look at OpenMP. OpenMP is a set of programming APIs and compiler pragmas that support multi-platform, shared-memory multiprocessing programming in C/C++ and Fortran. The interesting thing about OpenMP is that it offers a simple way to split loops (for, do) into tasks for multi-threading. Our program has a number of &quot;for&quot; loops in it, so that is the obvious avenue to explore.<br />
<br />
Anyone can get a quick overview of OpenMP from Wikipedia here: <a href="http://en.wikipedia.org/wiki/OpenMP" target="_self">OpenMP - Wikipedia, the free encyclopedia</a> and the page contains a number of other references for additional information. There is also the OpenMP site: <a href="http://openmp.org/wp/" target="_self">OpenMP.org</a> for information. Read up, it's an interesting topic.<br />
<br />
First off, I'll profess that I'm not an OpenMP expert. I've learned enough by reading, trying, and experimenting that I can use OpenMP in its basic form. To use OpenMP, you need a compiler that supports it. GCC 4.2 and up support OpenMP. Red Hat has also backported OpenMP into the compiler supplied with Red Hat 5.2, which is the same compiler found in CentOS 5.2. To check, confirm that you have the omp.h include file and the libgomp library.<br />
<br />
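Having the header and library installed does not by itself mean the compiler is building with OpenMP switched on. One way to check is the standard _OPENMP macro, which an OpenMP-enabled compiler defines. The snippet below is only a sketch and not part of the original program: the function name &quot;openmp_version&quot; is made up for illustration, and with GCC it assumes the -fopenmp flag is used.<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:144px;">/* Returns the value of the _OPENMP macro (a yyyymm date) when the
   compiler has OpenMP enabled, e.g. gcc -fopenmp, and 0 otherwise.
   openmp_version is a hypothetical helper name, not an OpenMP call. */
int openmp_version(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}</pre>
</div> If this returns 0, any OpenMP pragmas in the program are being ignored and it runs single threaded. A compiler implementing OpenMP 2.5, which is the level GCC 4.2 supports, should report 200505.<br />
<br />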
The really nice thing about OpenMP is that it is much less intrusive on your program than converting the program to use threads. <br />
<br />
So let's get going.<br />
<br />
In previous articles, I changed a baseline program to multi-threaded, giving a couple of options for attacking the problem. For the OpenMP example, I don't need the multi-threaded example, so I went back to the baseline example as a starting point. The multi-threaded examples executed in just under 5 minutes, so that will be the target. I don't expect to reach it, but getting close would be nice, and would convince me that OpenMP is a viable way of doing things.<br />
<br />
The first thing I did to the baseline example was replace the rand function (because it's not thread-safe) with the distribution function I created for the multi-threaded example. Now we have a new starting point for OpenMP. <br />
<br />
The first step is to add the include file (who would have guessed) at the beginning of the source file:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:36px;">#include &lt;omp.h&gt;</pre>
</div> That's really all you need to do for preparation. I wanted to ensure that OpenMP would start 4 threads on my quad-core machine, so I added the following line in the &quot;main&quot; function of the program.<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:36px;">omp_set_num_threads(4);</pre>
</div> That's it. Now to play with the pragmas.<br />
<br />
We have a &quot;for&quot; loop in the &quot;blackscholes&quot; function, so we'll try that first. After all, it loops 1,000,000 times for each portfolio item, of which there are 1024. This is the innermost loop, which just performs the calculations. Parallelizing a &quot;for&quot; loop in OpenMP is quite simple: just place a pragma right before the &quot;for&quot; statement.<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:48px;">   #pragma omp parallel for private(index)
   for (index = 0; index &lt; experiments; index++)</pre>
</div> I only put &quot;index&quot; in the private clause, because I declared &quot;index&quot; outside the &quot;for&quot; loop. Strictly speaking this isn't needed: OpenMP automatically makes the iteration variable of a parallel &quot;for&quot; loop private, so the clause is redundant here but harmless. The private clause tells OpenMP which variables each thread should get its own copy of, so that the threads created for the &quot;for&quot; loop don't share, and overwrite, a single variable.<br />
<br />
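To make the scoping rules concrete, here is a small sketch that is separate from the option pricing program. The function name &quot;sum_of_squares&quot; and its variables are made up for illustration. It marks a scratch variable as private, and uses OpenMP's reduction clause so that each thread accumulates into its own copy of &quot;total&quot; and the copies are added together when the loop finishes.<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:204px;">/* Sum of the squares of 0..n-1, computed with an OpenMP parallel loop.
   Hypothetical example function, not taken from the pricing program. */
long sum_of_squares(int n)
{
    int index;      /* declared outside the loop, as in the article */
    long square;    /* scratch value: each thread needs its own copy */
    long total = 0; /* per-thread copies are combined by the reduction */

    #pragma omp parallel for private(index, square) reduction(+:total)
    for (index = 0; index &lt; n; index++) {
        square = (long)index * index;
        total += square;
    }
    return total;
}</pre>
</div> Without the reduction clause, every thread would update the shared &quot;total&quot; at the same time and the result would be unpredictable. Built with or without OpenMP enabled, sum_of_squares(10) should return 285.<br />
<br />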
Everyone should read about variable scope in OpenMP, as it makes a big difference. I've also found that trying different combinations helps understanding, and it's really simple: change the pragma line, compile, test, rinse, repeat. Easy, huh?<br />
<br />
So with these changes, plus changes to the Makefile to compile with OpenMP and link the correct libraries, how does this fare?<br />
<br />
Here are the results of the run:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_quote">
		<div class="quote_container">
			<div class="bbcode_quote_container"></div>
			
				[leo@bearcat1 OpenMP]$ time ./openmpprice<br />
=== Option Portfolio Calculations (OpenMP Test) ==========<br />
Portfolio size                    : 1024<br />
Experiments run per item   : 1000000<br />
Average Call Price             : 36.560187<br />
Average Put  Price             : 1.669589<br />
<br />
real    6m2.281s<br />
user    23m49.077s<br />
sys    0m0.302s
			
		</div>
	</div>
</div> Not bad: 6 minutes and a couple of seconds, compared to the multi-threaded program at just under 5 minutes. Certainly a lot better than the original 19 minutes of the single-threaded program. I'd say a GREAT result for a small effort.<br />
<br />
The other big loop is the &quot;for&quot; loop in the &quot;portfolio&quot; function, which runs the calculations once per portfolio item. To test this loop, I removed the pragma from the previous test run, and put a pragma on the &quot;portfolio&quot; loop, like this:<br />
<br />
 <div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:48px;">  #pragma omp parallel for private(i,j,pnum,snum,ynum)
   for (i = 0; i &lt; num_options; i++)</pre>
</div> This is the outermost loop, which also includes the initialization and the &quot;blackscholes&quot; function call to perform the calculations. Here's the result:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_quote">
		<div class="quote_container">
			<div class="bbcode_quote_container"></div>
			
				[leo@bearcat1 OpenMP]$ time ./openmpprice<br />
=== Option Portfolio Calculations (OpenMP Test) ==========<br />
Portfolio size                    : 1024<br />
Experiments run per item   : 1000000<br />
Average Call Price             : 36.558490<br />
Average Put  Price             : 1.669706<br />
<br />
real    5m25.763s<br />
user    21m14.538s<br />
sys    0m0.239s
			
		</div>
	</div>
</div> Better, because I included more of the processing under OpenMP control. So 5 minutes and 25 seconds is not bad, and very close to the performance I got from hand coding my own threads.<br />
<br />
I'm impressed! Are you? A small effort, not intrusive, BIG gains.<br />
<br />
Two additional lines and a pragma placed in the right spot, and OpenMP does a bang-up job of multi-processing my program. I should note that during these two runs, my processors were pegged at 100% the whole time, so the conversion of the program seems very efficient.<br />
<br />
Here's the program:<br />
<br />
<a href="http://www.hpccommunity.org/attachments/f21/190d1227185819-kusu-building-base-kit-openmp.zip" >OpenMP.zip</a><br />
<br />
OpenMP allows multiple pragmas, so if you have a program that has separate sections of calculations, you can put a pragma on each section to help speed it up. If you have older programs with a structure similar to this one, built around &quot;for&quot; loops, then OpenMP will certainly speed things up. As you learn more about OpenMP, I'm sure you'll find other uses for it to help speed up other sections of your program.<br />
<br />
Have Fun,<br />
<br />
Leo Stutzmann</blockquote>

 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-openmp-98/</guid>
		</item>
		<item>
			<title>See you at SC08 next week!</title>
			<link>http://www.hpccommunity.org/blogs/csmith/see-you-sc08-next-week-97/</link>
			<pubDate>Thu, 13 Nov 2008 18:46:17 GMT</pubDate>
			<description><![CDATA[If you've been around the HPC space for a while, you'll be feeling the same sense of anticipation that I am for the Supercomputing conference (SC08)...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">If you've been around the HPC space for a while, you'll be feeling the same sense of anticipation that I am for the Supercomputing conference (SC08) happening next week in Austin. Platform will be there (booth 1627), and I'll be in the booth from 12:30pm to 3:00pm every day, so come on by to say hello!<br />
<br />
I also want to let people know about the <a href="http://ogf.org/HPCBasicProfile" target="_self">HPC Profile</a> <a href="http://scyourway.nacse.org/conference/view/bof177" target="_self">BoF</a> that's happening on Wed from 5:30pm to 7:00pm, and invite all to come and see the exciting things happening in the world of HPC standards. <br />
<br />
See you next week!</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/see-you-sc08-next-week-97/</guid>
		</item>
		<item>
			<title><![CDATA[Let's hope for Cloudy days ahead]]></title>
			<link>http://www.hpccommunity.org/blogs/csmith/lets-hope-cloudy-days-ahead-96/</link>
			<pubDate>Mon, 06 Oct 2008 17:57:05 GMT</pubDate>
			<description><![CDATA[Last week, I attended the "Cloud Computing and Beyond: The Web Grows Up (finally)" conference hosted by SDForum in Santa Clara. The purpose of the...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Last week, I attended the <a href="http://www.sdforum.org/index.cfm?fuseaction=Calendar.eventDetail&amp;eventID=13223&amp;pageId=471" target="_self">&quot;Cloud Computing and Beyond: The Web Grows Up (finally)&quot;</a> conference hosted by <a href="http://sdforum.org" target="_self">SDForum</a> in Santa Clara. The purpose of the conference seemed to be to provide a snapshot of where the software market currently sits with respect to cloud computing, as well as to provide some vision of where things are going. There was a lot of great content, mostly in the format of panels, which I enjoy quite a bit compared to just presentations. <br />
<br />
There were a couple of themes that jumped out at me during the day. First was the sentiment that cloud computing doesn't magically make your applications &quot;run better&quot;. It takes a good deal of forethought to properly design an application to make use of the Cloud. In fact, the type of thinking that is needed to make an application run well in the cloud can equally serve an application running on your own infrastructure, since much of what makes an application cloud-ready has to do with planning for scalability and reliability (building on cloud infrastructure just makes this more explicit). As somebody who has been helping people &quot;grid-enable&quot; applications for a long while now, this is not new to me, but I was happy to hear that nobody really believes in a cloud computing &quot;free lunch&quot;. <br />
<br />
The second interesting point that I took away from the conference was a provocative point made by one of the panelists in the &quot;Crawl/Walk/Run&quot; panel. To the question of &quot;what makes one service more cloudy than another?&quot;, Jason Hoffman from <a href="http://joyent.com/" target="_self">Joyent</a> made the contentious claim that nobody's service (including their own) was &quot;cloudy&quot; at all! He claimed that the fundamental property of &quot;cloudiness&quot; emerged when services in the cloud were truly transparently accessed by the end-user. He used as an example Amazon's S3 service. Right now, S3 isn't cloudy, because I knowingly access it remotely over the internet. In order to make a service cloudy, he imagined the following scenario:<br />
<ul><li>A company wants to provide S3 service to their users, but wants to carve out their own &quot;s3.mycompany.com&quot; domain for the services. They arrange this with Amazon so that it looks, to company users, like the service is provided by the company itself.</li>
<li>In order to enhance the performance of the system, and to exercise a little more control over their own destiny in the face of changes to Amazon's service, they deploy their own S3 internally (imagine Amazon providing an &quot;S3 appliance&quot; that they plug into their own data centre). This internal S3 is still provided to users as s3.mycompany.com.</li>
<li>Now, in order to realize some of the benefits of Amazon's hosted S3 service, this internal S3 and Amazon S3 can communicate, such that an outage in the internal service would be picked up by the remote service, etc, etc. To the end user, it all looks the same. s3.mycompany.com is where they get their storage service, regardless of whether the requests go to an internal box, or the remote S3 service.</li>
</ul><br />
Once you attain this level of transparency and seamless access, you have a service that is truly &quot;cloudy&quot;. End users consume a service, with the same ease-of-use and cost model as S3 provides, but with a mix and match infrastructure approach that can meet the needs of small, medium and large sized organizations. <br />
<br />
To achieve this transparency you need some level of interoperability (whether through common software stacks or standards remains to be seen), but once achieved, the true vision of &quot;cloud computing&quot; (elastic resources provided and charged by consumption) can be realized. <br />
<br />
So here's to cloudy days ahead!</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/lets-hope-cloudy-days-ahead-96/</guid>
		</item>
		<item>
			<title>OGF24: Use cases drive standards!</title>
			<link>http://www.hpccommunity.org/blogs/csmith/ogf24-use-cases-drive-standards-95/</link>
			<pubDate>Mon, 06 Oct 2008 17:32:20 GMT</pubDate>
			<description><![CDATA[The Open Grid Forum recently held its 24th meeting at the Biopolis in Singapore, co-located with GridAsia2008. OGF meetings are a great way to learn...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">The <a href="http://www.ogf.org" target="_self">Open Grid Forum</a> recently held its 24th meeting at the Biopolis in Singapore, co-located with <a href="http://gridasia.ngp.org.sg/2008/" target="_self">GridAsia2008</a>. OGF meetings are a great way to learn about Grid activities happening worldwide, and OGF24 was no exception. There was definitely cloud on the mind, as evidenced by a keynote from Peter Coffee of <a href="http://salesforce.com" target="_self">salesforce.com</a> and with some buzz around Singapore as one of the data centre locations for the HP/Intel/Yahoo cloud initiative. There was also a good, solid program around enterprise and e-research activities in Grid, with both a local and international focus.<br />
<br />
For my part, as VP of the standards function at OGF, I was happy to see continued efforts around converging standards activities in the compute space (JSDL/BES/GLUE/UR), as well as some emerging activity on providing more metering and control at the network layer in the &quot;grid stack&quot;. I ran one session at OGF24 that was intended to highlight these activities. The (perhaps mis-named) <a href="http://ogf.org/gf/event_schedule/index.php?id=1415" target="_self">Introduction to OGF Standards</a> workshop was intended to present the standards work of OGF, not from a working groups and specifications point of view, but from the point of view of those who are building grid systems, and thus have a use case view of their needs. The intention of the workshop was to help answer the question of how the alphabet soup of specifications comes together to solve real-world problems. <br />
<br />
There were four main presentations:<ol class="decimal"><li>Federated Data Access</li>
<li>ISV Integration with Remote Computing</li>
<li>Job Submission and Management Using Meta-Schedulers</li>
<li>Network Monitoring and Usage</li>
</ol><br />
(The presentations are available from the session summary link above.)<br />
<br />
As someone involved in defining specifications at OGF, I must admit that we sometimes lose sight of the big picture in our working group sessions, so it is very useful to step back once in a while and see a landscape of how all these things fit together. We need to remember that use cases drive the standards activities, and not the other way around!</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/ogf24-use-cases-drive-standards-95/</guid>
		</item>
		<item>
			<title>Red Hat chases Redmond with HPC play</title>
			<link>http://www.hpccommunity.org/blogs/beowulf/red-hat-chases-redmond-hpc-play-94/</link>
			<pubDate>Fri, 03 Oct 2008 02:51:02 GMT</pubDate>
			<description>Another story on Red Hat HPC and Kusu :) 
 
Red Hat chases Redmond with HPC play &#8226; The Register 
 
 
But to say that Red Hat chases Redmond is not...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Another story on Red Hat HPC and Kusu :)<br />
<br />
<a href="http://www.theregister.co.uk/2008/10/02/redhat_hpc_stack/" target="_self">Red Hat chases Redmond with HPC play &#8226; The Register</a><br />
<br />
<br />
But to say that Red Hat chases Redmond is not exactly right. It is more of Redmond chasing after the Linux HPC market.<br />
<br />
I have seen the new Windows HPC Server 2008 and I must say it is very nicely done with good integration of management and monitoring tools. But what they have delivered in Windows HPC Server 2008 is what the Linux HPC market has had since 2006; GUI-based heat maps? alerts? monitoring? - been there, done that!<br />
<br />
Today, both the Windows and Linux crowds have the same set of tools and utilities to build and manage a cluster. <br />
<br />
The <b>real chase</b> starts now in making clusters easier for the end-users (not the system integrators, hardware suppliers or cluster administrators).<br />
<br />
This is where I feel the Linux crowd needs to pull together and work closely as a team. Microsoft's Visual Studio (VS) is a formidable opponent and with tight integration of parallel tools into VS, it will give rise to a new generation of programmers who would be highly productive in the Windows world.<br />
<br />
Have a good weekend!</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/red-hat-chases-redmond-hpc-play-94/</guid>
		</item>
		<item>
			<title>Red Hat HPC Cometh</title>
			<link>http://www.hpccommunity.org/blogs/beowulf/red-hat-hpc-cometh-93/</link>
			<pubDate>Thu, 02 Oct 2008 08:31:10 GMT</pubDate>
			<description>A nice little article from a friend in TheInquirer. 
 
Red Hat HPC Linux cometh - The INQUIRER 
 
Cheers!</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">A nice little article from a friend in TheInquirer.<br />
<br />
<a href="http://www.theinquirer.net/gb/inquirer/news/2008/09/26/redhat-hpc-linux-cometh" target="_self">Red Hat HPC Linux cometh - The INQUIRER</a><br />
<br />
Cheers!</blockquote>

 ]]></content:encoded>
			<dc:creator>beowulf</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/beowulf/red-hat-hpc-cometh-93/</guid>
		</item>
		<item>
			<title>How to have SLES run your own program on installation</title>
			<link>http://www.hpccommunity.org/blogs/george-goh/how-have-sles-run-your-own-program-installation-92/</link>
			<pubDate>Wed, 01 Oct 2008 16:13:00 GMT</pubDate>
			<description>1. Take the root image from a pristine SLES 10.2 repository, and copy 
the contents of the root image to a working location. 
 
*mkdir -p...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">1. Take the root image from a pristine SLES 10.2 repository, and copy<br />
the contents of the root image to a working location.<br />
<br />
<font face="Courier New"><b>mkdir -p $WORKING_DIR/mnt $WORKING_DIR/rootimg<br />
mount -o loop $REPO_LOCATION/boot/i386/root $WORKING_DIR/mnt<br />
cp -avr $WORKING_DIR/mnt/* $WORKING_DIR/rootimg<br />
umount $WORKING_DIR/mnt</b></font><br />
<br />
2. Replace the original ‘yast’ script with my own script.<br />
<font face="Courier New"><b><br />
mv $WORKING_DIR/rootimg/sbin/yast $WORKING_DIR/rootimg/sbin/yast.real<br />
cp $WORKSPACE/yast.mine $WORKING_DIR/rootimg/sbin/yast</b></font><br />
<br />
3. Once the desired changes are made (including adding the dependencies listed in section A below), pack up the modified directory tree
into a cramfs image:<br />
<font face="Courier New"><b><br />
mkfs.cramfs $WORKING_DIR/rootimg $REPO_LOCATION/boot/i386/root</b></font><br />
<br />
A. For Kusu dependencies, we need the following packages:<br />
<br />
Available on SLES installation disks:<br />
<b>python-2.4.2-18.13.i586.rpm<br />
slang-2.0.5-14.2.i586.rpm</b><br />
<br />
Available from other sources:<br />
<a href="http://mondorescue.linjection.org/ftp/sles/10/newt-0.52.2-1.2.i586.rpm" target="_blank">http://mondorescue.linjection.org/ft...2-1.2.i586.rpm</a><br />
<br />
To install an RPM into the rootimg (python, for example):
<font face="Courier New"><b>pushd $WORKING_DIR/rootimg<br />
rpm2cpio $RPMDIR/python-2.4.2-18.13.i586.rpm | cpio -idv<br />
popd</b></font></blockquote>

 ]]></content:encoded>
			<dc:creator>George Goh</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/george-goh/how-have-sles-run-your-own-program-installation-92/</guid>
		</item>
		<item>
			<title>Grid Reliability and Scalability</title>
			<link>http://www.hpccommunity.org/blogs/zane_hu/grid-reliability-scalability-91/</link>
			<pubDate>Thu, 25 Sep 2008 21:11:21 GMT</pubDate>
			<description>What benefits could people get from grid computing? I would say, 
 
* The first benefit is high performance and scalability that comes from the...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">What benefits could people get from grid computing? I would say,<br />
<ul><li>The first benefit is high performance and scalability that comes from the massive parallelism of many machines working together to finish compute workloads in parallel.</li>
</ul><ul><li>The second benefit is ultimate reliability that comes from its managed clustering technology, in which a cluster of machines is connected and managed together so that the machines provide redundancy for each other. As long as one or more machines are alive in the cluster, the grid should be able to continue serving existing and new requests with its remaining capacity.</li>
</ul><br />
Let’s look at a system that has N components. Such a system can be a cluster of computers, or a parallel application. From a scalability point of view, the math is simple: people want to increase N for a higher degree of parallelism and better performance.<br />
<br />
However, from a reliability point of view, this is not always the case. Depending on how a system is designed, increasing N may decrease the reliability of the system. Let’s say C is the failure probability of a single component, S is the failure probability of the system, and components fail independently of one another. <br />
<ul><li>If a system can keep running only when all of its components are running, then S = 1-(1-C)^N. Increasing N will decrease reliability. Examples of this “all-up” system are traditional MPI parallel applications.</li>
</ul><ul><li>If a system can keep running whenever any one or more of its components are running, then S = C^N. Increasing N will increase reliability. Examples of this “any-up” system are Platform Symphony and its SOA applications.</li>
</ul><br />
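To put numbers on these two formulas, the sketch below evaluates both for a given C and N. It assumes that components fail independently of one another; the function names are made up for illustration and are not part of any Platform API.<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:300px;">/* Failure probability of an &quot;all-up&quot; system: it fails if any one of
   its n components fails. S = 1 - (1 - c)^n, independent failures. */
double all_up_failure(double c, int n)
{
    double all_alive = 1.0; /* probability that every component is up */
    int i;
    for (i = 0; i &lt; n; i++)
        all_alive *= (1.0 - c);
    return 1.0 - all_alive;
}

/* Failure probability of an &quot;any-up&quot; system: it fails only if every
   one of its n components fails. S = c^n, independent failures. */
double any_up_failure(double c, int n)
{
    double all_dead = 1.0; /* probability that every component is down */
    int i;
    for (i = 0; i &lt; n; i++)
        all_dead *= c;
    return all_dead;
}</pre>
</div> For example, with C = 0.01 and N = 10, the all-up system fails with a probability of about 0.096, nearly ten times worse than a single component, while the any-up system fails with a probability of 10^-20.<br />
<br />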
This “any-up” system design philosophy has been applied throughout Platform Symphony and its application programming models for the best of both scalability and reliability.<br />
<br />
Starting with Platform EGO, which is the foundation layer of Platform Symphony, a list of master candidate hosts can be configured to form a “mini cluster” inside an EGO cluster. As long as at least one master candidate host is up, the EGO master daemon will keep running to manage any running hosts in the EGO cluster as available resources for system components and applications on top of EGO.<br />
<br />
To manage different systems and applications that share resources on a large scale grid, EGO introduces the concept of resource consumers. Any resource consumer can request a “mini cluster” of resources to run something for scalability and reliability. For example, there is an EGO management service consumer to run any EGO system services registered in EGO, such as Platform Management Console for web GUI, EGO Web Service Gateway, and so on. If such an EGO service fails, EGO will restart it either on the same host or another available host.<br />
<br />
To run Platform SOAM (Service Oriented Application Middleware, aka Symphony DE) and its applications, Platform SOAM Session Director, which is registered as an EGO service, has a consumer to run SOAM Session Managers for SOAM applications. If a Session Manager fails, the Session Director will restart it either on the same host or another available host. Finally, each Session Manager has an application consumer to run SOA workloads on compute nodes. If an application service fails, its Session Manager will restart it either on the same host or another available host.<br />
<br />
Each of these consumers forms a dynamic virtual “mini cluster” that dynamically requests and returns physical resources from EGO, based on resource allocation plans, priorities, runtime workload demands, and reliability requirements. Each “mini cluster” can keep running as long as one or more of its nodes are running. When a node fails in a “mini cluster”, its manager running in another “mini cluster” will ask EGO for a new node if necessary, to maintain the same level of scalability and reliability.<br />
<br />
Like Platform SOAM, any other systems or applications can integrate with Platform EGO for both scalability and reliability.<br />
<br />
In the next post, I will talk more about grid reliability, such as high availability, persistent and non-persistent redundancy, and shared storage.</blockquote>

 ]]></content:encoded>
			<dc:creator>zane_hu</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/zane_hu/grid-reliability-scalability-91/</guid>
		</item>
		<item>
			<title>Cloud Computing: Opportunities for HPC to go mainstream?</title>
			<link>http://www.hpccommunity.org/blogs/khalid/cloud-computing-opportunities-hpc-go-mainstream-90/</link>
			<pubDate>Mon, 22 Sep 2008 15:53:56 GMT</pubDate>
			<description>To say there is a lot of hype around Clouds would be an understatement. With all the buzz in the market and the visibility that big players like...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">To say there is a lot of hype around Clouds would be an understatement. With all the buzz in the market and the visibility that big players like Amazon, Google, Microsoft, IBM, VMware, etc. have given to cloud computing, it seems hard to ignore. There is enough debate in the blogosphere on the various definitions of Cloud computing and whether it's real or just marketing hype. Remember “Utility Computing” or “On-Demand Computing” from a few years back? Without getting bogged down in rigorous definitions (you can just Google those), let's look at what this general trend could mean for HPC environments. <br />
<br />
Because Amazon EC2 is seen as a poster child for Cloud Computing, I want to look at some of its characteristics and see how they might impact how HPC is delivered:<br />
<br />
<b>Infrastructure As A Service:</b> Amazon uses an underlying virtualization platform based on Xen, together with their S3 storage system, to allow customers to create servers on demand. Rather than having to rely on a corporate IT department to procure, install and wire servers into the data center, we have the notion of self-service, where the IT middleman is taken out of the loop. Users can go to a portal and make a request for N servers with specific hardware or software characteristics and have them provisioned automatically in a matter of minutes. When no longer needed, the underlying resources are put back into the cloud to service the next customer. This notion of disposable computing is one of the reasons the cloud model can support innovation. By making the acquisition and retirement of resources a lightweight operation oriented at end users, the cloud model keeps access to computing infrastructure from being a barrier to research. This is quite different from traditional environments, where there is a long procurement process and a need to justify the expenses, including space, cooling, power and management costs, that go into setting up a compute cluster.<br />
<br />
<b>Pay-As-You-Go:</b> The notion of paying for the resources you use is common to the utility model in many industries, such as water and electricity. The model can be applied to computing infrastructure in order to lower the capital expenditures required to set up and operate a cluster of machines. For this to work, the cloud system must track the granular usage of resources so that accurate bills can be produced. Rates that vary with demand are another aspect to consider in a market-based pricing model.<br />
<br />
Most HPC environments have the ability to track resource usage for accounting and billing. However, this is often tied to a specific type of workload (batch jobs), and the notion of chargeback is not always very formal. If the cloud model is adopted, resource usage is metered so that individual users or groups get a bill on their credit cards at the end of the month, just like their cell phone plans. Although this is probably a bit far-fetched for most enterprise HPC environments, the notion of paying for what you actually use gives an economic incentive to avoid wasting cycles and to reduce costs. <br />
<br />
<b>Scalability:</b> Cloud providers such as Amazon are designed to scale across multiple data centers and geographies. Scaling is also one of the characteristics of HPC environments, but typically these are confined to single large data centers. Only a few sites implement multi-geography grids, and even those are restricted to batch workloads. The challenge of scaling storage across geographies, as the Amazon S3 storage system does, is something HPC users have probably encountered. S3 provides a different abstraction than traditional cluster filesystems, which may help it to scale across geographies.<br />
<br />
<br />
While the Cloud trend is interesting, it is not clear how much of an impact it will have on HPC. It's not likely that the highly optimized clusters with specialized hardware interconnects and tuned software configurations will all migrate to external clouds. However, the notion of internal or private clouds which reside within the firewall is much more feasible. The issues of data access and security are much easier to address within the perimeter of an enterprise. External clouds may still be useful as a temporary overflow pool for peak usage for certain applications. <br />
<br />
In my opinion, HPC administrators already running large clusters or grids for engineering and scientific applications have an opportunity to become a more significant aspect of the corporate IT landscape by expanding to become internal cloud providers. HPC administrators already have expertise in scaling, resource sharing and usage-based accounting that targets internal enterprise users. A lot of the underlying infrastructure can potentially be re-used, along with technologies that help to more readily fit into the broader enterprise IT landscape. This convergence of HPC and enterprise IT infrastructures could be one of the possibilities that results from Cloud Computing models.<br />
<br />
Well that’s enough fluffy thoughts on Cloud Computing for now :). In my next blog I hope to drill into a few of the technology areas that would be required to build an internal cloud infrastructure.</blockquote>

 ]]></content:encoded>
			<dc:creator>Khalid</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/khalid/cloud-computing-opportunities-hpc-go-mainstream-90/</guid>
		</item>
		<item>
			<title>Exploring HPC Programming: Multi-threading Pt. 2</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-multi-threading-pt-2-89/</link>
			<pubDate>Tue, 02 Sep 2008 12:18:30 GMT</pubDate>
			<description>In the previous article about multi-threading, I mentioned that there are 2 options for breaking up the work that needs to be done. Option 1 was to...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">In the previous article about multi-threading, I mentioned that there are 2 options for breaking up the work that needs to be done. Option 1 was to have each thread work on a subset of the portfolio, but do all the experiments. Option 2 was to have each thread work on a subset of the experiments, and to work on all of the portfolio.<br />
<br />
Today, we'll look at option 2, and I'll try to explain what was needed in the program to accomplish this task. Option 1 and option 2 programs will look similar on the surface, but will differ in the details. We still need to think about how our data is arranged and how we can access it without incurring a performance penalty from restricting access to the data. Remember again, it's all about the data. (I can't stress that enough).<br />
<br />
So let's begin:<br />
<br />
The &quot;main&quot; function stays the same. The &quot;create_portfolio_threads&quot; function is changed to accomplish the goal. In option 1, I created arrays for each of the threads, because each thread operated on all the experiments which are housed in the arrays. In option 2, each thread will operate on a subset of the experiments, and a subset of the array, so I only need 1 set of arrays again. I added an item &quot;thexperiments&quot; to the thread structure to allow the threads to calculate the area of the array to work on.<br />
<br />
The next set of changes were made to the &quot;portfolio&quot; function, in which the area of the array to work on is based on the number of experiments, and the number of threads that will be used. Also I needed to change the average calculations to be based on the area of the array that was used, instead of the number of experiments.<br />
<br />
The last change was to the &quot;blackscholes&quot; function, to which I now pass the start and number of the indexes of the array to be worked on, instead of the number of experiments.<br />
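<br />
The slicing can be pictured with a minimal Python sketch (the attached program is C; the names slice_for_thread and slice_average are hypothetical and not taken from it). Each thread derives its start index and count from the number of experiments and the number of threads, and averages over its own count rather than over all experiments. The sketch assumes the experiment count divides evenly by the thread count, as 1,000,000 / 4 does.<br />
<pre>
```python
def slice_for_thread(thread_id, n_threads, n_experiments):
    """Return (start, count) of the experiment slice owned by thread_id.

    Assumes n_experiments is an exact multiple of n_threads.
    """
    count = n_experiments // n_threads
    return thread_id * count, count


def slice_average(values, start, count):
    """Average only the slice this thread worked on, dividing by count."""
    return sum(values[start:start + count]) / count


if __name__ == "__main__":
    n_threads, n_experiments = 4, 1_000_000
    for tid in range(n_threads):
        start, count = slice_for_thread(tid, n_threads, n_experiments)
        # 1-based numbering, as in the program output shown in this post.
        print(f"Thread {tid}, Running Experiments {start + 1} to {start + count}")
```
</pre>
Dividing by the slice count instead of the total number of experiments is what keeps each thread's average meaningful on its own.<br />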
<br />
That's it, not too hard, was it?<br />
<br />
Here is the code:<br />
<br />
<a href="http://www.hpccommunity.org/attachment.php?attachmentid=189&amp;d=1220358150" >MultiThread2.zip</a><br />
<br />
Let's run it and see how it does:<br />
<br />
[leo@compute70 MultiThread2]$ time ./mthread2<br />
=== Option Portfolio Calculations (Threading over number of Experiments Test) ==========<br />
Portfolio size             : 1024<br />
Experiments run per item   : 1000000<br />
Number of threads started  : 4<br />
Thread 0, Running Experiments 1 to 250000 for 1024 options.<br />
Thread 1, Running Experiments 250001 to 500000 for 1024 options.<br />
Thread 2, Running Experiments 500001 to 750000 for 1024 options.<br />
Thread 3, Running Experiments 750001 to 1000000 for 1024 options.<br />
Thread 0, Average Call Price         : 36.561064<br />
Thread 0, Average Put  Price         : 1.669508<br />
Thread 3, Average Call Price         : 36.829648<br />
Thread 3, Average Put  Price         : 2.142390<br />
Thread 1, Average Call Price         : 36.854430<br />
Thread 1, Average Put  Price         : 2.156009<br />
Thread 2, Average Call Price         : 36.809930<br />
Thread 2, Average Put  Price         : 2.150635<br />
<br />
real    4m50.480s<br />
user    19m3.825s<br />
sys     0m0.077s<br />
[leo@compute70 MultiThread2]$ <br />
<br />
<br />
There you go, 4 minutes and 50 seconds. This is about the same length of time that option 1 took to execute. So what does that tell us?<br />
<br />
It means that this sample (and only this sample) has a fixed number of calculations to do: 1,000,000 * 1024. However I slice up the work, each thread gets the same share, so, all things being equal, slicing the calculations 4 ways takes the same amount of time regardless of how the slicing is done. I can slice it up by option 1 as 1,000,000 * 256 * 4, or by option 2 as 250,000 * 1024 * 4, but the end result is the same.<br />
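<br />
The arithmetic can be checked in a few lines, and the real and user times from the run above give a rough measure of how well the 4 cores were used. This is only a sketch; the helper name to_seconds is hypothetical.<br />
<pre>
```python
def to_seconds(minutes, seconds):
    """Convert the minutes and seconds of a time(1) value into seconds."""
    return minutes * 60 + seconds


total = 1_000_000 * 1024          # experiments * portfolio items
option1 = 1_000_000 * 256 * 4     # each thread: all experiments, 256 items
option2 = 250_000 * 1024 * 4      # each thread: 250,000 experiments, all items

real = to_seconds(4, 50.480)      # elapsed time of the option 2 run
user = to_seconds(19, 3.825)      # CPU time summed over all threads
cores_busy = user / real          # average number of cores kept busy

if __name__ == "__main__":
    print(total == option1 == option2)
    print(f"average cores busy: {cores_busy:.2f}")
```
</pre>
The ratio of user time to elapsed time comes out just under 4, so the four threads kept the four cores busy for almost the whole run.<br />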
<br />
Each option may have an advantage to a particular program, depending on the program you want to make multi-threaded, and how the data is arranged, so it's nice to have a couple of options, knowing the end result is the same, from a performance standpoint.<br />
<br />
So, can we make this program any faster? Well, maybe, if we can run more threads, say 8, if we had hardware that big. Or, we could try to make the calculations run faster, if we had faster hardware, or maybe alternative hardware.<br />
<br />
I'll explore these additional methods in future articles. <br />
<br />
Multi-threading has been around for a long time but is still not that well understood, I think, based on programs I've seen. Next I'll look at a newer piece of technology that is supposed to make multi-threading easier, although I don't find multi-threading that hard to begin with. OpenMP is a technology that allows you to thread tasks within your program, to make use of multi-threading. I'll take a look at the implications of using OpenMP in this program, and see how it performs.<br />
<br />
See you next time.<br />
<br />
Leo Stutzmann</blockquote>

 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-multi-threading-pt-2-89/</guid>
		</item>
		<item>
			<title>Exploring HPC Programming: Multi-threading</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-multi-threading-88/</link>
			<pubDate>Mon, 25 Aug 2008 16:38:47 GMT</pubDate>
			<description>One of the questions I hear a lot is: Is multi-threaded programming difficult? 
 
The answer to that question is: It depends. Not really an answer,...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">One of the questions I hear a lot is: Is multi-threaded programming difficult?<br />
<br />
The answer to that question is: It depends. Not really an answer, but it does depend on a number of factors. The multi-threaded API is fairly simple, so coding a multi-threaded application is easy. What’s difficult depends on your application, what it does, and how the data is used and structured. If your application uses a lot of shared data, which you need to control access to with semaphores and mutexes, then it becomes more difficult and prone to error: threads waiting to access data slow down the process or, worse, become deadlocked if you have multiple shared data items and are not careful. These are just details, however, and multi-threaded programming is really easy; you just have to think about it before you write your code.<br />
<br />
In a previous post: “Where to start”, there is a sample program, so it’s best to start with that, and think about how we would go about breaking this down into threads of execution. Ah ha, you say, the program has 2 big loops that are ideal candidates. One loop is used to calculate 1 million experiments, and the second loop is to do this 1024 times to simulate a portfolio of calculations.<br />
<br />
Before we begin, let’s make the assumption, that because I’m going to run the program on a Quad Core machine, that the optimal number of threads will be 4 (who would have guessed). We’ll create the program to be somewhat flexible in how many threads it creates, but the default will be 4.<br />
<br />
If you look at the sample, there appear to be two different strategies that one could take to process the calculations in a threaded manner. Option 1: Have each thread process a subset of the portfolio. When using 4 threads, this would have each thread process 256 items in the portfolio. Option 2: Have each thread process a subset of the experiments. When using 4 threads, this would have each thread process 250,000 of the calculations. Since doing both would make this exercise too long, I’ll choose Option 1. I think investigating Option 2 in a future post is worthwhile, and will provide a nice comparison of the changes needed in the program for each scenario.<br />
<br />
Now that I have a strategy of how I want to parallelize this program, I can start thinking of the data, and how it’s going to be manipulated. I will try and keep the functions as similar as possible to the original sample, and add helper functions to smooth the transition to multiple threads. For Option 1: I’m going to try and keep the inner function “blackscholes” the same, and add the supporting code around it.<br />
<br />
The ideal way to have multiple threads running in a high performance program, is to have each thread using its own data, without any overlap or contention of data. This is an important strategy. It will make the program a little more complicated in the setup of data, but will maximize the processing, because the threads will not be waiting for shared data items that can only be accessed one at a time.<br />
<br />
The threading API is basic, and doesn’t take a list of parameters. It does take a pointer, so if I need to pass a thread all the items for it to do its calculations, the easiest way is to create a structure of things, and pass the thread the pointer to the structure. You will see this structure definition at the beginning of the program. In the original sample, I used arrays to store the calculations for the experiments, and this works out quite nicely for multi-threaded programs, although now that I have 4 sets of calculations running, I will need more arrays.<br />
<br />
The first change to the program is in the “main” function, which I recoded a little for clarity of parameters, and all it does is call a new helper function I added to create the threads (create_portfolio_thread). This function does all the setup and tear down of the threads that are going to be doing the calculations. Now, instead of creating one set of arrays to hold the calculations, each thread needs its own set of arrays, so that each thread is independent of the other threads and shares no data. Instead of the 5 arrays of numbers, I now have to create (5 * number of threads) arrays. This is done with arrays of pointers, each sized by the number of threads: the per-thread arrays are allocated, and their addresses are saved in the arrays of pointers. Quite simple. All these pointers are saved in the thread structure for each thread, and the “pthread_create” API is used to create the execution thread and pass it the structure of data that the thread will work on. Then we wait for each thread to complete, and finish.<br />
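<br />
The overall shape of that setup can be shown as a hedged Python sketch rather than the attached C code. pthread_create accepts a single pointer argument, so everything a thread needs travels in one structure; here a dataclass stands in for the C struct and threading.Thread for pthread_create. ThreadWork, worker and run_threads are hypothetical names, and the per-item work is a placeholder.<br />
<pre>
```python
import threading
from dataclasses import dataclass, field


@dataclass
class ThreadWork:
    """Everything one thread needs; stands in for the C thread structure."""
    thread_id: int
    first_item: int
    n_items: int
    results: list = field(default_factory=list)  # private to this thread


def worker(work):
    # Placeholder for the real per-item calculation: each thread writes
    # only into its own results list, so no locking is needed.
    for item in range(work.first_item, work.first_item + work.n_items):
        work.results.append(item * 2)


def run_threads(n_threads, n_items):
    per_thread = n_items // n_threads  # assumes an even split
    works = [ThreadWork(t, t * per_thread, per_thread) for t in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(w,)) for w in works]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread, as pthread_join would
    return works


if __name__ == "__main__":
    for w in run_threads(4, 1024):
        print(w.thread_id, w.first_item, len(w.results))
```
</pre>
In CPython this illustrates the structure only; the speedup discussed in this post comes from native threads in the C program.<br />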
<br />
The “portfolio_thread” function is a helper that casts the passed data, and calls the “portfolio” function.<br />
<br />
The “portfolio” function looks at the passed data and, based on the number of threads that will be running, calculates which range of the 1024 items it will perform all the experiments on. Because the range is derived from the number of threads, you don’t have to change anything if you change the number of threads to create. It then does the same thing the portfolio function did in the previous sample, but instead of 1024 items, it only does the items for this thread.<br />
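<br />
One way to derive a thread's range purely from its index and the thread count is sketched below in Python. The name item_range is hypothetical, and the handling of a portfolio size that does not divide evenly (the lowest-numbered threads take one extra item each) is an assumption, since the post only covers the even 1024 / 4 case.<br />
<pre>
```python
def item_range(thread_id, n_threads, n_items):
    """Return the 1-based inclusive (first, last) items for thread_id.

    Any remainder is spread over the lowest-numbered threads, one item each.
    """
    base, extra = divmod(n_items, n_threads)
    first = thread_id * base + min(thread_id, extra)
    count = base + (1 if thread_id in range(extra) else 0)
    return first + 1, first + count


if __name__ == "__main__":
    for tid in range(4):
        first, last = item_range(tid, 4, 1024)
        print(f"Thread {tid}, Running Options {first} to {last}")
```
</pre>
Changing the thread count only changes the arguments; the range calculation itself stays untouched.<br />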
<br />
That’s it. Now that wasn’t so hard, was it? Here is the program:<br />
<br />
<a href="http://www.hpccommunity.org/attachment.php?attachmentid=9" >Attachment 9</a><br />
<br />
<br />
Let’s run this program and see how well we did. The original program took almost 19 minutes. 4 threads should take under 5 minutes, elapsed time. OK, here goes:<br />
<br />
__________________________________________________  ___________________________________<br />
<br />
[leo@compute70 MultiThread1]$ time ./mthread1<br />
=== Option Portfolio Calculations (Threading over number of options Test) ==========<br />
Portfolio size             : 1024<br />
Experiments run per item   : 1000000<br />
Number of threads started  : 4<br />
Thread 0, Running Options 1 to 256 doing 256 iterations.<br />
Thread 3, Running Options 769 to 1024 doing 256 iterations.<br />
Thread 1, Running Options 257 to 512 doing 256 iterations.<br />
Thread 2, Running Options 513 to 768 doing 256 iterations.<br />
Thread 1, Average Call Price         : 36.836399<br />
Thread 1, Average Put  Price         : 2.151642<br />
Thread 2, Average Call Price         : 36.843320<br />
Thread 2, Average Put  Price         : 2.146836<br />
Thread 3, Average Call Price         : 36.849425<br />
Thread 3, Average Put  Price         : 2.141625<br />
Thread 0, Average Call Price         : 36.815204<br />
Thread 0, Average Put  Price         : 2.144223<br />
<br />
real    13m7.594s<br />
user    28m24.214s<br />
sys     22m30.525s<br />
__________________________________________________  ___________________________________<br />
<br />
Oh oh, Houston, we have a problem. While the program was faster at 13 minutes 7 seconds, that’s not even close to what I was expecting. My coding kung-fu is failing me. The run above shows that 28 minutes were spent in user time, which is OK, because I was using 4 cores, but 22 and a half minutes were spent in system time. This is not good. Why would the program spend so much time in system? You could get out the profiler and it will show you exactly where it is spending the time, but I have my own suspicions. In Linux the random functions depend on a global variable, so they are not safe to call concurrently as-is, and when the program is compiled for threading, they serialize their execution.<br />
<br />
I need to get rid of the random system functions. In the next sample, I just changed the “RandFloat” function to provide an even distribution between low and high values, based on the thread that’s calling it, so each thread gets a slightly different even distribution from the others. Again, this is fine for our tests, but does not simulate a real options pricing program.<br />
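<br />
A common alternative to removing the random calls altogether is to give each thread its own generator state, so that nothing global is shared. The Python sketch below shows that swapped-in technique, not the RandFloat change made in the attached program; make_rand_float and the seeding scheme are hypothetical.<br />
<pre>
```python
import random


def make_rand_float(thread_id, base_seed=12345):
    """Return a rand_float(low, high) bound to a generator private to one thread.

    Each thread gets its own random.Random instance, seeded differently,
    so no generator state is shared between threads.
    """
    rng = random.Random(base_seed + thread_id)

    def rand_float(low, high):
        return rng.uniform(low, high)

    return rand_float


if __name__ == "__main__":
    for tid in range(4):
        rand_float = make_rand_float(tid)
        print(tid, [round(rand_float(10.0, 50.0), 3) for _ in range(3)])
```
</pre>
In C, the analogous approach would be a generator that takes its state as an argument, such as rand_r with a per-thread seed.<br />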
<br />
Here is the changed program:<br />
<br />
<a href="http://www.hpccommunity.org/attachment.php?attachmentid=188&amp;d=1219682441" >MultiThread1x.zip</a><br />
<br />
<br />
Let’s run it again with the random system function eliminated:<br />
<br />
__________________________________________________  ___________________________________<br />
<br />
[leo@compute70 MultiThread1x]$ time ./mthread1<br />
=== Option Portfolio Calculations (Threading over number of options Test) ==========<br />
Portfolio size             : 1024<br />
Experiments run per item   : 1000000<br />
Number of threads started  : 4<br />
Thread 0, Running Options 1 to 256 doing 256 iterations.<br />
Thread 1, Running Options 257 to 512 doing 256 iterations.<br />
Thread 3, Running Options 769 to 1024 doing 256 iterations.<br />
Thread 2, Running Options 513 to 768 doing 256 iterations.<br />
Thread 1, Average Call Price         : 36.851565<br />
Thread 1, Average Put  Price         : 2.156646<br />
Thread 0, Average Call Price         : 36.558490<br />
Thread 0, Average Put  Price         : 1.669706<br />
Thread 2, Average Call Price         : 36.807067<br />
Thread 2, Average Put  Price         : 2.153668<br />
Thread 3, Average Call Price         : 36.826722<br />
Thread 3, Average Put  Price         : 2.140491<br />
<br />
real    4m41.657s<br />
user    18m35.799s<br />
sys     0m0.126s<br />
__________________________________________________  ___________________________________<br />
<br />
<br />
Now that’s better. 4 minutes and 41 seconds, using 18 and a half minutes of user time on 4 cores.<br />
<br />
With these multi-threading techniques, the time has come down to about 5 minutes, from 19 minutes in the single process example. Hope I’ve given you some ideas on how to incorporate multi-threading in your program. Got to watch out for those system functions though, or anything that will serialize your program execution.<br />
<br />
Cheers for now, and happy coding.<br />
<br />
Leo Stutzmann</blockquote>


 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-multi-threading-88/</guid>
		</item>
		<item>
			<title>PVFS version 2, first try, part 2</title>
			<link>http://www.hpccommunity.org/blogs/mehdi/pvfs-version-2-first-try-part-2-87/</link>
			<pubDate>Fri, 22 Aug 2008 04:49:24 GMT</pubDate>
			<description>The next step is to  propagate the config on the nodes (compute-00-00, compute-00-01, compute-00-02). The installer is not part of the server pool...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">The next step is to propagate the config to the nodes (compute-00-00, compute-00-01, compute-00-02). The installer is not part of the server pool (the fs-conf file will be copied to the clients only later).<br />
<br />
We want an rc script to start the server, so just copy the one provided in the examples directory and edit the copy:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
<b>cp examples/pvfs2-server.rc examples/pvfs2-server</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
and the diff is:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[mbozzore@stakhanov pvfs-2.6.3]$ <b>diff -U4 examples/pvfs2-server.rc examples/pvfs2-server</b><br />
--- examples/pvfs2-server.rc    2008-08-19 10:08:26.000000000 -0400<br />
+++ examples/pvfs2-server       2008-08-19 10:11:17.000000000 -0400<br />
@@ -13,9 +13,9 @@<br />
 # override this if your server binary resides elsewhere<br />
 PVFS2SERVER=/home/mbozzore/pvfs-2.6.3/sbin/pvfs2-server<br />
 # override this if you want servers to automatically pick a conf file,<br />
 #   but you just need to specify what directory they are in<br />
-PVFS2_CONF_PATH=/etc<br />
+PVFS2_CONF_PATH=/etc/pvfs2<br />
<br />
 # the server will record its PID in this file<br />
 PVFS2_PIDFILE=/var/run/pvfs2.pid<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Propagating the rc script is quite easy (don't forget to change the permissions):<br />
<br />
<font color="Blue">--------------------------------------------------------</font><ul><li><b>chown root.root /VM/mbozzore/PVFS/pvfs-2.6.3/examples/pvfs2-server ; chmod u+x /VM/mbozzore/PVFS/pvfs-2.6.3/examples/pvfs2-server</b></li>
<li><b>for i in $liste; do scp /VM/mbozzore/PVFS/pvfs-2.6.3/examples/pvfs2-server $i:/etc/rc.d/init.d/;done</b></li>
</ul><font color="Blue">--------------------------------------------------------</font><br />
<br />
Of course, the IO and metadata servers are in $liste.<br />
<br />
Ouch, I forgot something: pvfs2-server is dynamically linked with libdb-4.6 so we need to create a new entry under /etc/ld.so.conf.d for BerkeleyDB (I mean ... on the nodes): <br />
<br />
<ol class="decimal"><li>create the file in a shared location (under a user account)<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>cat /home/mbozzore/tmp/BerkeleyDB.conf</b><br />
<br />
/home/mbozzore/db-4.6.21/lib/<br />
<font color="Blue">--------------------------------------------------------</font></li>
<li>propagate the file (using pdsh this time)<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
<b>pdsh -a &quot;cp /home/mbozzore/tmp/BerkeleyDB.conf /etc/ld.so.conf.d/BerkeleyDB.conf&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font></li>
<li>run ldconfig on the nodes</li>
</ol><br />
<font color="Blue">--------------------------------------------------------</font><br />
<b>pdsh -a &quot;ldconfig&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Done !! <br />
<br />
It is now time for initialization:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>for i in $liste; do ssh -x $i &quot;/home/mbozzore/pvfs-2.6.3/sbin/pvfs2-server /etc/pvfs2/pvfs2-fs.conf /etc/pvfs2/pvfs2-server.conf-$i -f&quot;;done</b><br />
[D 07:02:04.125320] PVFS2 Server version 2.6.3 starting.<br />
[D 08/25 07:02] PVFS2 Server: storage space created. Exiting.<br />
[D 20:27:29.628509] PVFS2 Server version 2.6.3 starting.<br />
[D 08/25 20:27] PVFS2 Server: storage space created. Exiting.<br />
[D 22:58:03.677750] PVFS2 Server version 2.6.3 starting.<br />
[D 08/19 22:58] PVFS2 Server: storage space created. Exiting.<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
<br />
Then, we can start the server:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>for i in $liste; do ssh -x $i &quot;service pvfs2-server start&quot; ; done</b><br />
Starting PVFS2 server: [D 07:07:50.038126] PVFS2 Server version 2.6.3 starting.<br />
[  OK  ]<br />
Starting PVFS2 server: [D 20:36:08.930395] PVFS2 Server version 2.6.3 starting.<br />
[  OK  ]<br />
Starting PVFS2 server: [D 23:06:40.730802] PVFS2 Server version 2.6.3 starting.<br />
[  OK  ]<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Looks fine (after running ps on the nodes and checking the logs).<br />
<br />
<br />
Let's take care of the clients now. <br />
<br />
Hmmm ... pvfs2-server is not dynamically linked with the pvfs2 library, but pvfs2-client is. We need to create a new entry under /etc/ld.so.conf.d/ on the nodes. Same story again:<br />
<ol class="decimal"><li>create the file<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>cat /home/mbozzore/tmp/pvfs-2.6.3.conf</b><br />
/home/mbozzore/pvfs-2.6.3/lib<br />
<font color="Blue">--------------------------------------------------------</font></li>
<li>propagate the file<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;cp /home/mbozzore/tmp/pvfs-2.6.3.conf /etc/ld.so.conf.d/pvfs-2.6.3.conf&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font><ul><li>check the cache on one node before updating it</li>
</ul><blockquote><font color="Blue">--------------------------------------------------------</font><br />
    [root@compute-00-01 ~]# <b>ldconfig -p | grep pvfs</b>    <br />
[root@compute-00-01 ~]#<br />
<font color="Blue">--------------------------------------------------------</font></blockquote></li>
<li>update the cache:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;ldconfig&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font><ul><li>check the cache again</li>
</ul><blockquote><font color="Blue">--------------------------------------------------------</font><br />
    [root@compute-00-01 ~]# <b>ldconfig -p | grep pvfs</b><br />
        libpvfs2.so (libc6,x86-64) =&gt; /home/mbozzore/pvfs-2.6.3/lib/libpvfs2.so<br />
    [root@compute-00-01 ~]#<br />
<font color="Blue">--------------------------------------------------------</font></blockquote></li>
</ol><br />
Good, we can load the kernel module and start the client:<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;insmod /VM/mbozzore/PVFS/pvfs-2.6.3/src/kernel/linux-2.6/pvfs2.ko&quot;</b><br />
<br />
[root@stakhanov ~]# <b>pdsh -a &quot;/VM/mbozzore/PVFS/pvfs-2.6.3/src/apps/kernel/linux/pvfs2-client -p /VM/mbozzore/PVFS/pvfs-2.6.3/src/apps/kernel/linux/pvfs2-client-core -L /var/log/pvfs2-client.log&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Note: if you forget to insert the pvfs2 kernel module before starting pvfs2-client, you will get something like this in the client logs:<br />
<br />
<font color="Blue">###############################################</font><br />
[root@compute-00-03 ~]# cat /var/log/pvfs2-client.log<br />
[E 14:56:33.296570] Error: could not setup device /dev/pvfs2-req.<br />
[E 14:56:33.296740] <b>Error: did you remember to load the kernel module?</b><br />
[E 14:56:33.300203] pvfs2-client-core with pid 14238 exited with value 254<br />
[E 14:56:34.331904] Error: could not setup device /dev/pvfs2-req.<br />
[E 14:56:34.331972] <b>Error: did you remember to load the kernel module?</b><br />
<font color="Blue">###############################################</font><br />
<br />
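A small guard can avoid that mistake. This is only a sketch: module_loaded is a made-up helper (not part of PVFS) that looks for the module name in /proc/modules, and its optional second argument exists only so the check can be tried against a sample file:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">
```shell
# Return success if the named module is listed in a /proc/modules-style file.
# $1 = module name, $2 = file to read (hypothetical knob, defaults to /proc/modules).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

# Guard one could run on a node before starting pvfs2-client.
if module_loaded pvfs2; then
    echo "pvfs2 module loaded, ok to start pvfs2-client"
else
    echo "pvfs2 module missing, run insmod first"
fi
```
</pre>
</div>
<br />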
Ok, ready to mount the filesystem then:<br />
<ul><li>create an entry under /mnt<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;mkdir /mnt/pvfs2&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font></li>
<li>append what is needed to /etc/fstab<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;echo 'tcp://compute-00-01:3334/pvfs2-fs /mnt/pvfs2 pvfs2 defaults,noauto 0 0' &gt;&gt; /etc/fstab&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font></li>
<li>mount<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>pdsh -a &quot;mount /mnt/pvfs2&quot;</b><br />
<font color="Blue">--------------------------------------------------------</font></li>
</ul><br />
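One thing to watch with the fstab step: running it twice appends the same line twice. A possible way around that, as a sketch only (pvfs2_fstab_line and append_once are made-up helper names, and the field values are the ones used above):<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">
```shell
# Build the fstab line for a PVFS2 mount.
# $1 = server, $2 = port, $3 = file system name, $4 = mount point
pvfs2_fstab_line() {
    echo "tcp://$1:$2/$3 $4 pvfs2 defaults,noauto 0 0"
}

# Append a line to a file only if that exact line is not already there.
append_once() {
    grep -qxF "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}

line=$(pvfs2_fstab_line compute-00-01 3334 pvfs2-fs /mnt/pvfs2)
echo "$line"
# On a node one would then run: append_once "$line" /etc/fstab
```
</pre>
</div>
<br />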
Looks good, we are now able to run basic tests.<br />
On any compute node:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@compute-00-02 ~]# <b>/home/mbozzore/pvfs-2.6.3/bin/pvfs2-ping -m /mnt/pvfs2</b><br />
<br />
(1) Parsing tab file...<br />
<br />
(2) Initializing system interface...<br />
<br />
(3) Initializing each file system found in tab file: /etc/fstab...<br />
<br />
   PVFS2 servers: tcp://compute-00-01:3334<br />
   Storage name: pvfs2-fs<br />
   Local mount point: /mnt/pvfs2<br />
   /mnt/pvfs2: Ok<br />
<br />
(4) Searching for /mnt/pvfs2 in pvfstab...<br />
<br />
   PVFS2 servers: tcp://compute-00-01:3334<br />
   Storage name: pvfs2-fs<br />
   Local mount point: /mnt/pvfs2<br />
<br />
   meta servers:<br />
   tcp://compute-00-02:3334<br />
<br />
   data servers:<br />
   tcp://compute-00-00:3334<br />
   tcp://compute-00-01:3334<br />
<br />
(5) Verifying that all servers are responding...<br />
<br />
   meta servers:<br />
   tcp://compute-00-02:3334 Ok<br />
<br />
   data servers:<br />
   tcp://compute-00-00:3334 Ok<br />
   tcp://compute-00-01:3334 Ok<br />
<br />
(6) Verifying that fsid 398972398 is acceptable to all servers...<br />
<br />
   Ok; all servers understand fs_id 398972398<br />
<br />
(7) Verifying that root handle is owned by one server...<br />
<br />
   Root handle: 1048576<br />
     Ok; root handle is owned by exactly one server.<br />
<br />
=============================================================<br />
<br />
<b>The PVFS2 filesystem at /mnt/pvfs2 appears to be correctly configured.</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Wonderful, it is now time to play with the test suite and MPI-IO.<br />
<br />
Let's check what is available in the test directory. There is a configure script available and the (partial) output of ./configure --help is:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
Optional Packages:<br />
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]<br />
  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)<br />
  --with-efence=&lt;path&gt;    Use electric fence for malloc debugging.<br />
  --with-mpi=&lt;dir&gt;        Location of the MPI installation<br />
  --with-pvfs2-src=&lt;dir&gt;  Location of the PVFS2 src directory<br />
  --with-pvfs2-build=&lt;dir&gt; Location of the PVFS2 build dir (if different from src dir)<br />
  --with-db=&lt;dir&gt;         Location of installed DB package (default=/usr)<br />
  --with-openssl=&lt;dir&gt;  Location of installed openssl package (default=/usr)<br />
                         --without-openssl     Don't build with openssl.<br />
<br />
  --with-libaio=&lt;dir&gt;  Location of installed libaio package (default=/usr)<br />
                         --without-libaio     Don't build with libaio.<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
We will see in part 3 how painful QA can be :-)<br />
<br />
Mehdi Bozzo-Rey</blockquote>

 ]]></content:encoded>
			<dc:creator>mehdi</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/mehdi/pvfs-version-2-first-try-part-2-87/</guid>
		</item>
		<item>
			<title>PVFS version 2, first try</title>
			<link>http://www.hpccommunity.org/blogs/mehdi/pvfs-version-2-first-try-85/</link>
			<pubDate>Fri, 22 Aug 2008 04:01:58 GMT</pubDate>
			<description><![CDATA[Let's play with PVFS (Parallel Virtual File System), and perform a manual install on an OCS 5 cluster. If we want to build a kit, then we should be...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Let's play with PVFS (Parallel Virtual File System), and perform a manual install on an OCS 5 cluster. If we want to build a kit, then we should be able to recompile the code (+dependencies), understand the configuration, and of course how to test it, right ?<br />
<br />
Two versions of PVFS are available; I will use version 2. The homepage is here: <br />
<br />
<a href="http://www.pvfs.org/" target="_self">Parallel Virtual File System, Version 2</a><br />
<br />
<br />
So, let's start with the latest and greatest version: PVFS-2.7.1<br />
<br />
Okay, it needs BerkeleyDB ... well just get the latest and greatest version (4.7).<br />
<br />
<br />
PVFS will compile just fine but surprise: I got the following during the initialization (in my case it was on the metadata server):<br />
<br />
<font color="Blue">###############################################</font><br />
[root@compute-00-02 ~]# <b>/opt/pvfs/sbin/pvfs2-server /etc/pvfs-fs.conf  -f</b><br />
[S 08/18 14:57] PVFS2 Server on node compute-00-02 version 2.7.1 starting...<br />
<b>[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT</b><br />
[E 08/18 14:57] error in dspace create (db_p-&gt;get failed).<br />
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT<br />
[E 08/18 14:57] error in dspace create (db_p-&gt;get failed).<br />
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT<br />
[E 08/18 14:57] error in dspace create (db_p-&gt;get failed).<br />
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT<br />
[E 08/18 14:57] error in dspace create (db_p-&gt;get failed).<br />
[D 08/18 14:57] PVFS2 Server: storage space created. Exiting.<br />
<font color="Blue">###############################################</font><br />
<br />
Not really good ... well ... we can fall back to a previous version, like 4.6.21, right ?<br />
<br />
Anyway, it is really easy to recompile, just go in the source dir and:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
<b>cd build_unix ; ../dist/configure --prefix=/home/mbozzore/db-4.6.21 ; make ; make install</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Okay, ready to recompile (again) PVFS:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[mbozzore@stakhanov pvfs-2.7.1]$ <b>./configure --prefix=/home/mbozzore/pvfs2 --with-db=/home/mbozzore/db-4.6.21/ --with-kernel=/usr/src/kernels/2.6.18-53.el5-x86_64/</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
And at the end of the process:<br />
<br />
<font color="Blue">###############################################</font><br />
***** Displaying PVFS2 Configuration Information *****<br />
------------------------------------------------------<br />
PVFS2 configured to build karma gui               :  no<br />
PVFS2 configured to use epoll                     : yes<br />
PVFS2 configured to perform coverage analysis     :  no<br />
PVFS2 configured for aio threaded callbacks       : yes<br />
PVFS2 configured for the 2.6.x kernel module      : yes<br />
PVFS2 configured for the 2.4.x kernel module      :  no<br />
PVFS2 configured for using the mmap-ra-cache      :  no<br />
PVFS2 configured for using trusted connections    :  no<br />
PVFS2 configured for a thread-safe client library : yes<br />
PVFS2 will use workaround for redhat 2.4 kernels  :  no<br />
PVFS2 will use workaround for buggy NPTL          :  no<br />
PVFS2 server will be built                        : yes<br />
<font color="Blue">###############################################</font><br />
<br />
Not sure if not being able to build any karma at all is a good thing or not, we will see ;)<br />
<br />
And then, you just need something like:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
make all kmod <br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
<br />
The next step is to recompile Open MPI with support for PVFS.<br />
<br />
Second surprise, Open MPI just blows up with the following information:<br />
<br />
<font color="Blue">###############################################</font><br />
io_romio_ad_pvfs2_open.c: In function 'fake_an_open':<br />
io_romio_ad_pvfs2_open.c:86: warning: passing argument 6 of 'PVFS_sys_create' from incompatible pointer type<br />
<b>io_romio_ad_pvfs2_open.c:86: error: too few arguments to function 'PVFS_sys_create'</b><br />
make[5]: *** [io_romio_ad_pvfs2_open.lo] Error 1<br />
make[5]: Leaving directory `/home/mbozzore/trunk/src/kits/platform_hpc/packages/openmpi-interconnects-gnu/openmpi-1.2.4/ompi/mca/io/romio/romio/adio/ad_pvfs2'<br />
make[4]: *** [all-recursive] Error 1<br />
<font color="Blue">###############################################</font><br />
<br />
Nice, the reference is here:<br />
<br />
<a href="http://www.open-mpi.org/community/lists/devel/2008/05/4071.php" target="_self">Open MPI Development Mailing List Archives</a><br />
<br />
So, just step back and try another version, like PVFS 2.6.3, and BerkeleyDB 4.6.21<br />
<br />
Note: I have a &quot;non standard&quot; config. My workstation is an OCS 5 installer and my nodes are VMs. The following is also exported (from installer to compute nodes), with the no_root_squash option : /home ; /VM/mbozzore<br />
<br />
Of course, if you go through the same process, you will find that PVFS 2.7.x and 2.6.x differ slightly: for example, pvfs2-genconfig needs two arguments in 2.6.3 but only one in 2.7.1.<br />
<br />
<br />
For the config part, I used:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[mbozzore@stakhanov pvfs-2.6.3]$ <b>./configure --prefix=/home/mbozzore/pvfs-2.6.3 --with-kernel=/usr/src/kernels/2.6.18-53.el5-x86_64/ --enable-shared --with-db=/home/mbozzore/db-4.6.21/</b><br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
And again:<br />
<br />
<font color="Blue">###############################################</font><br />
***** Displaying PVFS2 Configuration Information *****<br />
------------------------------------------------------<br />
PVFS2 configured to build karma gui               :  no<br />
PVFS2 configured to use epoll                     : yes<br />
PVFS2 configured to perform coverage analysis     :  no<br />
PVFS2 configured for aio threaded callbacks       : yes<br />
PVFS2 configured for the 2.6.x kernel module      : yes<br />
PVFS2 configured for the 2.4.x kernel module      :  no<br />
PVFS2 configured for using the mmap-ra-cache      :  no<br />
PVFS2 configured for using trusted connections    :  no<br />
PVFS2 configured for a thread-safe client library : yes<br />
PVFS2 will use workaround for redhat 2.4 kernels  :  no<br />
PVFS2 will use workaround for buggy NPTL          :  no<br />
PVFS2 server will be built                        : yes<br />
<font color="Blue">###############################################</font><br />
<br />
As usual:  <br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
make <br />
make kmod<br />
make install <br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Note that by default make install will install pvfs2-server in the sbin directory, but no client at all (actually you need 2 binaries for the client part).<br />
<br />
On the PVFS side, I'd like to use the following configuration:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
compute-00-00, compute-00-01 : IO nodes<br />
compute-00-02 : metadata server<br />
compute-00-03 : client only (compute-00-00, compute-00-01, compute-00-02 also clients)<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
<br />
In order to achieve that, I need to generate the config files, using the pvfs2-genconfig script; it can be used interactively:<br />
<br />
<font color="Blue">###############################################</font><br />
[root@stakhanov ~]# <b>/home/mbozzore/pvfs-2.6.3/bin/pvfs2-genconfig /etc/pvfs2/pvfs2-fs.conf /etc/pvfs2/pvfs2-server.conf</b><br />
**********************************************************************<br />
    Welcome to the PVFS2 Configuration Generator:<br />
<br />
This interactive script will generate configuration files suitable<br />
for use with a new PVFS2 file system.  Please see the PVFS2 quickstart<br />
guide for details.<br />
<br />
**********************************************************************<br />
You must first select the network protocol that your file system will use.<br />
The only currently supported options are &quot;tcp&quot;, &quot;gm&quot;, and &quot;ib&quot;.<br />
(For multi-homed configurations, use e.g. &quot;ib,tcp&quot;.)<br />
<br />
* Enter protocol type [Default is tcp]:<br />
<br />
Choose a TCP/IP port for the servers to listen on.  Note that this<br />
script assumes that all servers will use the same port number.<br />
<br />
* Enter port number [Default is 3334]:<br />
<br />
Choose a directory for each server to store data in.<br />
<br />
* Enter directory name: [Default is /pvfs2-storage-space]:<br />
<br />
Choose a file for each server to write log messages to.<br />
<br />
* Enter log file location [Default is /tmp/pvfs2-server.log]: /var/log/pvfs2-server.log<br />
<br />
Next you must list the hostnames of the machines that will act as<br />
I/O servers.  Acceptable syntax is &quot;node1, node2, ...&quot; or &quot;node{#-#,#,#}&quot;.<br />
<br />
* Enter hostnames [Default is localhost]: compute-00-00, compute-00-01<br />
<br />
Now list the hostnames of the machines that will act as Metadata<br />
servers.  This list may or may not overlap with the I/O server list.<br />
<br />
* Enter hostnames [Default is localhost]: compute-00-02<br />
<br />
Configured a total of 3 servers:<br />
2 of them are I/O servers.<br />
1 of them are Metadata servers.<br />
<br />
* Would you like to verify server list (y/n) [Default is n]? y<br />
<br />
****** I/O servers:<br />
compute-00-01<br />
compute-00-00<br />
<br />
****** Metadata servers:<br />
compute-00-02<br />
<br />
* Does this look ok (y/n) [Default is y]?<br />
<br />
Writing fs config file... Done.<br />
Writing 3 server config file(s)... Done.<br />
<font color="Blue">###############################################</font><br />
<br />
So, the good news is : multi-homed configs are supported:<br />
<br />
<font color="Blue">###############################################</font><br />
You must first select the network protocol that your file system will use.<br />
The only currently supported options are &quot;tcp&quot;, &quot;gm&quot;, and &quot;ib&quot;.<br />
<b>(For multi-homed configurations, use e.g. &quot;ib,tcp&quot;.)</b><br />
<font color="Blue">###############################################</font><br />
<br />
Looks great, will try it later (IB).<br />
<br />
The script generated a few files (you will not get the same number of files with PVFS 2.7.1).<br />
<br />
The files generated are:<br />
<br />
<font color="Blue">--------------------------------------------------------</font><br />
[root@stakhanov ~]# <b>ls /etc/pvfs2/</b><br />
pvfs2-fs.conf  pvfs2-server.conf-compute-00-00  pvfs2-server.conf-compute-00-01  pvfs2-server.conf-compute-00-02<br />
<font color="Blue">--------------------------------------------------------</font><br />
<br />
Basically, one main config file and one file per server (IO or Metadata)<br />
<br />
So, what's inside ?<br />
<br />
<br />
<font color="Blue">###############################################</font><br />
[root@stakhanov ~]# <b>cat /etc/pvfs2/pvfs2-fs.conf</b><br />
&lt;Defaults&gt;<br />
        UnexpectedRequests 50<br />
        EventLogging none<br />
        LogStamp datetime<br />
        BMIModules bmi_tcp<br />
        FlowModules flowproto_multiqueue<br />
        PerfUpdateInterval 1000<br />
        ServerJobBMITimeoutSecs 30<br />
        ServerJobFlowTimeoutSecs 30<br />
        ClientJobBMITimeoutSecs 300<br />
        ClientJobFlowTimeoutSecs 300<br />
        ClientRetryLimit 5<br />
        ClientRetryDelayMilliSecs 2000<br />
&lt;/Defaults&gt;<br />
<br />
&lt;Aliases&gt;<br />
        Alias compute-00-00 tcp://compute-00-00:3334<br />
        Alias compute-00-01 tcp://compute-00-01:3334<br />
        Alias compute-00-02 tcp://compute-00-02:3334<br />
&lt;/Aliases&gt;<br />
<br />
&lt;Filesystem&gt;<br />
        Name pvfs2-fs<br />
        ID 398972398<br />
        RootHandle 1048576<br />
        &lt;MetaHandleRanges&gt;<br />
                Range compute-00-02 4-1431655767<br />
        &lt;/MetaHandleRanges&gt;<br />
        &lt;DataHandleRanges&gt;<br />
                Range compute-00-00 1431655768-2863311531<br />
                Range compute-00-01 2863311532-4294967295<br />
        &lt;/DataHandleRanges&gt;<br />
        &lt;StorageHints&gt;<br />
                TroveSyncMeta yes<br />
                TroveSyncData no<br />
        &lt;/StorageHints&gt;<br />
&lt;/Filesystem&gt;<br />
[root@stakhanov ~]# <b>cat /etc/pvfs2/pvfs2-server.conf-compute-00-00</b><br />
StorageSpace /pvfs2-storage-space<br />
HostID &quot;tcp://compute-00-00:3334&quot;<br />
LogFile /var/log/pvfs2-server.log<br />
[root@stakhanov ~]# <b>cat /etc/pvfs2/pvfs2-server.conf-compute-00-01</b><br />
StorageSpace /pvfs2-storage-space<br />
HostID &quot;tcp://compute-00-01:3334&quot;<br />
LogFile /var/log/pvfs2-server.log<br />
[root@stakhanov ~]# <b>cat /etc/pvfs2/pvfs2-server.conf-compute-00-02</b><br />
StorageSpace /pvfs2-storage-space<br />
HostID &quot;tcp://compute-00-02:3334&quot;<br />
LogFile /var/log/pvfs2-server.log<br />
<font color="Blue">###############################################</font><br />
<br />
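Looking at the handle ranges in pvfs2-fs.conf, they look like an even three-way split of 4..4294967295 across the three servers. That is just an observation from the numbers above, not something checked against the PVFS source, and handle_ranges is a made-up helper; but the arithmetic does reproduce the three Range lines:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">
```shell
# Split the handle space first..last into n equal, contiguous ranges.
# $1 = first handle, $2 = last handle, $3 = number of servers
handle_ranges() {
    first=$1
    last=$2
    n=$3
    size=$(( (last - first + 1) / n ))
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(( first + i * size ))
        end=$(( start + size - 1 ))
        echo "$start-$end"
        i=$(( i + 1 ))
    done
}

# One metadata server plus two I/O servers, as in the config above.
handle_ranges 4 4294967295 3
```
</pre>
</div>
<br />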
Wow, I just got the following when trying to get a preview of my post:<br />
<br />
The following errors occurred with your submission:  <br />
The text that you have entered is too long (20231 characters). Please shorten it to 20000 characters long.<br />
<br />
I knew it: no karma at all ;)<br />
<br />
Mehdi Bozzo-Rey</blockquote>

 ]]></content:encoded>
			<dc:creator>mehdi</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/mehdi/pvfs-version-2-first-try-85/</guid>
		</item>
		<item>
			<title>LAVA, Open MPI, Infiniband (OFED) and ... RLIMIT_MEMLOCK</title>
			<link>http://www.hpccommunity.org/blogs/mehdi/lava-open-mpi-infiniband-ofed-rlimit_memlock-86/</link>
			<pubDate>Thu, 21 Aug 2008 04:58:42 GMT</pubDate>
			<description>When submitting an openmpi job through lava on a cluster that is IB enabled (OFED), you will probably see this kind of error: 
...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">When submitting an openmpi job through lava on a cluster that is IB enabled (OFED), you will probably see this kind of error:<br />
<br />
<font color="Blue">##################################################</font><br />
[compute-0-0.local:07337] mca_mpool_openib_register: ibv_reg_mr(0x1711000,528384) <b>failed with error: Cannot allocate memory</b><br />
[compute-0-0.local:07337] mca_mpool_openib_register: ibv_reg_mr(0x1711000,528384) failed with error: Cannot allocate memory<br />
[0,1,9][btl_openib.c:808:mca_btl_openib_create_cq_srq] <b>error creating low priority cq for mthca0 errno says Cannot allocate memory</b><br />
--------------------------------------------------------------------------<br />
It looks like MPI_INIT failed for some reason; your parallel process is<br />
likely to abort.  There are many reasons that a parallel process can<br />
fail during MPI_INIT; some of which are due to configuration or environment<br />
problems.  This failure appears to be an internal failure; here's some<br />
additional information (which may only be relevant to an Open MPI<br />
developer):<br />
<br />
  PML add procs failed<br />
  --&gt; Returned &quot;Error&quot; (-1) instead of &quot;Success&quot; (0)<br />
--------------------------------------------------------------------------<br />
*** An error occurred in MPI_Init<br />
*** before MPI was initialized<br />
*** MPI_ERRORS_ARE_FATAL (goodbye)<br />
[0,1,0][btl_openib.c:808:mca_btl_openib_create_cq_srq] error creating low priority cq for mthca0 errno says Cannot allocate memory<br />
<font color="Blue">##################################################</font><br />
<br />
<br />
<br />
This one is a little bit more explicit:<br />
<br />
<br />
<font color="Blue">##################################################</font><br />
Your job looked like:<br />
<br />
------------------------------------------------------------<br />
# LSBATCH: User input<br />
openmpi-mpirun -np 8 ./hello<br />
------------------------------------------------------------<br />
<br />
Exited with exit code 143.<br />
<br />
Resource usage summary:<br />
<br />
    CPU time   :      0.08 sec.<br />
<br />
The output (if any) follows:<br />
<br />
<b>libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.<br />
    This will severely limit memory registrations.</b><br />
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.<br />
    This will severely limit memory registrations.<br />
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.<br />
    This will severely limit memory registrations.<br />
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.<br />
    This will severely limit memory registrations.<br />
--------------------------------------------------------------------------<br />
<b>The OpenIB BTL failed to initialize while trying to allocate some<br />
locked memory.  This typically can indicate that the memlock limits<br />
are set too low.  For most HPC installations, the memlock limits<br />
should be set to &quot;unlimited&quot;. </b> The failure occured here:<br />
<br />
    Host:          compute-00-00<br />
    OMPI source:   btl_openib.c:828<br />
    Function:      ibv_create_cq()<br />
    Device:        mthca0<br />
    Memlock limit: 32768<br />
<br />
<b>You may need to consult with your system administrator to get this<br />
problem fixed.  This FAQ entry on the Open MPI web site may also be<br />
helpful:<br />
<br />
    <a href="http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages" target="_self">FAQ: Tuning the run-time characterisitics of MPI OpenFabrics communications (InfiniBand and iWARP)</a></b><br />
--------------------------------------------------------------------------<br />
It looks like MPI_INIT failed for some reason; your parallel process is<br />
likely to abort.  There are many reasons that a parallel process can<br />
fail during MPI_INIT; some of which are due to configuration or environment<br />
problems.  This failure appears to be an internal failure; here's some<br />
additional information (which may only be relevant to an Open MPI<br />
developer):<br />
<br />
  PML add procs failed<br />
  --&gt; Returned &quot;Error&quot; (-1) instead of &quot;Success&quot; (0)<br />
--------------------------------------------------------------------------<br />
*** An error occurred in MPI_Init<br />
*** before MPI was initialized<br />
*** MPI_ERRORS_ARE_FATAL (goodbye)<br />
--------------------------------------------------------------------------<br />
The OpenIB BTL failed to initialize while trying to allocate some<br />
locked memory.  This typically can indicate that the memlock limits<br />
are set too low.  For most HPC installations, the memlock limits<br />
should be set to &quot;unlimited&quot;.  The failure occured here:<br />
<font color="Blue">##################################################</font><br />
<br />
<br />
<br />
Looks easy to fix, from the Open MPI FAQ.<br />
<br />
Let's check what the limits are on the nodes:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@tyan04 basic]$ <b>ssh -x compute-00-00</b><br />
Last login: Mon Aug 11 08:03:13 2008 from tyan04.ocs5.org<br />
[mbozzore@compute-00-00 ~]$ <b>ulimit -a</b><br />
core file size          (blocks, -c) 0<br />
data seg size           (kbytes, -d) unlimited<br />
scheduling priority             (-e) 0<br />
file size               (blocks, -f) unlimited<br />
pending signals                 (-i) 8184<br />
<b><font color="Red">max locked memory       (kbytes, -l) 1026028</font></b><br />
max memory size         (kbytes, -m) unlimited<br />
open files                      (-n) 1024<br />
pipe size            (512 bytes, -p) 8<br />
POSIX message queues     (bytes, -q) 819200<br />
real-time priority              (-r) 0<br />
stack size              (kbytes, -s) 10240<br />
cpu time               (seconds, -t) unlimited<br />
max user processes              (-u) 8184<br />
virtual memory          (kbytes, -v) unlimited<br />
file locks                      (-x) unlimited<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
Looks like this is not the problem, so the next step is to start the same job on the same nodes, but outside of lava:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@tyan04 basic]$ <b>mpirun -np 8 --machinefile ./hosts --prefix $MPIHOME ./hello</b><br />
Hello, world, I am 0 of 8<br />
Hello, world, I am 1 of 8<br />
Hello, world, I am 2 of 8<br />
Hello, world, I am 5 of 8<br />
Hello, world, I am 3 of 8<br />
Hello, world, I am 4 of 8<br />
Hello, world, I am 7 of 8<br />
Hello, world, I am 6 of 8<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
Hmmm ... does not look good; what did I miss ?<br />
<br />
I can ... try to force the use of IB, outside of LAVA:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@tyan04 basic]$ <b>mpirun -np 8 --machinefile ./hosts --prefix $MPIHOME --mca btl openib,self ./hello</b><br />
Hello, world, I am 0 of 8<br />
Hello, world, I am 1 of 8<br />
Hello, world, I am 2 of 8<br />
Hello, world, I am 7 of 8<br />
Hello, world, I am 4 of 8<br />
Hello, world, I am 3 of 8<br />
Hello, world, I am 6 of 8<br />
Hello, world, I am 5 of 8<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
And force the use of tcp when running under lava:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@tyan04 basic]$ <b>bsub -o%J.out -n 8 openmpi-mpirun -np 8 --mca btl tcp,self ./hello</b><br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
And the job output will be something like:<br />
<br />
<br />
<font color="Blue">##################################################</font><br />
Your job looked like:<br />
<br />
------------------------------------------------------------<br />
# LSBATCH: User input<br />
openmpi-mpirun -np 8 --mca btl tcp,self ./hello<br />
------------------------------------------------------------<br />
<br />
Successfully completed.<br />
<br />
Resource usage summary:<br />
<br />
    CPU time   :      0.08 sec.<br />
<br />
The output (if any) follows:<br />
<br />
Hello, world, I am 0 of 8<br />
Hello, world, I am 1 of 8<br />
Hello, world, I am 2 of 8<br />
Hello, world, I am 3 of 8<br />
Hello, world, I am 4 of 8<br />
Hello, world, I am 5 of 8<br />
Hello, world, I am 6 of 8<br />
Hello, world, I am 7 of 8<br />
<br />
<font color="Blue">##################################################</font><br />
<br />
<br />
Well, this looks very strange ... let's try to check the limits again, but this time through lava:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@tyan04 basic]$ <b>bsub -Ip -m compute-00-00 bash</b><br />
Job &lt;626&gt; is submitted to default queue &lt;normal&gt;.<br />
&lt;&lt;Waiting for dispatch ...&gt;&gt;<br />
&lt;&lt;Starting on compute-00-00&gt;&gt;<br />
[mbozzore@compute-00-00 basic]$ ulimit -a<br />
core file size          (blocks, -c) unlimited<br />
data seg size           (kbytes, -d) unlimited<br />
scheduling priority             (-e) 0<br />
file size               (blocks, -f) unlimited<br />
pending signals                 (-i) 8184<br />
<b><font color="Red">max locked memory       (kbytes, -l) 32</font></b><br />
max memory size         (kbytes, -m) unlimited<br />
open files                      (-n) 1024<br />
pipe size            (512 bytes, -p) 8<br />
POSIX message queues     (bytes, -q) 819200<br />
real-time priority              (-r) 0<br />
stack size              (kbytes, -s) unlimited<br />
cpu time               (seconds, -t) unlimited<br />
max user processes              (-u) 8184<br />
virtual memory          (kbytes, -v) unlimited<br />
file locks                      (-x) unlimited<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
Am I going crazy ???? The same node, but different limits inside/outside of LAVA ...<br />
<br />
Actually, no (at least not yet) and getting different values for the limits (through LAVA / ssh shell) is absolutely normal. The answer is in the init scripts / default limits at boot / init time. <br />
<br />
For the limits, Linux provides several resource limits and one of them is RLIMIT_MEMLOCK (maximum number of bytes of memory a process can lock into memory via mlock(), mlockall() or shmctl()). The default soft and hard limits for RLIMIT_MEMLOCK are 8 pages.<br />
<br />
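As a quick sanity check of that figure: 8 pages of 4096 bytes is 32768 bytes, i.e. the value in the libibverbs warning and the 32 kbytes reported by ulimit -l under LAVA. The sketch below assumes getconf is available on the node; pages_to_bytes is a made-up helper name:<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code">
```shell
# Convert a number of pages into bytes for a given page size.
# $1 = number of pages, $2 = page size in bytes
pages_to_bytes() {
    echo $(( $1 * $2 ))
}

page_size=$(getconf PAGESIZE)
echo "page size on this host: $page_size bytes"
echo "8 pages on this host:   $(pages_to_bytes 8 "$page_size") bytes"
# With 4096-byte pages this gives the 32768 bytes from the libibverbs warning.
echo "8 pages of 4096 bytes:  $(pages_to_bytes 8 4096) bytes"
# ulimit -l reports the soft limit in kbytes (or "unlimited").
echo "max locked memory now:  $(ulimit -l)"
```
</pre>
</div>
<br />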
Let's check in the source code (these headers ship with glibc rather than with the Linux kernel). For example, the <i>sys/resource.h</i> header file includes <i>bits/resource.h</i> and from this header file : <br />
<br />
<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:372px;">/* Transmute defines to enumerations.  The macro re-definitions are
   necessary because some programs want to test for operating system
   features with #ifdef RUSAGE_SELF.  In ISO C the reflexive
   definition is a no-op.  */
 
/* Kinds of resource limit.  */
enum __rlimit_resource
{
  /* Per-process CPU limit, in seconds.  */
  RLIMIT_CPU = 0,
#define RLIMIT_CPU RLIMIT_CPU
 
  /* Largest file that can be created, in bytes.  */
  RLIMIT_FSIZE = 1,
#define RLIMIT_FSIZE RLIMIT_FSIZE
 
  /* Maximum size of data segment, in bytes.  */
  RLIMIT_DATA = 2,
#define RLIMIT_DATA RLIMIT_DATA
 
  /* Maximum size of stack segment, in bytes.  */
  RLIMIT_STACK = 3,
#define RLIMIT_STACK RLIMIT_STACK
 
  /* Largest core file that can be created, in bytes.  */
  RLIMIT_CORE = 4,
#define RLIMIT_CORE RLIMIT_CORE
 
  /* Largest resident set size, in bytes.
     This affects swapping; processes that are exceeding their
     resident set size will be more likely to have physical memory
     taken from them.  */
  __RLIMIT_RSS = 5,
#define RLIMIT_RSS __RLIMIT_RSS
 
  /* Number of open files.  */
  RLIMIT_NOFILE = 7,
  __RLIMIT_OFILE = RLIMIT_NOFILE, /* BSD name for same.  */
#define RLIMIT_NOFILE RLIMIT_NOFILE
#define RLIMIT_OFILE __RLIMIT_OFILE
 
  /* Address space limit.  */
  RLIMIT_AS = 9,
#define RLIMIT_AS RLIMIT_AS
 
  /* Number of processes.  */
  __RLIMIT_NPROC = 6,
#define RLIMIT_NPROC __RLIMIT_NPROC
 
 <b> /* Locked-in-memory address space.  */
  __RLIMIT_MEMLOCK = 8,
#define RLIMIT_MEMLOCK __RLIMIT_MEMLOCK</b>
 
  /* Maximum number of file locks.  */
  __RLIMIT_LOCKS = 10,
#define RLIMIT_LOCKS __RLIMIT_LOCKS
 
  /* Maximum number of pending signals.  */
  __RLIMIT_SIGPENDING = 11,
#define RLIMIT_SIGPENDING __RLIMIT_SIGPENDING</pre>
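</div> Hmmmm ... 8 ... 8 what ? Careful, this 8 is only the index of the resource in the enum, not its value. It just so happens that the default limit is also 8 ... 8 pages. Let's check the page size then.<br />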
<br />
I love man pages : <b>man getpagesize</b><br />
<br />
So, just create test.c :<br />
<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:132px;">#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;

int main(void)
{
    /* getpagesize() returns an int, so print it with %d (not %ld) */
    int page_size = getpagesize();
    printf(&quot;page size=%d\n&quot;, page_size);
    return 0;
}</pre>
</div> And then <b>gcc test.c; ./a.out </b>:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[root@stakhanov conf]# ./a.out<br />
page size=4096<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
Cool, this is also consistent with what you can find here: <i>/usr/include/linux/resource.h</i><br />
<br />
<br />
<br />
<br />
<div class="bbcode_container">
	<div class="bbcode_description">Code:</div>
	<pre class="bbcode_code" style="height:372px;">#ifndef _LINUX_RESOURCE_H
#define _LINUX_RESOURCE_H
 
#include &lt;linux/time.h&gt;
 
/*
 * Resource control/accounting header file for linux
 */
 
/*
 * Definition of struct rusage taken from BSD 4.3 Reno
 *
 * We don't support all of these yet, but we might as well have them....
 * Otherwise, each time we add new items, programs which depend on this
 * structure will lose.  This reduces the chances of that happening.
 */
...
...
...
<b>/*
 * GPG wants 32kB of mlocked memory, to make sure pass phrases
 * and other sensitive information are never written to disk.
 */
#define MLOCK_LIMIT     (8 * PAGE_SIZE)</b>
 
/*
 * Due to binary compatibility, the actual resource numbers
 * may be different for different linux versions..
 */</pre>
</div> Nice, but still, I am getting different limits inside and outside of lava ... plus the fact that I get the <b>default limit _only_ inside lava</b>.<br />
<br />
Ok, what can change these limits ? <br />
<br />
Well, ulimit ... and you can set up limits many different ways:<br />
<br />
<i>/etc/profile</i> for example sets up the following:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
# No core files by default<br />
ulimit -S -c 0 &gt; /dev/null 2&gt;&amp;1<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
There is also some setup done in <i>/etc/security/limits.conf</i><br />
<br />
<br />
Another very interesting thing, and the key here is that: <br />
<br />
<ul><li>Any process is free to set a soft limit to any value from 0 to the hard limit, or to decrease a hard limit. <b><font color="Red">Children will inherit these updated limits during a fork.</font></b></li>
<li>A privileged process is free to set a hard limit to any value. <font color="red"><b>Children will inherit these updated limits during a fork.</b></font></li>
</ul> <br />
This is exactly the problem. Long story short: the lava sbatchd will fork children; your mpi instance is &quot;under&quot; it and will inherit the sbatchd memlock limit. <br />
<br />
So :<ul><li>the init script (runlevel 3) will start the LAVA daemons with the default system limits (32k for memlock)</li>
<li>these daemons will fork children; your job is a forked process so it will inherit the sbatchd memlock limit</li>
</ul> <br />
Let's check this out: just submit a sleep job and check on the node what is going on:<br />
<br />
<font color="Blue">---------------------------------------------------------</font><br />
[mbozzore@stakhanov ~]$ <b>bsub sleep 3000</b><br />
Job &lt;212&gt; is submitted to default queue &lt;normal&gt;.<br />
<br />
[mbozzore@stakhanov ~]$ <b>bjobs -w</b><br />
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME<br />
212     mbozzore RUN   normal     stakhanov   <b>compute-00-00</b> sleep 3000 Aug 21 00:41<br />
<br />
[mbozzore@stakhanov ~]$ <b>ssh -x compute-00-00</b><br />
Last login: Tue Aug 26 19:43:54 2008 from stakhanov.ocs5<br />
<br />
[mbozzore@compute-00-00 ~]$ <b>ps -ef | grep sleep</b><br />
mbozzore <b>14090 14089  </b>0 19:49 ?        00:00:00 sleep 3000<br />
mbozzore 14129 14094  0 19:49 pts/3    00:00:00 grep sleep<br />
<br />
[mbozzore@compute-00-00 ~]$ <b>ps -opid,ppid,comm,args 14089</b>  <br />
PID  PPID COMMAND         COMMAND<br />
<b>14089 14088</b> 1219293702.212  /bin/sh /home/mbozzore/.lsbatch/1219293702.212<br />
<br />
[mbozzore@compute-00-00 ~]$ <b>ps -opid,ppid,comm,args 14088</b><br />
PID  PPID COMMAND         COMMAND<br />
<b>14088 30427</b> res             /usr/sbin/res -d /etc/lava/conf -m stakhanov /home/mbozzore/.lsbatch/1219293702.212<br />
<br />
[mbozzore@compute-00-00 ~]$ <b>ps -opid,ppid,comm,args 30427</b><br />
PID  PPID COMMAND         COMMAND<br />
<b>30427     1</b> sbatchd         /usr/sbin/sbatchd<br />
[mbozzore@compute-00-00 ~]$<br />
<font color="Blue">---------------------------------------------------------</font><br />
<br />
<ul><li>if you just ssh to one compute node, you will see that the output of ulimit for memlock is set to unlimited</li>
<li>if you submit an interactive job (bsub -Ip bash for example), then the output of ulimit -l will be 32k, even if the default shell limit for memlock is unlimited. This is just because the original sbatchd had a 32k limit for memlock</li>
</ul> <br />
And of course, this is also why restarting lava by hand solves the problem (full service stop / start): when you log on to a node (ssh), the shell you get has an unlimited default memlock limit, so the daemons started from this shell inherit that limit. <br />
<br />
The key is to modify the lava init script (just insert a ulimit before starting the daemons).<br />
<br />
For reference, I found a lot of useful information reading this book:<br />
<br />
Linux System Programming <br />
by Robert Love <br />
Publisher: O'Reilly <br />
Pub Date: September 15, 2007 <br />
Print ISBN-10: 0-596-00958-5 <br />
Print ISBN-13: 978-0-59-600958-8<br />
 <br />
It is available through Safari Books Online (<a href="http://safari.oreilly.com/0596009585/resource_limits" target="_self">O'Reilly - Safari Books Online - 0596009585 - Linux System Programming, 1st Edition</a>)<br />
<br />
<br />
Mehdi Bozzo-Rey</blockquote>

 ]]></content:encoded>
			<dc:creator>mehdi</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/mehdi/lava-open-mpi-infiniband-ofed-rlimit_memlock-86/</guid>
		</item>
		<item>
			<title>Use of Server Virtualization in HPC Environments</title>
			<link>http://www.hpccommunity.org/blogs/khalid/use-server-virtualization-hpc-environments-84/</link>
			<pubDate>Fri, 15 Aug 2008 17:12:00 GMT</pubDate>
			<description>In this blog, I will outline characteristics of HPC environments and how server virtualization technologies can address some issues, along with some...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">In this blog, I will outline characteristics of HPC environments and how server virtualization technologies can address some issues, along with some of what I see might be challenges in using these technologies. This blog assumes you have some familiarity with the basics of server virtualization. For a quick overview see <a href="http://en.wikipedia.org/wiki/Hypervisor" target="_self">Hypervisor - Wikipedia, the free encyclopedia</a>.<br />
<br />
The typical HPC environment today is characterized by “cluster silos” in which different departments or application groups set up and configure workload clusters to suit their needs. Because different applications require different software stacks including OS and system configuration, job schedulers, or middleware like MPI or PVM, each environment is relatively unique. Some larger organizations have managed to standardize different applications on one stack and consolidated multiple clusters into a single large central Grid managed by IT. However, for the majority of organizations where a more de-centralized structure is the norm, there is concern about relatively low utilization of individual cluster silos and the management overhead associated with maintaining different physical infrastructures for each application or department.<br />
<br />
One of the hot trends in the broader Enterprise IT landscape has been the rise of server virtualization technologies. While server virtualization has existed in the mainframe and most UNIX environments for a while, companies like VMware blazed the trail on the commodity X86 hardware platforms. The ability to carve up a physical machine into multiple logical machines in a manner transparent to the OS or application enables the consolidation of lightly used enterprise servers for mail, print, web, etc. onto fewer physical machines, thus reducing the costs of physical servers as well as power, cooling and management costs. <br />
<br />
For a long time VMware was the sole viable option for server virtualization on X86 hardware, but now there are a number of alternative choices including Microsoft Hyper-V, Xen, and KVM, recently endorsed by RedHat. This trend towards the commoditization of basic server virtualization functionality can be seen in the lower costs associated with this technology, which have gone down from thousands of dollars per node to free or nearly free. Obviously a high cost model does not work when deployed on the large scale distributed environments typical of HPC. <br />
<br />
With the availability of commodity server virtualization software, often baked into the OS,  here are some of the possible applications of it within HPC environments:<br />
<br />
<b>Checkpoint/Restart/Migration:</b> It has always been a challenge to deal with long-running jobs that need to be periodically checkpointed to avoid losing work in the event of failures. Typically this relied on the application to preserve its state or on the use of expensive hardware where the OS supported process-level checkpoint/restart. Today, most hypervisors on X86 have the ability to take a snapshot of the memory and disk state of a running VM, park it to disk and later restart it on another physical machine. This provides a clean mechanism that is non-intrusive to the application. Some hypervisors even provide the ability to do live migration, meaning that the VM is moved to another physical host without even impacting the network connections, which would be useful in MPI-style applications.<br />
<br />
<b>Dynamic Provisioning &amp; Sharing:</b> It is relatively easy to capture all the application environment and settings into a VM image which can be started in a few minutes or even seconds. This allows for dynamically creating an environment or rapidly adding resources to scale out the cluster as the workload demand increases or shutting down machines when it drops. Theoretically each long running job can be encapsulated in its own VM or a set of VMs created for a parallel job and then shut down when no longer needed. Machines become disposable in a virtual environment because all the important settings for an application are captured in the image which is maintained on disk.<br />
<br />
<b>Application Isolation &amp; Security:</b> Another advantage of VMs is that they can isolate applications belonging to different departments or potentially different organizations. Running jobs from different organizations on the same physical machine can lead to issues of data privacy, and reliability if one misbehaving application causes the OS to <br />
<br />
<b>Dev/Test Clusters:</b> When new applications are being developed or a new version of a commercial application is being tested, it is necessary to set up a new cluster for short periods of time. Rather than physically setting up a separate cluster, server virtualization makes it possible to create an entire virtual cluster for developers/QA and to shut it down when it is no longer used, freeing up resources for production use.<br />
<br />
<b>Power Management/ Green IT:</b> In a virtualized environment VMs can be migrated amongst physical machines. If several VMs are not fully utilizing the physical capacity, then they can be migrated onto a smaller set of physical hosts to improve utilization and the freed physical machines can be powered off, thereby saving power and cooling costs. <br />
<br />
While there are many potential uses and benefits to server virtualization technology in HPC, there are also some challenges:<br />
<br />
<b>Scalable Storage backend:</b> Given that machines are now transformed into files on disk, this places more strain on the storage system backing a compute farm. Some features, like live migration, work best if there is a centralized shared storage infrastructure. The cost of scaling up the storage backend across 100s or 1000s of nodes may outweigh the cost savings on the compute hardware. One option is to make use of local disks to store images and switch between them. But then, the issue of updating or patching local copies of an image will arise. Some sort of image distribution mechanism will be required.<br />
<br />
<b>IO overhead of VM:</b> VM technologies still tend to have some overhead compared to running on raw physical hardware. This is especially true for the case of I/O intensive or latency sensitive applications. One approach might be to run only one VM per physical machine to give maximum access to disk and network drivers to the single application while still taking advantage of the flexibility of dynamic provisioning with VMs. Hypervisor support for specialized communication transports such as Infiniband/Quadrics/Myrinet is another related issue. <br />
<br />
<b>VM Management:</b> Management of VM environments becomes another challenge because now you are no longer dealing with just the physical boxes and a single OS instance running on them, but potentially several OS instances. Patching, updating images, monitoring the OS instances, troubleshooting and diagnostics, and policy-based resource allocation all become further complicated in a VM environment. Without an appropriate set of tools and procedures in place, this could obviate the benefits. The hypervisor vendors provide their own tools to address these issues in the context of their own technology. Tools such as Platform VMO are attempting to address the heterogeneous management challenge, but there is still scope for improvement and for dealing with the unique requirements of HPC environments.<br />
<br />
<b>Application Performance benchmarking: </b> Given that server virtualization has not been that widely adopted within HPC yet, there aren’t that many ISVs or providers of libraries and tools like LINPACK that have done benchmarking of their software on hypervisors. This is a chicken-and-egg type of scenario where users have to push the vendors in order for the vendors to see the market demand.<br />
<br />
So, like any technology, one has to weigh the costs and benefits of introducing it into an existing environment. Server virtualization will be broadly used within the generic IT landscape over the coming years, so it would be prudent of HPC users to take a look and see how it can help. I would be interested in hearing people's thoughts on whether you are looking at server virtualization, which technologies you are considering and which use cases you are targeting.</blockquote>

 ]]></content:encoded>
			<dc:creator>Khalid</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/khalid/use-server-virtualization-hpc-environments-84/</guid>
		</item>
		<item>
			<title>Exploring HPC Programming: Where to start</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-where-start-83/</link>
			<pubDate>Mon, 11 Aug 2008 12:40:24 GMT</pubDate>
			<description>One of the topics I want to cover here, is HPC programming. That includes many things, so I want to look at such things as threading, toolkits such...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">One of the topics I want to cover here is HPC programming. That includes many things, so I want to look at such things as threading, toolkits such as OpenMP, graphics processor (GPU) toolkits, and cluster kits, such as MPI, as well as others that crop up from time to time. Learning how to use these toolkits can range from simple to complex. Getting the most out of each toolkit is an exercise for the reader. I will cover some basics about using each one, and run some performance tests to compare the different toolkits. This will more or less be an introduction, but I'm hoping that it will fuel some ideas in whoever reads this, and that you will be able to apply some of the simple concepts to your own programs.<br />
<br />
Where to start is always the hard question. I'm going to start with a compute intensive program that does Option Pricing. Lots of floating point calculations, lots of experiments to see what the best price might be over the years, and these calculations need to be run over many option positions. There are differences between this program and real option pricing programs. This program processes a number of experiments, and the experiments are set using random number generation, with some approximation calculations. Real option pricing programs, I believe, will use a Stochastic distribution calculation to seed the experiments, so my program is not going to be accurate. That means you should not use this program to try to price any options (that's the warning). This program is for testing and performance testing only.<br />
<br />
The program performs 1 million experiments (option price calculations), and also performs this 1024 times, to simulate a portfolio of this many options that need to be priced. As I am going to explore parallelism in future posts, I decided to arrange the data into arrays. Thinking about parallelism has everything to do with the data, and how you handle it. I hope the arrays I've used will be adaptable to all the toolkits that will be explored, but time will tell. The program also only prints minimal data out at the end. I only print the last experiment, just to show that the calculations are actually done. A real pricing program would probably spit out all calculations for all experiments, or analyze and print out the optimal prices and maturity, and it would do this for all option positions in the portfolio. But I'm more interested in execution time.<br />
<br />
The baseline program was run, and the results were collected, on an Intel Core2 Quad (Q6600) processor running at 2.4 GHz. The machine has 4 GB of RAM, a 250 GB hard drive, and a graphics processor. The operating system is CentOS 5.2 with all the updates applied. Not a bad little machine, lots of horsepower, you say! As the baseline program is only a single thread of execution, it runs for a long time on this machine, as the performance run will show. One core of the quad processor is pegged at 100%, while the other 3 sit idly around doing a little housekeeping here and there. The program gets the job done, but not very efficiently.<br />
<br />
So let's begin. Attached is a zip file containing the program and the Makefile.<br />
<br />
<a href="http://www.hpccommunity.org/attachment.php?attachmentid=7" >Attachment 7</a> <br />
 <br />
<br />
The test run was done using the Linux &quot;time&quot; command to time the execution. I'm more interested in the overall program execution time, as opposed to some benchmarks I've seen that only time inner calculations. I mean, when you're waiting for results it's the whole program that counts. The output below shows the timed execution on the Core2 Quad 2.4 GHz processor.<br />
<br />
[leo@compute70 Baseline]$ time ./optionprice<br />
=== Option Portfolio Calculations (Basline Test) ==========<br />
Portfolio size                         : 1024<br />
Experiments run per item   : 1000000<br />
Average Call Price               : 36.868640<br />
Average Put  Price               : 2.147528<br />
<br />
real    18m49.352s<br />
user    18m49.075s<br />
sys    0m0.090s<br />
[leo@compute70 Baseline]$ <br />
<br />
<br />
Here is a picture of &quot;top&quot; running to show that the execution of the program only exercises 1 core of the cpu.<br />
<br />
<a href="http://www.hpccommunity.org/attachments/f13/186d1218470894-integrating-symphony-de-matlab-parallel-computing-toolbox-baseline.jpg" id="attachment186" rel="Lightbox_83" ><img src="http://www.hpccommunity.org/attachments/f13/186d1218470894t-integrating-symphony-de-matlab-parallel-computing-toolbox-baseline.jpg" border="0" alt="baseline.jpg" class="thumbnail" /></a> <br />
 <br />
<br />
Well there you go, almost 19 minutes to perform all the calculations. You might say that's not so bad, I can go have a coffee while that's going on. But suppose you are the person responsible for managing this portfolio, and you have to make pricing decisions quickly. I doubt saying &quot;come back in 20, while I figure this out&quot;, is acceptable. And this is only a small portfolio, with minimal experiments to determine optimal pricing.<br />
<br />
Over the next few posts, I'll look at how to improve this execution time. Sure, I can optimize the program and use compiler optimizations to hopefully speed it up, but I want bigger speedups than that.<br />
<br />
Talk to you soon.<br />
<br />
<br />
Leo Stutzmann</blockquote>


 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/exploring-hpc-programming-where-start-83/</guid>
		</item>
		<item>
			<title>Using Memory Mapped Files</title>
			<link>http://www.hpccommunity.org/blogs/ajith/using-memory-mapped-files-82/</link>
			<pubDate>Tue, 22 Jul 2008 20:39:12 GMT</pubDate>
			<description>Use Case: I recently ran into a situation in which I needed my Symphony services to share some intermediate data running on the same host....</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore"><u><b>Use Case</b></u>   <br />
<br />
  I recently ran into a situation in which I needed my Symphony services running on the same host to share some intermediate data. There was a large amount of data and I didn’t want each service instance to have its own private copy. Having private copies would use up all the free memory on the host. I also wanted all processes to both read and write to this memory. I had heard about memory mapped files but never had a chance to use them. This seemed like a good opportunity to try them out.<br />
   <br />
  <u><b>What are Memory Mapped Files?</b></u><br />
   <br />
  A memory mapped file is just a file that’s mapped into virtual memory (the process’s private window into physical memory). To create a memory mapped file, first create the file, then map all or part of the file using C API functions like UNIX’s mmap() and MS Windows’ CreateFileMapping(). mmap() will map the file into virtual memory. This just means that it reserves a block of addresses in virtual memory corresponding to the file's size. No physical memory is allocated until the virtual memory is read or written. Memory mapped files can be used to access files randomly using memory operations or dereferencing pointers. <br />
   <br />
  The operating system typically gives the process a handle to virtual memory that is backed by the kernel’s file cache. The physical pages therefore live in the cache and are shared between processes instead of being duplicated in each one, although the mapped view itself still occupies address space in the process (which matters for a 32-bit process with a 2 GB user address space). If the file is unmapped from virtual space in one process, it is still cached in physical memory for other processes until something else causes the modified memory to get evicted from the cache and copied back into the file. This works in a similar way to the paging file. The only drawback to this lazy file write is if the host crashes: the contents of memory that have not been flushed to disk explicitly are lost. <br />
   <br />
  The operating system will read a file into physical memory in 4 KB chunks (the typical page size). This reading is driven by the process's access to memory that is not yet paged in. If virtual memory is read or written that doesn't have physical memory backing it, a page fault occurs, a physical page is assigned and a 4 KB part of the file is read into this location.<br />
<br />
Improved performance is achieved through deferred reading of the file and lazy writing of the file. Less physical memory is required if the process only needs to access a few 4 KB chunks in the file. As well, files mapped into memory can be shared by other processes running on the same host. <br />
   <br />
  <u><b>Using Memory Mapped Files</b></u><br />
   <br />
  There are 3 main models for using memory mapped files for memory sharing:<br />
  <ol class="decimal"><li>Create one file and grow the      file when you need to add data</li>
<li>Create a large file and add      data until there’s not enough space left</li>
<li>Create one file for each data      entry</li>
</ol>   The benefit of option 1 is that there is only one file (unless the file size exceeds the available virtual memory). The problem is that other processes that map the memory will need special code to extend the mapping, which may cause the whole virtual memory block to be relocated, invalidating all current references into the original memory.<br />
   <br />
  The benefit of option 2 is that the virtual memory block is a fixed size and won’t get relocated. So there are no worries about invalidating current references. The drawback is that a fixed buffer is not optimal, as we never know how much memory we need in advance.<br />
   <br />
  Option 3 is the most flexible and memory efficient. We only map the file memory into virtual space when we need it to read or write. The only drawback is that we end up with one backing file for every buffer we create. I chose option 3.<br />
   <br />
  Once memory is shared, we run into synchronization issues. Any process requiring write access will have to hold a process mutex that blocks other reads and writes. <br />
   <br />
  I’ll get into the coding details of the implementation in my next blog.<br />
   <br />
  <u><b>Summary</b></u><br />
   <br />
  Memory mapped files can optimize file access by using lazy read and write operations. File access through memory operations is faster than doing kernel calls to the file system. The pages of a mapped file are held in the kernel’s file cache and shared between processes, but the mapped view still takes up address space in each process, so a 32-bit process cannot map more than its 2 GB user limit at once. Performance won’t be very good if the process needs to access memory sequentially, as each read or write of a new 4 KB memory page will cause a fault and a file access.<br />
   <br />
  Memory mapped files are useful for sharing memory between processes. Each process will map the same view of a file and can immediately access data written by another process.<br />
   <br />
<u><b>   References</b></u><br />
   <br />
  <a href="http://en.wikipedia.org/wiki/Memory-mapped_file" target="_self">Memory-mapped file - Wikipedia, the free encyclopedia</a></blockquote>

 ]]></content:encoded>
			<dc:creator>Ajith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/ajith/using-memory-mapped-files-82/</guid>
		</item>
		<item>
			<title>Symphony Articles</title>
			<link>http://www.hpccommunity.org/blogs/ajith/symphony-articles-81/</link>
			<pubDate>Fri, 18 Jul 2008 17:31:20 GMT</pubDate>
			<description>I moved some of the entries in my blog to the  Symphony Technical Articles area. These include the following; 
 
*  Service Deployment in Symphony DE...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">I moved some of the entries in my blog to the <a href="http://www.hpccommunity.org/f75/" target="_self"> Symphony Technical Articles</a> area. These include the following:<br />
<ul><li><a href="http://www.hpccommunity.org/f75/service-deployment-symphony-de-291/" target="_self"> Service Deployment in Symphony DE</a></li>
<li><a href="http://www.hpccommunity.org/f75/symphony-de-application-performance-287/" target="_self">Symphony DE Application Performance</a></li>
</ul><br />
The other entries I had can be found in the following areas:<br />
<ul><li><a href="http://www.hpccommunity.org/f77/" target="_self"> Administration and Configuration Tips</a></li>
<li><a href="http://www.hpccommunity.org/f76/" target="_self"> Development and Debugging Tips</a></li>
</ul>I'll use this blog to discuss programming for HPC applications primarily using Symphony DE.<br />
<br />
- Ajith</blockquote>

 ]]></content:encoded>
			<dc:creator>Ajith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/ajith/symphony-articles-81/</guid>
		</item>
		<item>
			<title>Certification for HPC System Administration</title>
			<link>http://www.hpccommunity.org/blogs/csmith/certification-hpc-system-administration-80/</link>
			<pubDate>Wed, 02 Jul 2008 18:50:58 GMT</pubDate>
			<description><![CDATA[I saw this post over at ClusterMonkey. It's about a new training program at Georgetown's Advanced Research Computing that is intended to train people...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">I saw <a href="http://www.clustermonkey.net//content/view/231/2/" target="_self">this post</a> over at <a href="http://clustermonkey.org" target="_self">ClusterMonkey</a>. It's about a new training program at Georgetown's Advanced Research Computing that is intended to train people to be system administrators for HPC systems. I think this is a good idea. There is enough difference between running general purpose systems and HPC systems, and a lot of the tuning can be quite different, so preparing people for this particular kind of specialization is overdue.</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/certification-hpc-system-administration-80/</guid>
		</item>
		<item>
			<title>Structure08</title>
			<link>http://www.hpccommunity.org/blogs/csmith/structure08-79/</link>
			<pubDate>Fri, 27 Jun 2008 18:58:46 GMT</pubDate>
			<description><![CDATA[This week I attended the Structure08 conference held here in San Francisco. The theme of the conference was "Cloud" and the topics ranged from issues...]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">This week I attended the <a href="http://events.gigaom.com/structure/08/" target="_self">Structure08</a> conference held here in San Francisco. The theme of the conference was &quot;Cloud&quot; and the topics ranged from issues scaling out web applications to getting your cloud startup funded. <br />
<br />
I particularly enjoyed Werner Vogels' keynote on <a href="http://aws.amazon.com" target="_self">Amazon Web Services</a>, and the way that Amazon got into the business of &quot;Infrastructure as a Service (IaaS)&quot;. There were also some very good panel sessions. Overall, I liked the format of the conference and the breadth of content, although I think that the notion of &quot;cloud&quot; is already becoming overhyped, with everybody and their dog wanting to be part of the cloud ecosystem. I guess that's just the nature of the industry we're in. :)<br />
<br />
You can see summaries of the Structure08 sessions at the <a href="http://gigaom.com/2008/06/25/live-coverage-of-structure-08/" target="_self">GigaOM Structure08 live coverage post</a>.</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/structure08-79/</guid>
		</item>
		<item>
			<title>Multi-Core and GPU Background</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/multi-core-gpu-background-77/</link>
			<pubDate>Fri, 27 Jun 2008 17:26:26 GMT</pubDate>
			<description>High Performance programs, typically do the same thing multiple times with different data, or with different parameters, such as a simulation. Real...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">High performance programs typically do the same thing multiple times with different data or with different parameters, as in a simulation. Real-world applications, such as a program that monitors a port, gets data, processes it, and returns a result, may need to operate on multiple cores as well, but I typically call this type of multi-threading &quot;concurrency&quot;, where your program is doing different things concurrently. This may not be everyone's definition, but I tend to look at it this way.<br />
<br />
I may indulge a little in concurrency here in the future, but for the most part we'll be looking at programs that fall into the high performance category.<br />
<br />
There are a number of technologies for high performance programming that are available or are coming out, and they fall into a few different camps. I'm going to look into Camps 0, 1 and 2 with a sample program and performance numbers, but first I want to provide a little background on the different areas.<br />
<br />
<b>Camp 0: (Multi-threading)</b><br />
<br />
The old tried-and-true method of using multiple processors or cores in a machine. You use the threading libraries available on your operating system, and manage the threads and data access yourself. This can be easy or complex, depending on your program.<br />
<br />
<b>Camp 1: (Multi-Core)</b><br />
<br />
Toolkits that allow some parallelism in your program through meta-tagging of code segments. An example is OpenMP, which lets you tag sections of code for parallelism; the result is compiled to the native architecture of the chip you're using. If I compile for x86, then the resulting program runs on x86 and makes use of multiple x86 processors, if available. No special compiler is needed; OpenMP support is available in gcc.<br />
<br />
NOTE: Camps 1 and 2 don't work together automatically; you have to combine the techniques yourself.<br />
<br />
<b>Camp 2: (GPU)</b><br />
<br />
The GPU accelerator camp includes the previous generation of GPGPU programming, and now CUDA and ATI's Stream SDK for their new FireStream processor cards. These are development kits that allow you to use the GPU as a floating point co-processing unit. While each of these gets easier to use as new versions come out, they usually require a special compiler, which compiles the code to native GPU instructions. Once the code is compiled, if it is run on a machine which doesn't have the GPU, your program doesn't work (go figure). Fast, but not general in nature. A specific compiler is needed or, in the case of GPGPU programming, the toolkit emits the specific shader language instructions for the GPU. E.g. the CUDA SDK comes with an nVidia compiler, and only supports 8000 series cards and up, if you have the correct drivers.<br />
<br />
<b>Camp 3: (either Multi-Core or GPU)</b><br />
<br />
Companies such as RapidMind have a model that tags code, and the tagged code is emitted as program text. This text is compiled at runtime, based on the backend required. RapidMind has back-ends for x86, ppc, and glsl (the shader language used by Camp 2 GPGPU). At runtime a backend is selected and the code text is compiled to native instructions for the target processor. Only one target backend is used at a time, either x86 or GPU; they are not mixed together (at least in the current versions). This allows the program to be compiled once and run on any machine that has the RapidMind runtime. It will use the ATI or nVidia GPU if available, or just multi-thread the program segments on x86 if multiple cores are available. The benefit is application portability to different machines with different CPU and graphics cores. The RapidMind runtime manages access to bound and unbound variables to eliminate the locking required in general multi-threaded programming.<br />
<br />
<b>Camp 4: (both Multi-Core and GPU) (Near Future)</b><br />
<br />
Apple seems to be taking an all-of-the-above approach to this. They have been a heavy participant in the LLVM compiler project, and have a number of Apple engineers working on it. The LLVM compiler project takes your code and generates intermediate code (think Java or the .NET CLR). Native code can then be generated from this intermediate code using one of the LLVM code generation back-ends.<br />
<br />
Apple has used this technology successfully in Mac OS X Leopard 10.5, in the new implementation of their OpenGL (note the GL). They compile the OpenGL libraries that get installed on your system to intermediate code. Then, when OpenGL is used on your system, the Just In Time backend for your graphics card compiles the intermediate code to native code for your machine, which is placed in caches on your hard drive for later execution.<br />
<br />
This appears to be the same technology that they will use in OpenCL (Open Computing Language, which uses the GPU for calculations), announced for Mac OS X Snow Leopard 10.6, to be delivered next year. Sections of code destined for parallelism and the GPU will be compiled by LLVM to intermediate code, and then Just In Time compiled by the backend to provide native GPU instructions for execution.<br />
<br />
Additionally, the announcement of Grand Central Dispatch, which tags program segments for parallelism and runs them on multiple cores, seems similar to OpenMP. Apple states in their announcement that you will be able to use GPUs and CPUs at the same time in your parallel segments. This implies that the same LLVM intermediate code is used, with a different backend to Just In Time compile for x86 native execution.<br />
<br />
This camp will allow JIT compilation for multiple backends, with their execution controlled in the Grand Central Dispatch environment. This is an additional step up from Camp 3.<br />
<br />
This is impressive for a couple of reasons. First, it implies that they package up execution units and dispatch them to different processor cores or GPUs as needed; second, they are baking this into their own developer tools and the operating system. The required runtimes will always be available to these types of compiled applications. Let's hope this lives up to the hype I've just given it.<br />
<br />
<b>Camp X: (Cluster) (Far Future)</b><br />
<br />
How do we get program segments running in a cluster? A problem someone will solve, I'm sure.<br />
<br />
Leo</blockquote>

 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/multi-core-gpu-background-77/</guid>
		</item>
		<item>
			<title>Who Am I</title>
			<link>http://www.hpccommunity.org/blogs/bearcat/who-am-i-76/</link>
			<pubDate>Fri, 27 Jun 2008 16:26:31 GMT</pubDate>
			<description>My name is Leo Stutzmann, and I am an Architect at Platform Computing. I am in the research group, and tend to look at things related to the...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">My name is Leo Stutzmann, and I am an Architect at Platform Computing. I am in the research group, and tend to look at things related to the developer. That means tools, compilers, coding methods, etc. within the area of High Performance Computing.<br />
<br />
One area I am looking at is <b>Multi-Core and GPU issues</b>. This may expand into other co-processors. Another area is <b>Parallel programming languages, Models, and Tools</b>. These areas loosely correspond to two areas identified in Khalid's wonderful summary here:<br />
<br />
<a href="http://www.hpccommunity.org/blogs/khalid/research-topics-hpc-65/" target="_self">Research Topics in HPC - HPC Community - High Performance Computing (HPC) Community</a><br />
<br />
A lot of what you'll see will be thoughts, examples, tests, performance numbers, etc. I will usually work with single-system hardware I have, or machines I can get my hands on. I will try to stick to common hardware, so you can try these things yourself and see if some of these technologies benefit your own programming.<br />
<br />
cheers for now,<br />
Leo</blockquote>

 ]]></content:encoded>
			<dc:creator>Bearcat</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/bearcat/who-am-i-76/</guid>
		</item>
		<item>
			<title>How to Put Album Image onto Your Blog?</title>
			<link>http://www.hpccommunity.org/blogs/vbseo/how-put-album-image-onto-your-blog-73/</link>
			<pubDate>Thu, 12 Jun 2008 07:59:20 GMT</pubDate>
			<description>Please check out How to upload images to your albums, in order to have a clear idea on how to use your Albums. 
 
In this article, we will be looking...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">Please check out <a href="http://www.hpccommunity.org/blogs/vbseo/how-upload-images-album-72/" target="_self">How to upload images to your albums</a>, in order to have a clear idea on how to use your Albums.<br />
<br />
In this article, we will be looking at how to put images from your album to your blog post. <br />
<br />
Here's how:<br />
<br />
1) Go to your album. (More on: <i><a href="http://www.hpccommunity.org/blogs/vbseo/how-upload-images-album-72/" target="_self">How to upload images to your albums</a></i>)<br />
<br />
2) Click on the image that you want to put on your blog.<br />
<br />
3) You should be able to see the following screen.<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/22-viewing-image-within-album.jpg" border="0" alt="" /><br />
<br />
4) <b><i>Highlight and Copy the BBCode</i></b> and paste it in your blog post.<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/21-include-bbcode-my-blog-post.jpg" border="0" alt="" /><br />
<br />
Here's the final result:<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/20-final-result-putting-image-my-blog-post.jpg" border="0" alt="" /></blockquote>

 ]]></content:encoded>
			<dc:creator>vbseo</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/vbseo/how-put-album-image-onto-your-blog-73/</guid>
		</item>
		<item>
			<title>How To Upload Images to Album?</title>
			<link>http://www.hpccommunity.org/blogs/vbseo/how-upload-images-album-72/</link>
			<pubDate>Thu, 12 Jun 2008 06:54:00 GMT</pubDate>
			<description>The Album feature is here specially to benefit our HPCCommunity bloggers with the ability to be able to upload and administer their images with...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">The Album feature is here specifically to give our HPCCommunity bloggers the ability to upload and administer their images with greater ease and control.<br />
<br />
With this feature, bloggers will be able to save images and use them in their HPCCommunity blog posts.<br />
<br />
This article will run through 3 areas:<ol class="decimal"><li>As a blogger, where to find your Album feature?</li>
<li>How to create an album?</li>
<li>How to upload images to your album?</li>
</ol><br />
Here's a short run through on how to use the Album:<br />
<br />
<font color="RoyalBlue"><b>Where is my Pictures &amp; Albums?</b></font><br />
<br />
1) Go to <b><i>Quick Links > User Control Panel</i></b>.<br />
<br />
2) Go to <b><i>Networking section > Pictures &amp; Albums</i></b> under Your Control Panel. <br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/14-user-control-panel.gif" border="0" alt="" /><br />
<br />
<font color="RoyalBlue"><b>How to Create an Album?</b></font><br />
<br />
1) Click on "<b><i>Click here to add an album and start uploading images!</i></b>"<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/12-album-feature.jpg" border="0" alt="" /><br />
<br />
2) Enter your <b><i>Title</i></b> and <b><i>Description</i></b> for the album.<br />
Leave the Album Type set to public, so that the album will be visible to the public.<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/11-adding-new-album.jpg" border="0" alt="" /><br />
<br />
Congrats! You have successfully created your album, and we can start uploading images.<br />
<br />
<font color="RoyalBlue"><b>How to Upload Images to Album?</b></font><br />
<br />
1) Click on "<b><i>Click here to upload pictures!</i></b>".<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/13-click-upload-your-image.jpg" border="0" alt="" /><br />
<br />
2) Select your images via the <b><i>Browse</i></b> button and confirm with the <b><i>Upload Pictures</i></b> button.<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/17-upload-pictures.jpg" border="0" alt="" /><br />
<br />
3) Input your <b><i>caption</i></b> to describe your uploaded image. Click on "<b><i>Save Changes</i></b>" to confirm your changes.<br />
<img src="http://www.hpccommunity.org/members/vbseo/albums/article-post/16-editing-pictures.jpg" border="0" alt="" /></blockquote>

 ]]></content:encoded>
			<dc:creator>vbseo</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/vbseo/how-upload-images-album-72/</guid>
		</item>
		<item>
			<title>OGF23 Highlights</title>
			<link>http://www.hpccommunity.org/blogs/csmith/ogf23-highlights-70/</link>
			<pubDate>Tue, 10 Jun 2008 18:04:48 GMT</pubDate>
			<description>OGF23 was held last week in Barcelona in conjunction with the BEinGRID industry days events. OGF events, in addition to having great program content,...</description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore"><a href="http://ogf.org/OGF23/" target="_self">OGF23</a> was held last week in Barcelona in conjunction with the <a href="http://www.beingrid.com" target="_self">BEinGRID</a> industry days events. OGF events, in addition to having great program content, are an excellent way to keep in touch with Grid computing practitioners all over the world, and OGF23 was no exception. <br />
<br />
Some of the highlights for me.....<br />
<br />
<b>Standards Convergence</b><br />
<br />
For a number of years now I have worked on specifications for interfaces to job management systems in the form of JSDL, OGSA-BES, HPC Profile and friends. We're now at the point where we have multiple implementations of these specifications, such as the <a href="http://www.hpccommunity.org/f47/" target="_self">BES++</a> project. What we had &quot;punted&quot; on for quite a while is a comprehensive information model that could be used to describe the resources available to run jobs on. It is now possible to fill this gap with the publishing of the GLUE schema as a <a href="http://www.ogf.org/gf/docs/?public_comment" target="_self">public comment document</a>! While standards in general are not designed to excite, seeing all these bits and pieces come together after a long time is very satisfying for those of us who have been working on them. Good work GLUE-WG!<br />
<br />
<b>Cloud Workshop and Sessions</b><br />
<br />
Cloud was a big theme of the conference, with some great content in the form of a keynote from Werner Vogels (CTO of Amazon), and with 2 workshop sessions on cloud with presentations from Cohesive FT and CERN among others. It's still not clear how Grids and Clouds converge, but intuitively there is an intersection point somewhere. <br />
<br />
<b>OGF-Europe</b><br />
<br />
OGF-Europe is an EU-funded project that is intended to help collect information on the use of Grid in Europe, as well as promote the use of Grid across multiple industries and customer sizes. As somebody who works on standards, it is easy to get wrapped up in the small details of how things work, and to see OGF as providing a valuable venue for doing this work. What is hard is to see the bigger picture around Grid usage and value to organizations. OGF-Europe will definitely provide some much-needed external (to OGF) facing activities. Hopefully they will create some demand for our specs!<br />
<br />
<b>Green IT Workshop</b><br />
<br />
The <a href="http://grid.globalwatchonline.com/epicentric_portal/site/GRID/?mode=0" target="_self">Grid Computing Now</a> team organized 2 workshops on the opportunity to use Grid techniques to make data centres more efficient and environmentally friendly. <br />
<br />
<b>Data Management Sessions</b><br />
<br />
The session presentations can be found <a href="http://ogf.org/gf/event_schedule/index.php?id=1271" target="_self">here.</a> The European Grid community presented on the various techniques that they have used to deal with very large and distributed data sets, for both files and relational data. I never knew that anybody was using Oracle RAC on over 140 servers!<br />
<br />
<br />
For some other perspectives on the conference (and some video highlights) check out the <a href="http://gridtalk-project.blogspot.com/" target="_self">GridCast at OGF23</a> blog.</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/ogf23-highlights-70/</guid>
		</item>
		<item>
			<title>Introducing myself....</title>
			<link>http://www.hpccommunity.org/blogs/csmith/introducing-myself-69/</link>
			<pubDate>Tue, 10 Jun 2008 17:23:19 GMT</pubDate>
			<description><![CDATA[For my first blog entry I'd like to take the time to introduce myself. My name is Chris Smith, and I'm a product architect at Platform Computing....]]></description>
			<content:encoded><![CDATA[<blockquote class="blogcontent restore">For my first blog entry I'd like to take the time to introduce myself. My name is Chris Smith, and I'm a product architect at Platform Computing. What does this really mean? I spend time looking at what sorts of problems people are trying to address in the Grid and HPC community, and I try to figure out how to either apply Platform's technologies to solve those problems, or look at new technologies that can be used with Platform's technologies and know-how to solve the problem.<br />
<br />
I am also very involved in the <a href="http://www.ogf.org" target="_self">Open Grid Forum</a>, which I have attended for a number of years, engaged in writing specifications for interoperable job management. I also happen to be the VP of the standards function of the OGF, meaning I help organize a great group of people, who together manage the working groups and the document process of the OGF.<br />
<br />
On this blog, I hope to talk a little bit about my research interest areas, as well as talk a little bit about what is going on in the world of standards.</blockquote>

 ]]></content:encoded>
			<dc:creator>csmith</dc:creator>
			<guid isPermaLink="true">http://www.hpccommunity.org/blogs/csmith/introducing-myself-69/</guid>
		</item>
	</channel>
</rss>
