PVFS version 2, first try
Let's play with PVFS (Parallel Virtual File System) and perform a manual install on an OCS 5 cluster. If we want to build a kit, we should be able to recompile the code (plus its dependencies), understand the configuration and, of course, know how to test it, right?
Two versions of PVFS are available; I will use version 2. The homepage is here:
Parallel Virtual File System, Version 2
So, let's start with the latest and greatest version: PVFS-2.7.1
Okay, it needs BerkeleyDB ... well just get the latest and greatest version (4.7).
PVFS will compile just fine but surprise: I got the following during the initialization (in my case it was on the metadata server):
###############################################
[root@compute-00-02 ~]# /opt/pvfs/sbin/pvfs2-server /etc/pvfs-fs.conf -f
[S 08/18 14:57] PVFS2 Server on node compute-00-02 version 2.7.1 starting...
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 08/18 14:57] error in dspace create (db_p->get failed).
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 08/18 14:57] error in dspace create (db_p->get failed).
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 08/18 14:57] error in dspace create (db_p->get failed).
[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 08/18 14:57] error in dspace create (db_p->get failed).
[D 08/18 14:57] PVFS2 Server: storage space created. Exiting.
###############################################
Not really good ... well ... we can fall back to a previous version, like 4.6.21, right?
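A side note on reading that output: the last line still says the storage space was created, even though every attempt above it failed. Here is a minimal sketch of a check that counts lines by their bracketed severity letter instead of trusting the final message. The regular expression is only my guess at the log layout, inferred from the lines shown above, and count_levels is a made-up helper name, not part of PVFS.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the output above:
#   [<letter> MM/DD HH:MM] message
LOG_LINE = re.compile(r"^\[([A-Z]) \d{2}/\d{2} \d{2}:\d{2}\] (.*)$")

def count_levels(lines):
    """Count log lines per severity letter; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# A shortened copy of the output shown above.
sample = [
    "[S 08/18 14:57] PVFS2 Server on node compute-00-02 version 2.7.1 starting...",
    "[E 08/18 14:57] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT",
    "[E 08/18 14:57] error in dspace create (db_p->get failed).",
    "[D 08/18 14:57] PVFS2 Server: storage space created. Exiting.",
]

counts = count_levels(sample)
if counts["E"] > 0:
    print("storage space creation reported errors:", counts["E"])
```

Any non-zero count for E means the run should be treated as failed, whatever the closing line claims.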
Anyway, it is really easy to recompile: just go into the source directory and run:
--------------------------------------------------------
cd build_unix ; ../dist/configure --prefix=/home/mbozzore/db-4.6.21 ; make ; make install
--------------------------------------------------------
Okay, ready to recompile (again) PVFS:
--------------------------------------------------------
[mbozzore@stakhanov pvfs-2.7.1]$ ./configure --prefix=/home/mbozzore/pvfs2 --with-db=/home/mbozzore/db-4.6.21/ --with-kernel=/usr/src/kernels/2.6.18-53.el5-x86_64/
--------------------------------------------------------
And at the end of the process:
###############################################
***** Displaying PVFS2 Configuration Information *****
------------------------------------------------------
PVFS2 configured to build karma gui : no
PVFS2 configured to use epoll : yes
PVFS2 configured to perform coverage analysis : no
PVFS2 configured for aio threaded callbacks : yes
PVFS2 configured for the 2.6.x kernel module : yes
PVFS2 configured for the 2.4.x kernel module : no
PVFS2 configured for using the mmap-ra-cache : no
PVFS2 configured for using trusted connections : no
PVFS2 configured for a thread-safe client library : yes
PVFS2 will use workaround for redhat 2.4 kernels : no
PVFS2 will use workaround for buggy NPTL : no
PVFS2 server will be built : yes
###############################################
Not sure whether being unable to build any karma at all is a good thing or not, we will see ;)
And then, you just need something like:
--------------------------------------------------------
make all kmod
--------------------------------------------------------
The next step is to recompile Open MPI with support for PVFS.
Second surprise, Open MPI just blows up with the following information:
###############################################
io_romio_ad_pvfs2_open.c: In function 'fake_an_open':
io_romio_ad_pvfs2_open.c:86: warning: passing argument 6 of 'PVFS_sys_create' from incompatible pointer type
io_romio_ad_pvfs2_open.c:86: error: too few arguments to function 'PVFS_sys_create'
make[5]: *** [io_romio_ad_pvfs2_open.lo] Error 1
make[5]: Leaving directory `/home/mbozzore/trunk/src/kits/platform_hpc/packages/openmpi-interconnects-gnu/openmpi-1.2.4/ompi/mca/io/romio/romio/adio/ad_pvfs2'
make[4]: *** [all-recursive] Error 1
###############################################
Nice, the reference is here:
Open MPI Development Mailing List Archives
So, just step back and try another version, like PVFS 2.6.3, and BerkeleyDB 4.6.21
Note: I have a "non standard" config. My workstation is an OCS 5 installer and my nodes are VMs. The following directories are also exported (from the installer to the compute nodes) with the no_root_squash option: /home and /VM/mbozzore
Of course, if you go through the same process, you will notice that PVFS 2.7.x and 2.6.x are slightly different. For example, pvfs2-genconfig needs two arguments in 2.6.x but only one in 2.7.1.
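If you script the install, that difference in pvfs2-genconfig arguments is easy to trip over. Below is a small sketch of how a wrapper could pick the argument list. The only facts it relies on are the ones from this post (two arguments with 2.6.3, one with 2.7.1); treating every 2.7.x and later release like 2.7.1 is my assumption, and genconfig_args is a hypothetical helper, not something shipped with PVFS.

```python
def genconfig_args(version, fs_conf, server_conf_prefix=None):
    """Build the pvfs2-genconfig command line for a given PVFS version string."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    if major_minor >= (2, 7):
        # Assumption: all releases from 2.7 on behave like 2.7.1 (one argument).
        return ["pvfs2-genconfig", fs_conf]
    if server_conf_prefix is None:
        raise ValueError("PVFS 2.6.x needs a second argument for the server config files")
    return ["pvfs2-genconfig", fs_conf, server_conf_prefix]

print(genconfig_args("2.6.3", "/etc/pvfs2/pvfs2-fs.conf", "/etc/pvfs2/pvfs2-server.conf"))
print(genconfig_args("2.7.1", "/etc/pvfs2/pvfs2-fs.conf"))
```

The returned list can be handed to subprocess.run as is; I keep it as a list rather than a string so that paths with spaces do not need quoting.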
For the config part, I used:
--------------------------------------------------------
[mbozzore@stakhanov pvfs-2.6.3]$ ./configure --prefix=/home/mbozzore/pvfs-2.6.3 --with-kernel=/usr/src/kernels/2.6.18-53.el5-x86_64/ --enable-shared --with-db=/home/mbozzore/db-4.6.21/
--------------------------------------------------------
And again:
###############################################
***** Displaying PVFS2 Configuration Information *****
------------------------------------------------------
PVFS2 configured to build karma gui : no
PVFS2 configured to use epoll : yes
PVFS2 configured to perform coverage analysis : no
PVFS2 configured for aio threaded callbacks : yes
PVFS2 configured for the 2.6.x kernel module : yes
PVFS2 configured for the 2.4.x kernel module : no
PVFS2 configured for using the mmap-ra-cache : no
PVFS2 configured for using trusted connections : no
PVFS2 configured for a thread-safe client library : yes
PVFS2 will use workaround for redhat 2.4 kernels : no
PVFS2 will use workaround for buggy NPTL : no
PVFS2 server will be built : yes
###############################################
As usual:
--------------------------------------------------------
make
make kmod
make install
--------------------------------------------------------
Note that by default make install puts pvfs2-server in the sbin directory but installs no client at all (you actually need 2 binaries for the client part).
On the PVFS side, I'd like to use the following configuration:
--------------------------------------------------------
compute-00-00, compute-00-01 : IO nodes
compute-00-02 : metadata server
compute-00-03 : client only (compute-00-00, compute-00-01, compute-00-02 also clients)
--------------------------------------------------------
In order to achieve that, I need to generate the config files, using the pvfs2-genconfig script; it can be used interactively:
###############################################
[root@stakhanov ~]# /home/mbozzore/pvfs-2.6.3/bin/pvfs2-genconfig /etc/pvfs2/pvfs2-fs.conf /etc/pvfs2/pvfs2-server.conf
**********************************************************************
Welcome to the PVFS2 Configuration Generator:
This interactive script will generate configuration files suitable
for use with a new PVFS2 file system. Please see the PVFS2 quickstart
guide for details.
**********************************************************************
You must first select the network protocol that your file system will use.
The only currently supported options are "tcp", "gm", and "ib".
(For multi-homed configurations, use e.g. "ib,tcp".)
* Enter protocol type [Default is tcp]:
Choose a TCP/IP port for the servers to listen on. Note that this
script assumes that all servers will use the same port number.
* Enter port number [Default is 3334]:
Choose a directory for each server to store data in.
* Enter directory name: [Default is /pvfs2-storage-space]:
Choose a file for each server to write log messages to.
* Enter log file location [Default is /tmp/pvfs2-server.log]: /var/log/pvfs2-server.log
Next you must list the hostnames of the machines that will act as
I/O servers. Acceptable syntax is "node1, node2, ..." or "node{#-#,#,#}".
* Enter hostnames [Default is localhost]: compute-00-00, compute-00-01
Now list the hostnames of the machines that will act as Metadata
servers. This list may or may not overlap with the I/O server list.
* Enter hostnames [Default is localhost]: compute-00-02
Configured a total of 3 servers:
2 of them are I/O servers.
1 of them are Metadata servers.
* Would you like to verify server list (y/n) [Default is n]? y
****** I/O servers:
compute-00-01
compute-00-00
****** Metadata servers:
compute-00-02
* Does this look ok (y/n) [Default is y]?
Writing fs config file... Done.
Writing 3 server config file(s)... Done.
###############################################
So, the good news is that multi-homed configs are supported:
###############################################
You must first select the network protocol that your file system will use.
The only currently supported options are "tcp", "gm", and "ib".
(For multi-homed configurations, use e.g. "ib,tcp".)
###############################################
Looks great, will try it later (IB).
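Until then, here is a quick sketch of how the protocol answer could be checked before feeding it to the script. The list of supported names is copied from the prompt text above; parse_protocols is a hypothetical helper of mine, and the script itself may well validate its input differently.

```python
# Protocol names quoted by the pvfs2-genconfig prompt shown above.
SUPPORTED_PROTOCOLS = ("tcp", "gm", "ib")

def parse_protocols(spec):
    """Split a comma-separated protocol string such as "ib,tcp" and validate it."""
    protocols = [p.strip() for p in spec.split(",") if p.strip()]
    unknown = [p for p in protocols if p not in SUPPORTED_PROTOCOLS]
    if not protocols or unknown:
        raise ValueError("unsupported protocol spec: %r" % spec)
    return protocols

print(parse_protocols("tcp"))     # single-homed, the default
print(parse_protocols("ib,tcp"))  # multi-homed example from the prompt
```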
The script generated a few files (you will not get the same number of files with PVFS 2.7.1).
The files generated are:
--------------------------------------------------------
[root@stakhanov ~]# ls /etc/pvfs2/
pvfs2-fs.conf pvfs2-server.conf-compute-00-00 pvfs2-server.conf-compute-00-01 pvfs2-server.conf-compute-00-02
--------------------------------------------------------
Basically, one main config file and one file per server (IO or metadata).
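That naming rule is simple enough to write down. The sketch below rebuilds the list of files from the two paths given to pvfs2-genconfig and the server names. The "prefix-hostname" pattern is read off the ls output above for PVFS 2.6.3 only, and expected_config_files is a helper name I made up for the example.

```python
def expected_config_files(fs_conf, server_conf_prefix, io_nodes, meta_nodes):
    """Return the files pvfs2-genconfig (2.6.3) appears to write: one fs config
    plus one "<prefix>-<hostname>" file per distinct server."""
    servers = sorted(set(io_nodes) | set(meta_nodes))
    return [fs_conf] + ["%s-%s" % (server_conf_prefix, host) for host in servers]

files = expected_config_files(
    "/etc/pvfs2/pvfs2-fs.conf",
    "/etc/pvfs2/pvfs2-server.conf",
    io_nodes=["compute-00-00", "compute-00-01"],
    meta_nodes=["compute-00-02"],
)
for path in files:
    print(path)
```

A node that is both an I/O and a metadata server is counted once, since the set union removes duplicates.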
So, what's inside?
###############################################
[root@stakhanov ~]# cat /etc/pvfs2/pvfs2-fs.conf
<Defaults>
UnexpectedRequests 50
EventLogging none
LogStamp datetime
BMIModules bmi_tcp
FlowModules flowproto_multiqueue
PerfUpdateInterval 1000
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
</Defaults>
<Aliases>
Alias compute-00-00 tcp://compute-00-00:3334
Alias compute-00-01 tcp://compute-00-01:3334
Alias compute-00-02 tcp://compute-00-02:3334
</Aliases>
<Filesystem>
Name pvfs2-fs
ID 398972398
RootHandle 1048576
<MetaHandleRanges>
Range compute-00-02 4-1431655767
</MetaHandleRanges>
<DataHandleRanges>
Range compute-00-00 1431655768-2863311531
Range compute-00-01 2863311532-4294967295
</DataHandleRanges>
<StorageHints>
TroveSyncMeta yes
TroveSyncData no
</StorageHints>
</Filesystem>
[root@stakhanov ~]# cat /etc/pvfs2/pvfs2-server.conf-compute-00-00
StorageSpace /pvfs2-storage-space
HostID "tcp://compute-00-00:3334"
LogFile /var/log/pvfs2-server.log
[root@stakhanov ~]# cat /etc/pvfs2/pvfs2-server.conf-compute-00-01
StorageSpace /pvfs2-storage-space
HostID "tcp://compute-00-01:3334"
LogFile /var/log/pvfs2-server.log
[root@stakhanov ~]# cat /etc/pvfs2/pvfs2-server.conf-compute-00-02
StorageSpace /pvfs2-storage-space
HostID "tcp://compute-00-02:3334"
LogFile /var/log/pvfs2-server.log
###############################################
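One thing worth noticing in pvfs2-fs.conf is the handle ranges: the three servers each get an equal, contiguous slice of the interval from 4 to 4294967295 (2^32 - 1), with the metadata server first. The sketch below reproduces those numbers. It is only arithmetic that happens to match this generated file; I have not checked how pvfs2-genconfig actually computes the ranges, and both the lower bound of 4 and the handling of a remainder are assumptions.

```python
def split_handles(servers, first=4, last=2**32 - 1):
    """Split the inclusive handle interval [first, last] into contiguous,
    equally sized ranges, one per server, in the order given."""
    size = (last - first + 1) // len(servers)
    ranges = {}
    start = first
    for index, name in enumerate(servers):
        # Assumption: any remainder goes to the last server.
        end = last if index == len(servers) - 1 else start + size - 1
        ranges[name] = (start, end)
        start = end + 1
    return ranges

# Metadata server first, then the two I/O servers, as in the file above.
ranges = split_handles(["compute-00-02", "compute-00-00", "compute-00-01"])
for name, (low, high) in ranges.items():
    print("Range %s %d-%d" % (name, low, high))
```

With three servers the interval holds 4294967292 handles, so each server ends up with 1431655764 of them.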
Wow, I just got the following when trying to get a preview of my post:
The following errors occurred with your submission:
The text that you have entered is too long (20231 characters). Please shorten it to 20000 characters long.
I knew it: no karma at all ;)
Mehdi Bozzo-Rey
Recent Blog Entries by mehdi
- PVFS version 2, first try, part 2 (August 22nd, 2008)
- PVFS version 2, first try (August 22nd, 2008)
- LAVA, Open MPI, Infiniband (OFED) and ... RLIMIT_MEMLOCK (August 21st, 2008)








