Linear Algebra Computation Benchmarks on a Model Grid Platform

ICCS 2003, St. Petersburg
Carlo Manuali
[email protected]
Centro d’Ateneo per i Servizi Informatici (C.A.S.I.)
University of Perugia, Italy
in collaboration with:
Loriano Storchi ([email protected])
Osvaldo Gervasi ([email protected])
Giuseppe Vitillaro ([email protected])
Antonio Laganà ([email protected])
Francesco Tarantelli ([email protected])
Summary
1. Objective
Customization of the Globus Toolkit 2 for a Grid infrastructure based on Beowulf clusters
2. Contents
a) The platform topology
b) The multilevel process communication strategy
c) Globus, MPI
d) Communication Benchmarks, Computational Tests
Grid Computing
The Globus Toolkit 2
http://www.globus.org
A model computing Grid
✓ Centralized installation of the Globus software into an NFS-shared directory
✓ Implementation of Globus and MPICH-G2 for Grid management
✓ Modification of the LAM/MPI broadcast implementation
A model computing Grid
[Figure: platform topology, with clusters interconnected by a 100 Mb/s LAN and a 16 Mb/s ATM WAN]
A model computing Grid
A dedicated node, called the front-end, for each cluster:
• NIS, NFS and automount services
• /usr/local is exported via NFS
• All nodes access Globus in /usr/local/globus
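For illustration, the NFS side of this setup can be expressed as a single export on each front-end; a minimal sketch, where the subnet and mount options are assumptions, not taken from the slides:

# /etc/exports on a front-end: export /usr/local read-only
# to the cluster's private subnet (subnet is hypothetical)
/usr/local 192.168.1.0/255.255.255.0(ro)

Each node then automounts /usr/local from its front-end, so a single Globus tree serves the whole cluster.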
An MDS Centralized Installation
• MDS information is kept in the custom directory /usr/local/globus/etc/nodes
• Customization of the SXXgris command:
export sysconfdir=/usr/local/globus/etc/nodes/`hostname`
• grid-info-site-policy.conf:
policydata: (&(Mds-Service-hn=*.IP_domain)(Mds-Service-port=2135))
• grid-info-resource-register.conf:
reghn: GIIS-server.IP_domain
hn: GRIS-server.IP_domain
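With this layout, each node starts its GRIS against its own configuration subtree. A sketch of the per-node startup, assuming $GLOBUS_LOCATION points at /usr/local/globus and that the init script takes the usual GT2 start argument:

# run on each node: select the per-host MDS configuration, then start the GRIS
export sysconfdir=/usr/local/globus/etc/nodes/`hostname`
$GLOBUS_LOCATION/sbin/SXXgris start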
MPICH-G2 and LAM-MPI
• LAM/MPI (version 6.5.6)
• Compilation of the Globus Resource Management SDK with the mpi flavor
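With the Grid Packaging Tools this kind of build is driven by gpt-build; a hedged sketch, in which the bundle file name and the flavor name are illustrative rather than taken from the slides:

# build the resource-management SDK bundle with an MPI-enabled flavor
gpt-build globus-resource-management-sdk-2.0.tar.gz gcc32dbgmpi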
MPICH-G2 and LAM-MPI
• Addition of “defines” that are missing from the LAM/MPI include file mpi.h
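The slides do not reproduce the actual list; purely as an illustration, such additions take the form of guarded preprocessor definitions appended to mpi.h (the macro below is invented):

/* illustration only: MPIG_EXAMPLE_CONST is an invented name,
   not one of the actual missing defines */
#ifndef MPIG_EXAMPLE_CONST
#define MPIG_EXAMPLE_CONST 0
#endif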
MPICH-G2 and LAM-MPI
• The $GLOBUS_GRAM_JOB_MANAGER_MPIRUN variable points to a script (mpigrun) which replaces the standard mpirun command
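A minimal sketch of such a wrapper (the original listing is not shown here), assuming it simply forwards the arguments GRAM passes to it on to the local LAM/MPI launcher; the LAM install path is hypothetical:

#!/bin/sh
# mpigrun (sketch): hand the GRAM-supplied arguments to LAM/MPI's mpirun
exec /usr/local/lam/bin/mpirun "$@"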
Topology-aware functions: broadcast models
MPICH-G2 distinguishes four levels of communication:
• lv0: TCP-WAN
• lv1: TCP-LAN
• lv2: TCP intra-machine
• lv3: v-MPI (vendor MPI)
LAM/MPI provides for communication via TCP/IP among nodes in a dedicated network, or via shared memory.
On this platform there are two point-to-point communication levels:
- inter-cluster communication (lv0)
- intra-cluster communication (lv3)
Topology-aware functions: broadcast models
Comparison of three different broadcast methods:
(i) the broadcast operation provided by MPICH-G2
(ii) an optimized topology-aware broadcast of our own implementation
(iii) a non-topology-aware broadcast
Topology-aware functions: broadcast models
A typical broadcast:
• HPC: p0 - p7
• GRID: p8 - p15
• GIZA: p16 - p23
At the local level:
• LAM/MPI uses non-blocking (asynchronous) send operations
• We opted for blocking (synchronous) send operations
Topology-aware functions: broadcast models
T = basic transmission time step
• The asynchronous broadcast is completed in 6T
• Our version takes 3T, consistent with the ⌈log2 8⌉ = 3 communication steps needed to reach all 8 processes of a cluster
Broadcast tests
• ~39 Mb of data
• T = 3.4s
• Link speed between HPC or GRID and GIZA = 0.8 Mb/s
Bcast_Time = WAN_inter-cluster_Bcast_T + Local_intra-cluster_Bcast_T
WAN_inter-cluster_Bcast_T ≈ 49s (39 Mb at 0.8 Mb/s)
(a) Local_intra-cluster_Bcast_T = 3T ≈ 10s
(b) Local_intra-cluster_Bcast_T = 6T ≈ 20s
(a) The optimized broadcast takes about 10 seconds
(b) The LAM/MPI broadcast takes about 20 seconds
Broadcast tests
• Comparison with a non-topology-aware broadcast
[Figure: three timing plots (1-3), one per broadcast method]
In the last one the dominance of the long-distance transfers is evident
Broadcast tests
[Figure: broadcast tree with 7 intra-cluster data transfers and 16 inter-cluster data transfers]
Linear Algebra Benchmarks
• BLAS and LAPACK at the local level
• BLACS on top of MPICH-G2, and PBLAS
• ScaLAPACK
Linear Algebra Benchmarks
• Tests have been run with PDGEMM (a PBLAS routine)
• Effective speed of 2.5 Gflops (20000 by 20000 matrices)
• 70% performance deterioration when exchanging rows with columns
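As a sketch of how such a benchmark is launched across clusters through MPICH-G2, one subjob per cluster can be described in a DUROC multi-request RSL; the contact strings, counts and executable path below are assumptions for illustration:

+
( &(resourceManagerContact="hpc.unipg.it")
   (count=8) (jobtype=mpi) (label="subjob 0")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
   (executable=/usr/local/bench/pdgemm_bench) )
( &(resourceManagerContact="giza.unipg.it")
   (count=8) (jobtype=mpi) (label="subjob 1")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1))
   (executable=/usr/local/bench/pdgemm_bench) )

Such a file is typically submitted with globusrun, or with MPICH-G2's mpirun via its -globusrsl option.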
Linear Algebra Benchmarks
[Figure: PDGEMM performance (MFLOPS) as a function of matrix size N]
• Speed varies from 2.52 Gflops to 2.55 Gflops for block sizes from 64 (red line) to 256 (blue line)
• Top speed is reached for a block size of 160 (green line)
Conclusion
• Two Globus communication levels using MPI on a model Grid made up of three workstation clusters
• Comparison of a two-level implementation of broadcast with a binary-tree one
• Parallel linear algebra kernels to exploit the two communication levels