HPC, Distributed Computing and Scientific Applications
Giovanni Erbacci
Supercomputing Group – CINECA
[email protected]
Training course
High-Performance Parallel Computing
Catania, 29 September – 6 October 2008
www.cineca.it
Topics
Computational science
Evolution of HPC systems
 Processors
 Interconnects
Supercomputing and Grid infrastructures
The software challenge
Computational Science
Computational science, together with theory and experimentation, now constitutes the "third pillar" of scientific inquiry, enabling researchers to build and test models of complex phenomena.
The visions of several pioneers now find their full realization:
- Owen Willans Richardson, 1920s: Nobel Prize in Physics 1928 "for his work on the thermionic phenomenon and especially for the discovery of the law named after him"
- John Von Neumann, 1940s
- Kenneth Wilson, 1980s: Nobel Prize in Physics 1982 "for his theory for critical phenomena in connection with phase transitions"
- Walter Kohn: Nobel Prize in Chemistry 1998 "for his development of the density-functional theory"
- John A. Pople: Nobel Prize in Chemistry 1998 "for his development of computational methods in quantum chemistry"
Grand Challenges: today
- Simulation of complete engineering systems
- Simulation of complete biological systems
- Astrophysics
- Materials science
- Bioinformatics, proteomics, pharmacogenetics
- Design of new materials, atom by atom
- Miniaturization of devices down to the quantum level
- Scientifically accurate 3D functional models of the human body
- Biocomplexity and biodiversity
- Atmospheric and climate research
- Digital libraries for science and engineering
Computational science plays a fundamental role in addressing the 21st century's most important problems:
- predominantly multidisciplinary,
- multisector
- and collaborative problems
that generate new knowledge crossing traditional disciplinary boundaries.
How Has Innovation Changed?
• Instantaneous communication
• Geographically distributed work
• Increased productivity
• More data everywhere
• Increasing problem complexity
• Innovation happens worldwide
More data everywhere: Radar, satellites, CAT scans, weather
models, the human genome.
The size and resolution of the problems scientists address today
are limited only by the size of the data they can reasonably
work with. There is a constantly increasing demand for faster
processing on bigger data.
Increasing problem complexity: Partly driven by the ability to
handle bigger data, but also by the requirements and
opportunities brought by new technologies. For example, new
kinds of medical scans create new computational challenges.
Increasing Complexity
As technology allows customers to handle bigger datasets and
faster computations, they push to solve harder problems.
In turn, the new class of problems drives the next cycle of
technology innovation.
Crash testing example:
Originally: crash a car and observe what happens to the test dummy.
Next level: save the cost of cars and dummies and do some of that testing through simulation.
Now: need to determine what happens to the capillaries around your lung when the air bag goes off.
(Images: crash dummy, e-crash dummy, organ damage)
Biosciences: NFkB Transcription Factor (blue) with DNA (orange) structure.
13 ns molecular dynamics simulation of this complex performed using NAMD
Gas and star distribution in a cluster of galaxies, from a cosmological simulation
Vesuvius sub-Plinian eruption
High Performance Computing: Today
Greater integration:
• logical integration, not necessarily physical
• Grid Computing
(Diagram: compute – TeraFlops; data – TeraBytes; visualise – TeraPixels; Big Power, Big Data, Big Insight; Training, Co-operation, Maintenance)
It is impossible to stay competitive without continuous innovation efforts!
Focus on the applications
Top 500: performance trend
The top systems in the Top 500 list
http://www.top500.org/ (June 2009 list)

1. DOE/NNSA/LANL, United States: BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband (IBM); 122400 cores; Rmax 1026, Rpeak 1375.78 TFlop/s
2. DOE/NNSA/LLNL, United States: eServer Blue Gene Solution (IBM); 212992 cores; Rmax 478.2, Rpeak 596.38 TFlop/s
3. Argonne National Laboratory, United States: Blue Gene/P Solution (IBM); 163840 cores; Rmax 450.3, Rpeak 557.06 TFlop/s
4. Texas Advanced Computing Center/Univ. of Texas, United States: SunBlade x6420, Opteron Quad 2 GHz, Infiniband (Sun Microsystems); 62976 cores; Rmax 326, Rpeak 503.81 TFlop/s
5. DOE/Oak Ridge National Laboratory, United States: Cray XT4 QuadCore 2.1 GHz (Cray Inc.); 30976 cores; Rmax 205, Rpeak 260.2 TFlop/s
6. Forschungszentrum Juelich (FZJ), Germany: Blue Gene/P Solution (IBM); 65536 cores; Rmax 180, Rpeak 222.82 TFlop/s
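A note added here for clarity (not on the original slide): Rmax is the sustained LINPACK performance and Rpeak the theoretical peak, so their ratio shows how much of the peak the benchmark actually delivers. For the No. 1 system:

$$\frac{R_{\max}}{R_{\mathrm{peak}}} = \frac{1026}{1375.78} \approx 0.75,$$

i.e. roughly 75% of peak performance is sustained on LINPACK.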
Some facts from the last Top 500 list (June '09)
1976  Cray 1 installed at Los Alamos: peak performance 160 MegaFlop/s (Mega = 10^6 flop/s)
1993  First edition of the Top 500: the No. 1 system delivers 59.7 GigaFlop/s (Giga = 10^9 flop/s)
1997  The Teraflop/s barrier is broken (10^12 flop/s)
2008  The Petaflop/s barrier is broken (10^15 flop/s)

Roadrunner: a hybrid system connecting 6562 dual-core AMD Opteron processors with 12240 IBM Cell processors (originally designed for video-game platforms such as the Sony PlayStation 3) acting as accelerators. The whole system comprises 122400 cores and 98 TeraBytes of RAM, interconnected via Infiniband 4x DDR.
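A back-of-the-envelope estimate added here (not on the original slide): going from the Cray 1's 160 MFlop/s in 1976 to 1 PFlop/s in 2008 is a factor of

$$\frac{10^{15}}{1.6\times 10^{8}} \approx 6.3\times 10^{6} \approx 2^{22.6},$$

about 22–23 doublings in 32 years, i.e. peak performance doubling roughly every 17 months.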
CINECA's ranking in the Top 500
(Chart, June 1993 – 2006: CINECA systems and their Top 500 positions, including Cray Y-MP 4pe (226), Cray C90 2pe (285), Cray T3D 64pe (132), Cray T3D 128pe (159), Cray T3E 128pe (57), Cray T3E 1200 128pe (29), Cray T3E 1200 256pe (36), IBM SP4 512pe (30), IBM CLX 512pe (60), IBM BCC 1024 core (70), IBM CBC 5120 core (25))
Top 500 in Italy
(Chart: Top 500 system performance in GFlops, 1993–2007, on a logarithmic scale; series: 1st system, 500th system, and CINECA's top system, each with a trend line)
Capability Computing: trend per country
(Chart: leading capability systems in TeraFlops, 2000–2008, grouped by country: USA, Japan, UK, Germany, France, Spain, Italy. Systems shown include Earth Simulator and Earth Simulator+, ASCI Q, ASCI White, DOE BGL, LLNL Blue Gene L, Blue Gene/P, FZJ, LRZ, HLRS, RZG, DWD, HPCx/HPCx2/HPCx3, Hector and Hector+, CSAR, CEA, CNES, France Ministry, Meteo France, Mare Nostrum, CEPBA, Galizia U., Tokyo U., KEK, and CINECA)
Microprocessor-based Architectures
Instruction Level Parallelism (ILP): systems not really serial
 Deeply pipelined processors
 Multiple functional units
Processor taxonomy
 Out of order superscalar: Alpha, Pentium 4, Opteron
 hardware dynamically schedules instructions:
determine dependences and dispatch instructions
 many instructions can be in flight at once
 VLIW: Itanium
 issue a fixed size “bundle” of instructions each cycle
 bundles tailored to mix of available functional units
 compiler pre-determines what instructions execute in
parallel
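To make the taxonomy above concrete, here is a minimal C sketch (added for illustration, not part of the original slides) of how instruction-level parallelism is exposed to a deeply pipelined, superscalar or VLIW core: a reduction with a single accumulator forms one long dependency chain, while independent accumulators let several additions be in flight at once.

#include <stdio.h>
#include <stddef.h>

/* One accumulator: each addition depends on the previous one, so the loop
 * runs at the latency of the floating-point adder. */
double sum_serial(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four independent accumulators: the additions do not depend on each other,
 * so an out-of-order (or VLIW) core can dispatch them to several functional
 * units and keep the pipeline full. */
double sum_ilp(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)          /* remainder loop */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}

int main(void)
{
    double x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    printf("%f %f\n", sum_serial(x, 8), sum_ilp(x, 8));
    return 0;
}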
Complex memory hierarchy
 Non-blocking, multi-level caches
 TLB (Translation Lookaside Buffer)
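Similarly, the effect of the multi-level cache and TLB hierarchy can be illustrated with a classic cache-blocking (tiling) sketch in C; the matrix order and tile size below are illustrative assumptions, not figures from the slides.

#include <stdlib.h>

#define N 2048   /* matrix order (illustrative) */
#define B   64   /* tile size chosen so a B x B tile fits in cache (illustrative) */

/* Blocked transpose: each B x B tile is read and written while it is still
 * resident in cache, so far fewer accesses miss in the cache or the TLB than
 * with a naive row-by-row transpose of a large matrix. */
static void transpose_blocked(const double *in, double *out, size_t n)
{
    for (size_t ii = 0; ii < n; ii += B)
        for (size_t jj = 0; jj < n; jj += B)
            for (size_t i = ii; i < ii + B && i < n; i++)
                for (size_t j = jj; j < jj + B && j < n; j++)
                    out[j * n + i] = in[i * n + j];
}

int main(void)
{
    double *a = malloc((size_t)N * N * sizeof *a);
    double *b = malloc((size_t)N * N * sizeof *b);
    if (!a || !b) return 1;
    for (size_t i = 0; i < (size_t)N * N; i++)
        a[i] = (double)i;
    transpose_blocked(a, b, N);
    free(a);
    free(b);
    return 0;
}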
Commodity Interconnects
Gigabit Ethernet
Myrinet
Infiniband
QsNet
SCI
Real Crisis with HPC is with the Software
Programming is stuck
 Arguably hasn’t changed since the 70’s
It’s time for a change
 Complexity is rising dramatically
 highly parallel and distributed systems
 From 10 to 100 to 1,000 to 10,000 to 100,000 processors!
 multidisciplinary applications
A supercomputer application and its software are usually much longer-lived than the hardware
 Hardware lifetime is typically five years at most.
 Fortran and C are the main programming models
Software is a major cost component of modern technologies.
 The tradition in HPC system procurement is to assume that the
software is free.
Some Current Unmet Needs
Performance / Portability
Fault tolerance
Better programming models
 Global shared address space
 Visible locality
Maybe coming soon (since incremental, yet offering real benefits):
 Global Address Space (GAS) languages: UPC, Co-Array Fortran, Titanium (see the sketch after this list)
 “Minor” extensions to existing languages
 More convenient than MPI
 Have performance transparency via explicit remote memory
references
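As an illustration of the GAS style listed above, here is a minimal UPC sketch added for this write-up (not taken from the slides; the array size and names are arbitrary): one shared array is distributed across all threads, and a remote element is read with plain array syntax, so the remote memory reference is explicit in the source rather than being packaged into MPI messages.

#include <upc.h>
#include <stdio.h>

#define N 1024

shared double a[N];   /* one logical array, distributed cyclically over all threads */

int main(void)
{
    int i;

    /* each thread initialises only the elements it owns (affinity &a[i]) */
    upc_forall (i = 0; i < N; i++; &a[i])
        a[i] = (double) i;

    upc_barrier;

    /* thread 0 reads a remote element with ordinary array syntax; the
     * remote reference is visible directly in the source code */
    if (MYTHREAD == 0)
        printf("a[N-1] = %.1f, owned by thread %d of %d\n",
               a[N - 1], (int) upc_threadof(&a[N - 1]), THREADS);

    return 0;
}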
The critical cycle of prototyping, assessment, and commercialization must be a long-term, sustained investment, not a one-time crash program.
European initiatives
 Grid Computing: EGEE
 HPC: DEISA
 Access to HPC: HPC-EUROPA
 HPC: PRACE
DEISA Consortium
DEISA started on 1 May 2004 with eight partners (CSC, CINECA, EPCC, ECMWF, FZJ, IDRIS, RZG, SARA)
Three new partners joined later (BSC, HLRS, LRZ)
eDEISA started on 1 June 2005
DEISA ended on 30 April 2008
eDEISA ended on 31 May 2008
DEISA2 started on 1 May 2008
DEISA Objectives
To enable Europe’s terascale science by
the integration of Europe’s most
powerful supercomputing systems.
Enabling scientific discovery across a
broad spectrum of science and
technology is the only criterion for
success
DEISA is a European supercomputing service built on top of existing national services.
DEISA deploys and operates a
persistent, production quality, distributed
supercomputing environment with
continental scope.
DEISA Extreme Computing Initiative: DECI
This initiative consists of identifying, deploying and operating a small number of "flagship" applications of the project, in all areas of science and technology.
These leading, ground-breaking applications must deal with complex, demanding, innovative simulations that would not be possible without the DEISA infrastructure, and which would benefit, if accepted, from the exceptional resources of the Consortium.
Projects of this kind will be chosen on the basis of innovation potential, scientific excellence and relevance criteria. They will require an excellence label from the national evaluation committees of at least two partner organizations.
DEISA 2:
May 2008 – April 2011
Grid Computing and Computational Science: HPC-Europa
HPC-Europa is a consortium of six leading HPC infrastructures and five centres of excellence aiming at the integrated provision of advanced computational services to the European research community working at the forefront of science. The services will be delivered across a broad spectrum, both in terms of access to HPC systems and of provision of a suitable computational environment, to allow European researchers to remain competitive with teams elsewhere in the world. Moreover, Joint Research and Networking actions will contribute to fostering a culture of cooperation and to generating critical mass for computational science.
HPC-Europa TA Sites
TA Management activity: EPCC
Participants and Roles
 CINECA
 EPCC
 BSC
 HLRS
 CNES
 SARA
 SCS
TA Service Provision Model
Who can apply for an access grant?
 European researchers who need HPC resources and are collaborating with local researchers in their field and with experts in supercomputing support
How long can the visit be?
 From one to twelve weeks
How to apply?
 Continuous call for proposals over 4 years
 Evaluations every 3 months
 International Scientific Selection Panel
PRACE
Partnership for Advanced Computing in Europe
Partnership at Member State level
Italy is represented by CINECA, delegated by the MUR, in collaboration with CNR / INFM
A €10 million EU-funded project to define the implementation plan
Member States provide in-kind funding for the implementation of the computing infrastructure; the EU co-funds transnational access
PRACE: Workplan outline
 Preparation of the RI as a single legal entity
 Legal form and governance structure, funding, procurement, and usage
strategy, Peer Review process
HPC Ecosystem links: European and national HPC infrastructures e.g. DEISA,
HPC-Europa, the ESFRI projects, EGEE and EGI, communities, vendors and user
industries, …
Prepare operation of petascale systems in 2009/2010
Deployment and benchmarking of prototypes
Porting, optimising, petascaling of applications
Start a process of technology development and assessment for future multi-petascale systems
PRACE will cooperate with other EU projects in these areas
Utilising existing technologies e.g. from DEISA
Confirmed: DEISA, HPC-Europa, EGI
Conclusion: Hints
“From Molecule to Man”
P. Sloot, ViroLab, HPC 2008, Cetraro
“Serial computing is dead, and the parallel computing revolution has begun:
Are you part of the solution, or part of the problem?”
Dave Patterson, UC Berkeley, Usenix conference June 2008
State-of-the-art HPC infrastructures are fundamental to supporting scientific research and to advancing science at the European level.