HPC, Distributed Computing and Scientific Applications
Giovanni Erbacci
Supercomputing Group – CINECA
[email protected]
Training course "Calcolo Parallelo ad Alte Prestazioni" (High-Performance Parallel Computing)
Catania, 29 September – 6 October 2008
www.cineca.it

Outline

- Computational science
- Evolution of HPC systems: processors, interconnects
- Supercomputing and Grid infrastructures
- The software challenge

Computational Science

Computational science, together with theory and experimentation, now constitutes the "third pillar" of scientific inquiry, enabling researchers to build and test models of complex phenomena. It brings to full fruition the visions of:

- Owen Willans Richardson (1920s), Nobel Prize in Physics 1928 "for his work on the thermionic phenomenon and especially for the discovery of the law named after him"
- John von Neumann (1940s)
- Kenneth Wilson (1980s), Nobel Prize in Physics 1982 "for his theory for critical phenomena in connection with phase transitions"
- Walter Kohn, Nobel Prize in Chemistry 1998 "for his development of the density-functional theory"
- John A. Pople, Nobel Prize in Chemistry 1998 "for his development of computational methods in quantum chemistry"

Grand Challenges: today

- Simulation of complete engineering systems
- Simulation of complete biological systems
- Astrophysics
- Materials science
- Bioinformatics, proteomics, pharmacogenetics
- Design of new materials, atom by atom
- Miniaturisation of devices down to the quantum level
- Scientifically accurate 3D functional models of the human body
- Biocomplexity and biodiversity
- Atmospheric and climate research
- Digital libraries for science and engineering

These challenges generate new knowledge that crosses traditional disciplinary boundaries. Computational science plays a fundamental role in addressing the 21st century's most important problems, which are predominantly multidisciplinary, multisector and collaborative.

How Has Innovation Changed?

- Instantaneous communication
- Geographically distributed work
- Increased productivity
- More data everywhere
- Increasing problem complexity
- Innovation happens worldwide

More data everywhere: radar, satellites, CAT scans, weather models, the human genome. The size and resolution of the problems scientists address today are limited only by the size of the data they can reasonably work with. There is a constantly increasing demand for faster processing of bigger data.

Increasing problem complexity: partly driven by the ability to handle bigger data, but also by the requirements and opportunities brought by new technologies. For example, new kinds of medical scans create new computational challenges.

Increasing Complexity

As technology allows customers to handle bigger datasets and faster computations, they push to solve harder problems. In turn, the new class of problems drives the next cycle of technology innovation. Crash-testing example: originally, crash a car and observe what happens to the test dummy. Next level: save the cost of cars and dummies and do some of that testing through simulation. Now: determine what happens to the capillaries around your lungs when the air bag goes off.

[Figure: crash dummy → e-crash dummy → organ damage]

Biosciences

[Figure: NFkB transcription factor (blue) in complex with DNA (orange); a 13 ns molecular dynamics simulation of this complex performed with NAMD]
[Figure: gas and star distribution in a cluster of galaxies, from a cosmological simulation]

[Figure: Vesuvius sub-Plinian eruption]

High Performance Computing: today

[Diagram: compute (TeraFlop/s), data (TeraBytes), visualise (TeraPixels)]

- Greater integration: logical, not necessarily physical, integration; Grid computing
- Big power, big data, big insight
- Training, co-operation, maintenance

It is impossible to stay competitive without a continuous innovation effort. Focus on the applications.

Top 500: performance trends

[Figure: performance development over time in the Top 500 list]

The top supercomputers in the Top 500 (June 2008 list, http://www.top500.org/; Rmax and Rpeak in TFlop/s):

1. DOE/NNSA/LANL, United States: BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband (IBM). 122400 cores, Rmax 1026.0, Rpeak 1375.78.
2. DOE/NNSA/LLNL, United States: eServer Blue Gene Solution (IBM). 212992 cores, Rmax 478.2, Rpeak 596.38.
3. Argonne National Laboratory, United States: Blue Gene/P Solution (IBM). 163840 cores, Rmax 450.3, Rpeak 557.06.
4. Texas Advanced Computing Center / Univ. of Texas, United States: SunBlade x6420, Opteron Quad 2 GHz, Infiniband (Sun Microsystems). 62976 cores, Rmax 326.0, Rpeak 503.81.
5. DOE/Oak Ridge National Laboratory, United States: Cray XT4 QuadCore 2.1 GHz (Cray Inc.). 30976 cores, Rmax 205.0, Rpeak 260.2.
6. Forschungszentrum Juelich (FZJ), Germany: Blue Gene/P Solution (IBM). 65536 cores, Rmax 180.0, Rpeak 222.82.

Some facts from the latest Top 500 list

- 1976: the Cray-1 installed at Los Alamos: peak performance 160 MegaFlop/s (10^6 flop/s)
- 1993 (1st edition of the Top 500): the no. 1 system reached 59.7 GigaFlop/s (10^9 flop/s)
- 1997: the TeraFlop/s barrier (10^12 flop/s)
- 2008: the PetaFlop/s barrier (10^15 flop/s)

Roadrunner is a hybrid system connecting 6562 dual-core AMD Opteron processors with 12240 IBM Cell processors (originally designed for video-game platforms such as the Sony PlayStation 3) acting as accelerators. The whole system comprises 122400 cores and 98 TeraBytes of RAM, interconnected via Infiniband 4x DDR.
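The Rpeak column in the list above is pure arithmetic: cores × clock frequency × floating-point operations per cycle per core. A minimal C sketch of that calculation follows (the function name rpeak_gflops is illustrative, and the 4 flops/cycle figure is an assumption appropriate for the quad-core Opteron of entry 5; other micro-architectures differ):

    #include <stdio.h>

    /* Theoretical peak performance in GFlop/s:
       cores x clock (GHz) x floating-point operations per cycle per core. */
    static double rpeak_gflops(long cores, double clock_ghz, int flops_per_cycle)
    {
        return (double)cores * clock_ghz * (double)flops_per_cycle;
    }

    int main(void)
    {
        /* Entry 5 above: Cray XT4, 30976 Opteron cores at 2.1 GHz,
           assuming 4 double-precision flops per cycle per core */
        printf("Rpeak = %.1f TFlop/s\n",
               rpeak_gflops(30976, 2.1, 4) / 1000.0);   /* prints 260.2 */
        return 0;
    }

The result matches the table. Rmax, by contrast, is measured: it is the best Linpack performance actually achieved, which is why Roadrunner's 1026 TFlop/s Rmax sits well below its 1375.78 TFlop/s Rpeak.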
CINECA's ranking in the Top 500

[Figure: CINECA systems in the Top 500, June 1993 – November 2006: Cray Y-MP 4pe, Cray C90 2pe, Cray T3D 64pe and 128pe, Cray T3E 128pe, Cray T3E 1200 128pe and 256pe, IBM SP4 512pe, IBM CLX 512pe, IBM BCC 1024 core, IBM CBC 5120 core]

Top 500 in Italy

[Figure: performance (GFlop/s, log scale) of the no. 1 and no. 500 systems in the Top 500 and of CINECA's top system, 1993–2007, with trend lines]

Capability Computing: trend per country

[Figure: capability trend per country, 2000–2008, TFlop/s on a log scale; leading systems for the USA (ASCI White, ASCI Q, DOE BGL, Blue Gene/L and Blue Gene/P at LLNL), Japan (KEK, Tokyo U., Earth Simulator), Germany (DWD, RZG, HLRS, LRZ, FZJ), the UK (CSAR, HPCx, Hector), France (Meteo France, CNES, CEA, France Ministry), Spain (CEPBA, Galizia U., Mare Nostrum) and Italy (CINECA)]

Microprocessor-based Architectures

Instruction-level parallelism (ILP): these systems are not really serial
- deeply pipelined processors
- multiple functional units

Processor taxonomy:
- Out-of-order superscalar (Alpha, Pentium 4, Opteron): the hardware dynamically schedules instructions, determining dependences and dispatching instructions; many instructions can be in flight at once.
- VLIW (Itanium): the processor issues a fixed-size "bundle" of instructions each cycle; bundles are tailored to the mix of available functional units; the compiler pre-determines which instructions execute in parallel.

Complex memory hierarchy:
- non-blocking, multi-level caches
- TLB (Translation Lookaside Buffer)
(a short loop-ordering illustration of why the memory hierarchy matters appears after the programming-models discussion below)

Commodity Interconnects

- Gigabit Ethernet
- Myrinet
- Infiniband
- QsNet
- SCI

The Real Crisis with HPC is with the Software

Programming is stuck: arguably it hasn't changed since the '70s, and it is time for a change. Complexity is rising dramatically: highly parallel and distributed systems, from 10 to 100 to 1,000 to 10,000 to 100,000 processors, and multidisciplinary applications. A supercomputer application and its software are usually much longer-lived than the hardware: hardware lives typically five years at most, while Fortran and C remain the main programming models. Software is a major cost component of modern technologies, yet the tradition in HPC system procurement is to assume that the software is free.

Some Current Unmet Needs

- Performance / portability
- Fault tolerance
- Better programming models: a global shared address space with visible locality

Maybe coming soon (since they are incremental, yet offer real benefits): Global Address Space (GAS) languages such as UPC, Co-Array Fortran and Titanium. They are "minor" extensions to existing languages, more convenient than MPI, and give performance transparency via explicit remote memory references (see the sketch below). The critical cycle of prototyping, assessment and commercialization must be a long-term, sustained investment, not a one-time crash program.
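To make the GAS idea concrete, here is a minimal sketch in UPC, one of the languages just named (illustrative only, not from the original slides; it assumes a UPC compiler such as Berkeley UPC, built with a static thread count, e.g. upcc -T 4). One shared array is physically distributed across threads, yet any thread can name any element, so remote accesses stay visible in the source instead of being hidden behind message-passing calls:

    #include <upc.h>
    #include <stdio.h>

    #define N 1024

    /* One logically shared array: element i has affinity to thread
       i % THREADS, but every thread can reference every element. */
    shared double x[N], y[N];

    int main(void)
    {
        int i;

        /* each thread initialises the elements it owns */
        upc_forall (i = 0; i < N; i++; &x[i])
            x[i] = (double)i;

        upc_barrier;

        /* x[i-1] and x[i+1] may live on another thread: the remote
           reads are explicit array references, so locality is visible */
        upc_forall (i = 1; i < N - 1; i++; &y[i])
            y[i] = 0.5 * (x[i - 1] + x[i + 1]);

        upc_barrier;
        if (MYTHREAD == 0)
            printf("y[1] = %f\n", y[1]);
        return 0;
    }

An equivalent MPI version would need explicit sends and receives (or one-sided puts/gets) for the boundary elements; here the compiler and runtime generate the communication, while the affinity expressions (&x[i], &y[i]) keep most iterations local.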
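And, referring back to the memory-hierarchy point on the processor slide above: a small self-contained C experiment (an illustration under the stated assumptions, not from the original slides) in which the same arithmetic is performed in two traversal orders. In C, a[i][j] and a[i][j+1] are adjacent in memory, so only the row-major loop walks the array with unit stride and reuses cache lines and TLB entries:

    #include <stdio.h>
    #include <time.h>

    #define N 2000

    static double a[N][N];   /* 32 MB, zero-initialised */

    int main(void)
    {
        int i, j;
        double sum = 0.0;
        clock_t t0, t1, t2;

        t0 = clock();
        for (i = 0; i < N; i++)          /* cache-friendly: unit stride */
            for (j = 0; j < N; j++)
                sum += a[i][j];
        t1 = clock();
        for (j = 0; j < N; j++)          /* cache-hostile: stride of N doubles */
            for (i = 0; i < N; i++)
                sum += a[i][j];
        t2 = clock();

        printf("row-major %.3f s, column-major %.3f s (sum = %g)\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC, sum);
        return 0;
    }

On a typical processor of this era the column-major loop is several times slower; the exact ratio depends on cache line, cache and page sizes.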
European Initiatives

- Grid computing: EGEE
- HPC: DEISA
- Access to HPC: HPC-EUROPA
- HPC: PRACE

DEISA Consortium

DEISA started on 1 May 2004 with eight partners (CSC, CINECA, EPCC, ECMWF, FZJ, IDRIS, RZG, SARA); three new partners joined later (BSC, HLRS, LRZ). eDEISA started on 1 June 2005. DEISA ended on 30 April 2008 and eDEISA on 31 May 2008; DEISA2 started on 1 May 2008.

DEISA Objectives

To enable Europe's terascale science by integrating Europe's most powerful supercomputing systems. Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success. DEISA is a European supercomputing service built on top of existing national services: it deploys and operates a persistent, production-quality, distributed supercomputing environment with continental scope.

DEISA Extreme Computing Initiative: DECI

This initiative consists in identifying, deploying and operating a small number of "flagship" applications of the project, in all areas of science and technology. These leading, ground-breaking applications must deal with complex, demanding, innovative simulations that would not be possible without the DEISA infrastructure and that, if accepted, benefit from the exceptional resources of the Consortium. Projects of this kind are chosen on the basis of innovation potential, scientific excellence and relevance; they require an excellence label from the national evaluation committees of at least two partner organizations. DEISA2 runs from May 2008 to April 2011.

Grid Computing and Computational Science: HPC-Europa

HPC-Europa is a consortium of six leading HPC infrastructures and five centres of excellence aiming at the integrated provision of advanced computational services to the European research community working at the forefront of science. The services cover a broad spectrum, both in terms of access to HPC systems and of provision of a suitable computational environment, allowing European researchers to remain competitive with teams elsewhere in the world. Moreover, joint research and networking actions will help foster a culture of cooperation and generate critical mass for computational science.

HPC-Europa Transnational Access (TA) Sites

TA management activity: EPCC. Participants and roles: CINECA, EPCC, BSC, HLRS, CNES, SARA, SCS.

TA Service Provision Model

- Who can ask for an access grant? European researchers who need HPC resources and who collaborate with local researchers in their field and with supercomputing support experts.
- How long can the visit be? From one to twelve weeks.
- How to apply? A continuous call for proposals over four years, with evaluations every three months by an international scientific selection panel.

PRACE: Partnership for Advanced Computing in Europe

A partnership at member-state level; Italy is represented by CINECA, delegated by the MUR, in collaboration with CNR/INFM. A €10 million EU-funded project defines the implementation plan; the member states fund the computing infrastructure in kind, while the EU co-funds transnational access.

PRACE: Work Plan Outline

- Preparation of the research infrastructure (RI) as a single legal entity: legal form and governance structure; funding, procurement and usage strategy; peer-review process.
- HPC ecosystem links: European and national HPC infrastructures, e.g. DEISA, HPC-Europa, the ESFRI projects, EGEE and EGI, communities, vendors and user industries, ...
- Prepare the operation of petascale systems in 2009/2010: deployment and benchmarking of prototypes; porting, optimising and petascaling of applications.
- Start a process of technology development and assessment for future multi-petascale systems.

PRACE will cooperate with other EU projects in these areas, utilising existing technologies, e.g. from DEISA. Confirmed: DEISA, HPC-Europa, EGI.

Conclusion: Hints

"From Molecule to Man" (P. Sloot, ViroLab, HPC 2008, Cetraro)

"Serial computing is dead, and the parallel computing revolution has begun: are you part of the solution, or part of the problem?" (Dave Patterson, UC Berkeley, USENIX conference, June 2008)

State-of-the-art HPC infrastructures are fundamental to support scientific research and to advance science at the European level.