New CSC computing resources


Disk environment

[Diagram: the new disk layout as seen from taito.csc.fi, sisu.csc.fi and your workstation, showing the $HOME, $WRKDIR, $ARCHIVE, $TMPDIR and $USERAPPL directories.]

Disks: 2.4 PB on DDN storage.
- New HOME directory on Lustre.
- WRKDIR: not backed up; soft quota 5 TB.
- $ARCHIVE: 1-5 TB per user, common between the Cray and HP systems and the Espoo machines.
- Disk space through IDA: 1 PB for universities, 1 PB for the Academy of Finland (SA), 1 PB shared between SA and ESFRI; an additional 3 PB available later on.
- /tmp (around 2 GB) to be used for compiling codes.

Moving files, best practices
- tar & bzip first, then use rsync, e.g.
  rsync -P username@hippu1.csc.fi:/tmp/huge.tar.gz .
- Blowfish may be faster than AES (the CPU is the bottleneck).
- Funet FileSender (max. 50 GB): https://filesender.funet.fi - files can also be downloaded with wget.
- Coming: iRODS, a batch-like process, staging, IDA.
- CSC can help to tune e.g. TCP/IP parameters: http://www.csc.fi/english/institutions/funet/networkservices/pert
- The FUNET backbone runs at 10 Gbit/s.
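
As a concrete sketch of the packing-and-transfer advice above (the host name comes from the slides; the file names, and whether your SSH version still offers the blowfish cipher, are assumptions to verify locally):

    # Pack many small files into one compressed archive first
    tar -cf huge.tar myresults/
    bzip2 huge.tar                       # produces huge.tar.bz2

    # Resumable copy with progress reporting (-P = --partial --progress)
    rsync -P huge.tar.bz2 username@hippu1.csc.fi:/tmp/

    # If encryption saturates the CPU, try a lighter cipher; the exact
    # name (blowfish vs. blowfish-cbc) depends on the SSH version
    rsync -P -e "ssh -c blowfish" huge.tar.bz2 username@hippu1.csc.fi:/tmp/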

Cray and HP environments

Typical Cray environment:
- Compilers: Cray, Intel and GNU.
- Debuggers: TotalView (tokens shared between the HP and Cray systems).
- Cray MPI and Cray-tuned versions of all the usual libraries.
- SLURM batch queue system (a minimal batch-script sketch follows at the end of this section).
- Module system similar to Louhi's.
- Default shell is now bash (previously tcsh).

HP environment:
- Compilers: Intel and GNU.
- MPI libraries: Intel MPI and mvapich2; OpenMP is supported as well.
- Batch queue system: SLURM.
- New, more robust module system: only compatible modules are shown by module avail; use module spider to see them all.
- Default shell is now bash (used to be tcsh).
- Disk system changes.

Core development tools
- Intel XE Development Tools: C/C++ (icc) and Fortran (ifort) compilers, Cilk; profilers and trace utilities (VTune, Thread Checker, MPI Checker); the MKL numerical library; the Intel MPI library (only on the HP).
- Cray Application Development Environment.
- GNU Compiler Collection.
- TotalView debugger.

[Chart: DGEMM 1000x1000 single-core performance on an Opteron Barcelona 2.3 GHz (Louhi) vs. a Sandy Bridge 2.7 GHz, comparing ATLAS 3.8, ATLAS 3.10, ACML 5.2 and 4.4.0, ifort 12.1 matmul, MKL and LibSci.]
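
Since both machines use SLURM, a minimal batch script could look like the sketch below; the partition name, module name and resource limits are illustrative assumptions, not the actual Sisu/Taito configuration:

    #!/bin/bash
    #SBATCH --job-name=myjob
    #SBATCH --partition=parallel        # partition/queue names are site-specific
    #SBATCH --ntasks=32                 # number of MPI tasks
    #SBATCH --time=01:00:00             # wall-clock limit (hh:mm:ss)

    module load gromacs                 # hypothetical module name
    srun ./my_mpi_program               # SLURM starts the MPI tasks

The job would be submitted with sbatch job.sh and monitored with squeue -u $USER.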

The chart reports GFlop/s on RedHat 6.2 RPM builds. MKL is the best choice on Sandy Bridge for now; on the Cray, LibSci will likely be a good alternative.

Compilers and programming
- Intel: Intel Cluster Studio XE 2013, http://software.intel.com/en-us/intel-cluster-studio-xe
- GNU: the GNU compilers, e.g. GCC 4.7.2, http://gcc.gnu.org
- Intel tools can be used together with GNU, e.g. gcc or gfortran + MKL + Intel MPI.
- The mvapich2 MPI library is also supported; it can be used with either Intel or GNU.

Available applications
- Ready at Taito: Gromacs, NAMD, Gaussian, Turbomole, Amber, CP2K, Elmer, VASP.
- Ready at Sisu: Gromacs, GPAW, Elmer, VASP.
- CSC offers some 240 scientific applications, and porting them all is a big task. Most if not all of those on Vuori should become available, some installations upon request. Do you have priorities?

Porting strategy
- At least recompile: legacy binaries may run, but not optimally.
- Intel compilers are preferred for performance.
- Use Intel MKL or Cray LibSci, not ACML: http://software.intel.com/sites/products/mkl
- Use compiler flags, e.g. -xHost -O2 (includes -xAVX).
- Explore optimal thread and task placement, both intra-node and inter-node.
- Refactor the code if necessary (OpenMP/MPI workload balance) and rewrite any SSE assembler or intrinsics.
- The HPC Advisory Council publishes best practices for many codes: http://www.hpcadvisorycouncil.com/subgroups_hpc_works.php
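
For instance, a recompile along these lines might look as follows (file names are placeholders; -mkl is the Intel compilers' shortcut for linking MKL, and -xHost selects the highest instruction set of the build host, i.e. AVX on Sandy Bridge):

    # Intel compilers, host-tuned optimization, MKL for BLAS/LAPACK
    ifort -O2 -xHost -mkl solver.f90 -o solver
    icc   -O2 -xHost -mkl solver.c   -o solver_c

    # GNU alternative; for the MKL link line consult Intel's link-line advisor
    gfortran -O2 -march=native solver.f90 -o solver_gnu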

Vision 2015: CSC is a pioneer in the development of ICT services.

[Map: The Green Grid (2009) estimate of water-side economizer hours for data centres - the number of available hours per year where the wet-bulb temperature is below 10 °C (50 °F), binned from 2,000 to 7,500 hours.]

[Map: Funet connections (YHTEYDET 2010) - the IP and fibre network and its access points across Finland (Sodankylä, Rovaniemi, Kokkola, Kuopio, Vaasa, Seinäjoki, Jyväskylä, Tampere, Lappeenranta, Hämeenlinna, Lahti, Kouvola, Turku, Kotka, Helsinki, Espoo), with international links to NORDUnet, GLORIAD and RUNNet.]

[Map: FinGrid power distribution - the national 400 kV, 220 kV and 110 kV grid (Lieksa, Alapitkä, Ventusneva, Tuovila, Uimaharju, Kontiolahti, Kristiina, Meri-Pori, Olkiluoto, Rauma and other substations).]

[Photos: the machine hall, and the housing of the Sisu Cray supercomputer.]

Cloud computing in Taito

Three service models of cloud computing: software (SaaS), operating systems (PaaS), and computers and networks (IaaS).

Example: virtualization in Taito. The Taito cluster has two types of nodes, HPC and cloud. The cloud nodes run a host OS (RHEL) with virtual machines on top; the guest OS can be e.g. Ubuntu or Windows.

Traditional HPC vs. IaaS:
- Operating system: traditional HPC - the same for all, CSC's cluster OS; cloud - chosen by the user.
- Software installation: traditional HPC - done by cluster administrators, and customers can install software only in their own directories (no administrative rights); cloud - installed by the user, who has admin rights.
- User accounts: traditional HPC - managed by CSC's user administrator; cloud - managed by the user.
- Security: traditional HPC - CSC administrators manage the common software and the OS, e.g. software patches; cloud - the user has more responsibility, e.g. patching of running machines.
- Running jobs: traditional HPC - jobs must be submitted via the cluster's batch scheduling system (BSS); cloud - the user is free to use a BSS or not.
- Environment: traditional HPC - changes to software libraries and compilers happen; cloud - the user decides on changes and versions.
- Snapshot of the environment: traditional HPC - not possible; cloud - the environment can be saved as a virtual machine image.
- Performance: traditional HPC - performs well for a variety of tasks; cloud - very small virtualization overhead for most tasks, but heavily I/O-bound and MPI tasks are affected more.

[Cover slide: binary-art background. New CSC computing resources - Atte Sillanpää and Nino Runeberg, CSC - IT Center for Science.]

Outline
- CSC at a glance
- The new Kajaani Data Centre
- Finland's new supercomputers: Sisu (Cray XC30) and Taito (HP cluster)
- CSC resources available for researchers

CSC's services: Funet services, computing services, application services, data services for science and culture, and information management services. Customers include universities, polytechnics, ministries, the public sector, research centres and companies.

FUNET provides connections to all higher education institutions in Finland and for 37 state research institutes and other organizations.

New systems

               Phase 1                          Phase 2
               Cray          HP                 Cray           HP
Deployment     December      now                mid-2014       mid-2014
CPU            Intel Sandy Bridge,              next-generation processors
               8 cores, 2.6 GHz
Interconnect   Aries         FDR InfiniBand     Aries          EDR InfiniBand
                             (56 Gbps)                         (100 Gbps)
Cores          11,776        9,216              ~40,000        ~17,000
Tflops         244           180                1,700          515
               (2x Louhi)    (5x Vuori)         (16x Louhi)    (15x Vuori)
Tflops total   424 (3.6x Louhi)                 2,215 (20.7x Louhi)

IT summary
- Cray XC30 supercomputer (Sisu): the fastest computer in Finland. Phase 1: 385 kW, 244 Tflop/s; very high density, large racks.
- T-Platforms prototype (Phase 2): a very high-density, hot-water-cooled rack with Intel processors and Intel and NVIDIA accelerators; theoretical performance about 400 Tflop/s.
- HP cluster (Taito): 1,152 Intel CPUs, 180 Tflop/s, 30 kW 47U racks.
- DataDirect Networks HPC storage: petabytes of fast parallel storage, supporting both the Cray and HP systems.

Features
- Cray XC30: a completely new system design, a departure from the XT design of 2004; the first Cray with Intel CPUs; high-density water-cooled chassis (1,200 cores per chassis); the new Aries interconnect.
- HP cluster: modular SL-series systems; Mellanox FDR (56 Gbps) interconnect.

What's new in the CSC systems: Sandy Bridge CPUs move from 4 to 8 cores per socket (about 2.3x Louhi's flops per socket) and add 256-bit SIMD instructions (AVX).

[Chart: CSC computing capacity 1989-2012, continued - systems include the Cray X-MP/416, Convex C220, IBM SP1, IBM SP2 (64 proc), Cray C94, Cray T3E (224 proc, expanded to 512), SGI Origin 2000, IBM SP Power3 (Federation HP switch), a Compaq Alpha cluster, Sun Fire 25K, Murska (HP CP4000BL Proliant 465c DC AMD) and an HP DL145 Proliant (192 proc), with decommissionings marked, e.g. IBM SP2 1/1998, Cray T3E 12/2002, Clux and Hiisi 2/2005, Murska 6/2012.]

[Chart: Top500 ratings of CSC systems, 1993-2012 (the Top500 list, http://www.top500.org, was started in 1993). Placings between roughly 50 and 500 are shown for the Cray X-MP, Convex C3840, IBM SP1, Cray C94, SGI Power Challenge, IBM SP2, SGI Origin 2000, Cray T3E, IBM SP Power3, IBM eServer p690, Digital AlphaServer, Cray XT4/XT5, HP Proliant 465c DC/6C, HP Proliant SL230s and Cray XC30.]

CSC's data centres: DC1 (2005): 500 kW, PUE 1.62. DC2 (2008): 800 kW, PUE 1.38. DC3 (2012): x MW, PUE about 1.2.

Future directions in parallel programming
- The MPI 3 standard is being finalized: asynchronous collective communication etc.
- Partitioned Global Address Space (PGAS): data sharing via global arrays, finally starting to see decent performance. Most mature: Unified Parallel C, Co-Array Fortran (in Fortran 2008) and OpenSHMEM.
- Task- and dataflow-based parallel models split the work into a graph (DAG) of tasks; examples are SmpSs, DAGuE and StarPU.

Currently available computing resources
- Louhi: for massive computational challenges; more than 10,000 cores, 11 TB of memory, theoretical peak performance over 100 Tflop/s.
- HP cluster Vuori: for small and medium-sized tasks; theoretical peak performance over 40 Tflop/s.
- Application server Hippu: interactive usage without a job scheduler; postprocessing, e.g. visualization.
- FGI.

Novel resources at CSC
- In production, available to all Finnish researchers: Vuori with 8 Tesla GPU nodes; FGI with 88 GPUs (44 Tesla 2050, 44 Tesla 2090), the GPU nodes located at HY, Aalto, ÅA and TTY.
- In testing, primarily for CSC experts: Tunturi, a Sandy Bridge node cluster (porting to the AVX instruction set); Mictest, an Intel MIC prototype node (several beta cards).

Old capacity decommissions: Louhi will be decommissioned once the new Cray is up and running (quite probably a fairly short overlap), and Vuori at the end of 2013. Think ahead about your data transfers.

Funet and data services

Network services: light paths; network security (Funet CERT); eduroam wireless network roaming; Haka identity management; campus support; the NORDUnet network.

Data services: digital preservation and Data for Research (TTA); the National Digital Library (KDK); international collaboration via the EU projects EUDAT, APARSEN, ODE and SIM4RDM. Database and information services: the Paituli GIS service; nic.funet.fi (freely distributable files, on FTP since 1990); CSC Stream; database administration services. Memory organizations served: Finnish university and polytechnic libraries, the Finnish National Audiovisual Archive, the Finnish National Archives and the Finnish National Gallery.

Current HPC system environment

              Louhi                   Vuori
Type          Cray XT4/XT5            HP cluster
DOB           2007                    2010
Nodes         1,864                   304
CPU cores     10,864                  3,648
Performance   110 Tflop/s             34 Tflop/s
Total memory  11 TB                   -
Interconnect  Cray SeaStar, 3D torus  QDR IB, fat tree

[Chart: CSC computing capacity 1989-2012, standardized maximum capacity and capacity used - from the Cray X-MP through the two Compaq Alpha servers (Lempo and Hiisi), the IBM eServer Cluster 1600, the Cray T3E, the Cray XT4 (DC and QC), and various HP Proliant systems (CP4000, DL145, SL230s) to the Cray XC30.]

Kajaani data centre

[Floor plan: technical area with electrical rooms (3,000 m2), transformers, UPS, Funet cross-connections, storage rooms and tape robotics.]

SGI MDC (modular data centre): starting with one head unit and two expansion modules, extra capacity can be added by introducing more expansion units. Thanks to dozens of automated cooling fans, the energy needed for cooling can be adjusted very accurately as IT capacity is increased gradually.

"Our baby in Italy": the internal temperature setpoint is 27 °C (ASHRAE), with an occasionally ASHRAE-tolerated 27-30 °C during possible summer heat waves. As long as outdoor temperatures stay below 28 °C, the unit does nothing but free cooling; during heat waves, extra water and some chillers may be needed. During winter, the exhaust warm air is re-circulated to warm up the incoming air.

Data centre specification
- 2.4 MW combined hybrid capacity.
- 1.4 MW modular free-air-cooled data centre, upgradable in 700 kW factory-built modules; order to acceptance in 5 months.
- 35 kW per extra-tall rack (12 kW is common in industry).
- PUE forecast below 1.08 (pPUE).
- 1 MW HPC data centre optimised for the Cray supercomputer and the T-Platforms prototype; 90 % water cooling.

Interconnects
- Performance improvements: latency, bandwidth, collectives, one-sided communication.
- New topologies: Cray Dragonfly (islands of 2D meshes), HP (islands of fat trees).

Cray Dragonfly topology (source: Robert Alverson, Cray, Hot Interconnects 2012 keynote):
[Diagram: a two-dimensional all-to-all network within each group, an all-to-all network between groups, and optical uplinks to the inter-group network.]

A list of the first codes to be ported is available; others will be added later, some upon request.

User accounts
- HP: recent Vuori users are moved automatically.
- Cray: recent Louhi users are moved automatically.
- Others: request an account from usermgr@csc.fi, including your current contact information.

Grand Challenges
- The normal GC call is out at the end of December 2012: the new CSC resources are available for a year, with no lower limit on the number of cores.
- A special GC call, mainly for the Cray, is also out at the end of December 2012: the possibility of short (a day or less) runs with the whole Cray.
- What do you need? Remember also PRACE and DECI (the DECI-10 call is open): http://www.csc.fi/english/csc/news/customerinfo

Accelerators
- Add-on processors; graphics processing units are the de facto accelerator technology today.
- Many lightweight cores with fast memory; increasingly influencing general-purpose computing.

[Diagram: a node with an accelerator and its fast memory, attached to the InfiniBand fabric. Chart: evolution of CPU vs. GPU performance, memory bandwidth and energy efficiency, 2008-2011.]

Cloud: biomedical pilot cases
- Several pilots (15), with users from several institutions, e.g. the University of Helsinki, the Finnish Institute for Molecular Medicine and the Technical University of Munich.
- Many different usage models, e.g. extending an existing cluster, services run on CSC IaaS by a university IT department for its end users, and SaaS for end users.

NX for remote access
- Optimized remote desktop access: near-local application responsiveness even over high-latency, low-bandwidth links.
- Customized launch menus offer direct access to CSC-supported applications.
- A working session can be saved and restored at the next login.
- Further information: http://www.csc.fi/english/research/software/freenx

[Screenshot: an NX desktop session on nxlogin3.csc.fi running Vim and a visualization tool.]

During and after pilot usage, please share your makefiles and optimization experiences in the wiki. Intel's optimization guides:
http://software.intel.com/sites/default/files/m/d/4/1/d/8/optaps_cls.pdf
http://software.intel.com/sites/default/files/m/d/4/1/d/8/optaps_for.pdf

Modules
- Some software installations conflict with each other, for example different versions of programs and libraries; modules make it possible to install conflicting packages on a single system.
- The user selects the desired environment and tools with module commands, and can also do so on the fly.

Taito module system
- The old module system has not been actively maintained for a long time. Robert McLay has written a new, modified version from scratch: more robust, slightly different internal logic, and actively developed (bug fixes plus new features such as support for accelerators).

Key differences, Taito vs. Vuori (see the command sketch at the end of this section):
- module avail shows only those modules that can be loaded in the current setup, with no conflicts or extra dependencies; use module spider to list all installed modules and to resolve conflicts and dependencies.
- There are no PrgEnv modules; changing the compiler module also switches MPI and the other compiler-specific modules.

Disks at Espoo
[Diagram: the current disk layout for hippu3.csc.fi, hippu4.csc.fi, louhi.csc.fi and vuori.csc.fi, which share $ARCHIVE.]
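
In practice the Taito module workflow looks roughly like this (the module names are examples only):

    module avail                # only modules loadable in the current setup
    module spider               # every installed module
    module spider gromacs       # versions of one package and their prerequisites

    # Swapping the compiler module also swaps MPI and the other
    # compiler-specific modules automatically
    module swap intel gcc       # hypothetical module names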

Sisu & Taito vs. Louhi & Vuori vs. FGI vs. a local cluster

              Sisu & Taito          Louhi & Vuori          FGI             Local cluster
Availability  1Q-2Q 2013            available              available       available
CPU           Intel Sandy Bridge    AMD Opteron (2.3 GHz   Intel Xeon      AMD Opteron,
              Xeon E5-2670,         Barcelona, 2.7 GHz     X5650,          2.1 GHz
              2 x 8 cores, 2.6 GHz  Shanghai) and          2 x 6 cores,
                                    Intel Xeon             2.6 GHz
Interconnect  Aries / FDR IB        SeaStar / QDR IB       QDR IB          QDR IB
Cores         11,776 / 9,216        10,864 / 3,648         7,308           360
RAM per core  2/4 GB (plus 16       1/2/8 GB               2/4/8 GB        4 GB
              nodes of 256 GB)
Tflops        244 / 180             102 / 33               95              4
GPU nodes     in Phase 2            8                      88              -
Disc space    2.4 PB                110 / 145 TB           1 PB            0.8 TB

Conclusions
- Performance: in a Gromacs dhfr benchmark (30k atoms, PME), the per-core performance of Sisu is about 2x that of Vuori and Louhi.
- Better interconnects (collective communication) enhance scaling, and the larger memory helps.

Round robin: your research interests and how CSC can help
- Queue length: are 3 days enough? Do you run codes that can't checkpoint, or have other reasons for longer runs?
- Is memory an issue for you? What should the usage policy of the 256 GB nodes be?
- Special Grand Challenge needs?
- Do you need to move a lot of files, and from where?
- Do you need GPGPUs or MICs? For which codes?

ARCHIVE dos and don'ts
- Don't put small files in $ARCHIVE: small files waste capacity, and less than 10 MB is small. Keep the number of files small: tar and bzip files first (see the sketch at the end of this section).
- Time to retrieve any file is about 3 min + 1 s per 10 MB.
- Don't use ARCHIVE for incremental backup (store, delete, overwrite, store): space on tape is not freed up for months or years.
- Maximum file size 300 GB; default quota 2 TB per user (on the new system likely up to 5 TB).
- A new ARCHIVE is being installed: consider whether you really need all your old files, since a transfer from the old system to the new one is needed.

Use profiles
- Taito (HP): serial and parallel jobs up to about 256 cores (TBD).
- Sisu (Cray XC30): parallel jobs up to thousands of cores; scaling tests.

Queue server policies
- The longrun queue has drawbacks; shorter jobs can be chained instead. Which of your applications can't restart or write checkpoints? Which codes do you use for very long runs?
- Large-memory jobs go to Hippu or the HP big-memory nodes; think about memory consumption.
- Minimum job size on the Cray: your input is welcome.

Documentation and support
- The user manual is being built; FAQ: https://datakeskus.csc.fi/en/web/guest/faq-knowledge-base
- Pilot usage during the acceptance tests.
- User documentation link collection: http://www.csc.fi/english/research/sciences/chemistry/intro
- First HP workshop materials: http://www.csc.fi/english/csc/courses/archive/taito-workshop

Porting project: all code needs to be recompiled; help is available for porting your code.
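
The bundling advice from the $ARCHIVE section above, as a sketch (directory names are placeholders, and it is assumed that $ARCHIVE is mounted as an ordinary directory):

    # One compressed archive instead of thousands of small files
    tar -cjf results_2012.tar.bz2 results_2012/    # -j = bzip2 compression
    cp results_2012.tar.bz2 $ARCHIVE/

    # Rough retrieval-time estimate from the figures above:
    # ~3 min tape latency + ~1 s per 10 MB, so a 1 GB archive takes ~5 min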

Customer training
- Taito (HP) cluster workshop 28-30.11.2012 (materials available); the next one likely in January 2013.
- Sisu (Cray): February 26 - March 1, 2013, mostly for pilot users but open to everyone; May 14-17, 2013 for all users - a PATC course, i.e. participants from other countries are expected too.

How to prepare for the new systems (a sketch of these steps follows after the list)
- Participate in the system workshops.
- Try the Intel or GNU compilers in advance (PGI upon request).
- Check whether your scripts and aliases need fixing for bash.
- A lot of resources will be available in the beginning: prepare ahead what to run.
- The traditional wisdom about good application performance still holds: experiment with all the compilers and pay attention to finding good optimization flags; employ tuned numerical libraries wherever possible; experiment with the settings of the environment variables that control the MPI library; and mind the I/O - minimize output and checkpoint seldom.
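
A sketch of those preparation steps (the compiler flags are generic; the MPI environment variable shown is an Intel MPI example, and the right tuning knobs depend on the library actually installed):

    # Try both compiler suites in advance and compare the results
    ifort    -O2 -xHost        bench.f90 -o bench_intel
    gfortran -O2 -march=native bench.f90 -o bench_gnu

    # Look for csh/tcsh-isms in old scripts now that the default shell is bash
    grep -rl 'setenv' ~/scripts/

    # MPI libraries are tuned via environment variables, e.g. Intel MPI:
    export I_MPI_FABRICS=shm:ofa        # fabric selection (name is library-specific)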
