Servers Outlook
Server for INFN – End 2007
Outlook
Barcelona
October 3rd 2007
michele michelotto - CCR
Dual core
• AMD: Opteron 22xx
• The base series costs slightly less than the low-power (HE) series
• The high-power (SE) series costs the same as the standard one but extends the clock range
[Figure: Opteron price (US Dollar 1000s, 0 to 1,000) vs. clock (1800 to 3200 MHz); series: 22xx HE, 22xx, 22xx SE]
4-core Barcelona
• Prices double at the same clock
• Very low initial clocks (yield problems?)
[Figure: Opteron price (US Dollar 1000s, 0 to 1,000) vs. clock (1700 to 3200 MHz); series: 22xx HE, 22xx, 22xx SE, 4-core 23xx HE, 4-core 23xx]
4-core Barcelona
• Price per clock per core
• It pays to buy low clocks and 4 cores
• (as long as performance keeps scaling with the load)
[Figure: Opteron price/core/clock (US Dollar 1000s, 0 to 0.160) vs. clock (1700 to 3200 MHz); series: 22xx HE, 22xx, 22xx SE, 4-core 23xx HE, 4-core 23xx]
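The price per core per clock metric plotted here can be reproduced with a one-line formula. The sketch below is only an illustration: the function name and the two example prices are hypothetical and are not read off the chart.

```python
def price_per_core_per_mhz(price_usd: float, cores: int, clock_mhz: float) -> float:
    """Price divided by (number of cores x clock in MHz)."""
    return price_usd / (cores * clock_mhz)


# Hypothetical prices for illustration only; they are not the plotted values.
EXAMPLES = [
    ("dual-core, 2600 MHz", 520.0, 2, 2600.0),
    ("quad-core, 2000 MHz", 400.0, 4, 2000.0),
]

for label, price, cores, clock in EXAMPLES:
    value = price_per_core_per_mhz(price, cores, clock)
    print(f"{label}: {value:.3f} USD per core per MHz")
```

With these assumed prices the quad-core part comes out at half the cost per core per MHz, which is the kind of gap the slide points to. The comparison only holds if the workload actually scales across the extra cores.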
Current prices
• Worker nodes "by the dozen"
– It pays to buy dual-processor machines with 4-core CPUs, with 2 GB per core
– I do not yet know the prices of machines with Barcelona
– The twins are interesting: two blades in a 1U pizza box
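As a quick sizing check for the worker-node recipe above (dual processor, 4 cores each, 2 GB per core), the following sketch computes cores and RAM per node and per 1U twin chassis. The function name is hypothetical, and it is an assumption that each of the two blades in a twin is itself a dual-socket 4-core node.

```python
def node_resources(sockets: int = 2, cores_per_socket: int = 4, gb_per_core: int = 2):
    """Return (cores, RAM in GB) for one worker node."""
    cores = sockets * cores_per_socket
    return cores, cores * gb_per_core


cores, ram_gb = node_resources()

# Assumption: a 1U twin chassis holds two such dual-socket nodes.
NODES_PER_1U_TWIN = 2

print(f"per node:    {cores} cores, {ram_gb} GB RAM")
print(f"per 1U twin: {cores * NODES_PER_1U_TWIN} cores, {ram_gb * NODES_PER_1U_TWIN} GB RAM")
```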
Tick Tock
The 4-core CPUs
• Clovertown 2007, Barcelona Q4-07, Penryn Q1-08
Next Tock
• After the Penryn tick, the Nehalem tock arrives in 2009
• The number of cores doubles again
• Up to two threads per core
• Intel QuickPath: similar to AMD HyperTransport?
• Integrated memory controller, as in AMD
2 sockets – 8 cores – 16 logical CPUs
Improvements at 45 nm
Test chips
Server
Desktop
• For desktops and workstations
• A desktop PC with one 4-core CPU delivers the same throughput as a dual-socket dual-core machine (pythia, cms, root)
SUN + K10
• SUN X4600 M2
• 8 blades with Barcelona
• 32 cores in total
• 256 GB DDR2 per chassis
• 4 hot-swap power supplies
• 6 PCIe x8 slots
• Below: the small version with "only" 4 processors
AMD's future
• 2008: Shanghai at 45 nm (larger caches)
• 2009: 8-core version, still at 45 nm
1H 08 ?
Transactional load
• Machines with the new quad-core processors keep improving as the load grows
• Slight dip for Clovertown
• The only dual-core, by contrast, starts to saturate
Power consumption
• Remarkable improvement of the Xeon 5472 over the Xeon 5365
• Same clock, but a move from 65 nm to 45 nm
• Barcelona draws less power, but at 2.0 GHz
Perf/watt
• Barcelona at 65 nm performs like the Xeon at 45 nm
• NB: in this particular benchmark
Idle power consumption
• AMD CPUs are very frugal at idle
AMD prefers gcc
Cache effects
• Despite the larger caches, the latency of the 54xx has improved
• Impressive difference in bandwidth between 4 MB and 64 MB working sets
• If your application often sits in this region, you will see noticeable differences in performance
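To see roughly where an application's working set falls relative to the caches, one can time a plain memory copy at increasing buffer sizes. The sketch below is a coarse illustration, not the benchmark behind the plots: the function name and the size list are assumptions, and interpreter overhead and timer resolution blur the numbers, so only the trend across sizes is meaningful.

```python
import time


def copy_bandwidth_mb_s(size_bytes: int, min_seconds: float = 0.05) -> float:
    """Rough copy bandwidth (MB/s) for a working set of size_bytes."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    copies = 0
    start = time.perf_counter()
    elapsed = 0.0
    # Repeat the copy until enough time has passed for a stable estimate.
    while elapsed < min_seconds:
        dst[:] = src  # same-size slice assignment: copies src into dst in place
        copies += 1
        elapsed = time.perf_counter() - start
    return copies * size_bytes / elapsed / 1e6


if __name__ == "__main__":
    for mb in (1, 2, 4, 8, 16, 32, 64):
        rate = copy_bandwidth_mb_s(mb * 1024 * 1024)
        print(f"{mb:3d} MB buffer: {rate:10.0f} MB/s")
```

On most machines the rate drops once the two buffers no longer fit in the last-level cache, which is the region the slide warns about.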
Memory: Intel vs AMD
• Similar access times
• At 1 GB (the average footprint of HEP programs) the new AMDs give better results
• But the new Xeon 54xx improve considerably on the 53xx
Memory: Intel and AMD
• It depends on where you measure, so it depends on where your application works
Memory power consumption
• Intel systems use FB-DIMM memory, which draws more power
High-K dielectrics
• Hafnium oxide is used in the gate
• A higher dielectric constant implies lower leakage currents (the gate insulator can be physically thicker for the same capacitance)
DC power supply
In the UPS the power is converted to DC, in parallel with the batteries
Then it is converted back to AC towards the power distribution unit
It enters the rack as AC, but the power supply converts it to DC again
And once more into the various DC voltages
Then there are losses in the voltage regulators
Only 48% reaches the machine (fans, disks, CPU, memory)
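The end-to-end figure is the product of the efficiencies of each conversion stage, so a handful of individually reasonable stages is enough to lose about half the power. The sketch below illustrates this. The per-stage efficiencies are hypothetical values chosen for the example, not measurements from the slide.

```python
from math import prod


def end_to_end_efficiency(stage_efficiencies) -> float:
    """Overall efficiency of conversion stages connected in series."""
    return prod(stage_efficiencies)


# Hypothetical per-stage efficiencies, for illustration only.
STAGES = {
    "UPS: AC to DC": 0.90,
    "UPS: DC back to AC": 0.90,
    "power supply: AC to DC": 0.85,
    "power supply: DC to DC rails": 0.90,
    "voltage regulators": 0.80,
}

overall = end_to_end_efficiency(STAGES.values())
print(f"power reaching the load: {overall:.1%}")
```

With these assumed values the chain delivers roughly half of the input power. Removing any stage from the dictionary shows how much each conversion costs.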
DC power supply demo
Data center
DRAM memory
CCR, October 3rd 2007
and beyond
FC over Ethernet
• Encapsulates FC frames in Ethernet frames, using data-center Ethernet
– Congestion notification 802.1Qau
– Shortest path bridging 802.1aq
– Virtual Links 802.1Q
– Priority-based flow control
– Data Center enhancement in 2009
• Manages to avoid TCP (checksum, connection, etc.)
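The encapsulation idea can be sketched in a few lines: the FC frame travels as the payload of an Ethernet frame tagged with the FCoE EtherType (0x8906), with no IP or TCP layer in between. This is a deliberately simplified illustration: the function names are hypothetical, and the real FCoE header fields (version, start-of-frame and end-of-frame delimiters), padding and the Ethernet FCS are omitted.

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType assigned to FCoE
ETH_HEADER = struct.Struct("!6s6sH")  # destination MAC, source MAC, EtherType


def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap an FC frame in a bare Ethernet header (simplified, no FCoE header)."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    return ETH_HEADER.pack(dst_mac, src_mac, FCOE_ETHERTYPE) + fc_frame


def decapsulate(eth_frame: bytes) -> bytes:
    """Return the FC frame carried by an Ethernet frame built by encapsulate()."""
    _dst, _src, ethertype = ETH_HEADER.unpack(eth_frame[:ETH_HEADER.size])
    if ethertype != FCOE_ETHERTYPE:
        raise ValueError("not an FCoE frame")
    return eth_frame[ETH_HEADER.size:]


if __name__ == "__main__":
    fc = b"example FC frame bytes"  # placeholder payload, not a valid FC frame
    frame = encapsulate(b"\x02\x00\x00\x00\x00\x01", b"\x02\x00\x00\x00\x00\x02", fc)
    print(len(frame), decapsulate(frame) == fc)
```

Because there is no TCP underneath, lossless delivery has to come from the Ethernet layer itself, which is why the flow-control and congestion-notification extensions listed above matter.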
• Up to 1 Terabyte of non-volatile DDRRAM in 24U.
• Unlimited overall capacity.
• Over 3.2 million random I/O requests per second.
• Over 24 GB/second of random sustainable data bandwidth.
• Up to 512 physical LUNs.
• Requires 2,500 watts of power.
• Up to 8 independent non-volatile solid state disk (SSD) modules.
• Each SSD module is a RamSan-400, including 128 GB of DDRRAM and up to eight 4-Gbit Fibre Channel connections or four 4x InfiniBand ports.
Solid state disk
• 1.8-inch 64 GB SSD
• 2.5-inch 128GB SSD
• 2.5-inch 256 GB SSD
3.5 inch SSD
• SimpleTech has released a 256 GB 3.5″ SSD and announced a 512 GB model for Q3 this year
• The drives are called Zeus-IOPS and currently offer the largest capacity among 3.5-inch SSDs on the market
• The drives are also extremely fast but, unfortunately, very expensive. The 512 GB model is expected to cost about $3,000 at launch and may fall to $1,000 in a few years, but by then we will probably have multi-terabyte drives.
• 200x the performance of a 15,000 rpm drive.
640 GB disk on PCI
• PCI Express x4 is used. The transfer speed is close to 1 GB/s, so the whole 640 GB drive can be read in some 11 minutes! And how long did we wait for hard disks to break the 1 TB limit?
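The 11-minute figure follows directly from capacity divided by transfer rate. A minimal check, taking the "close to 1 GByte/sec" rate as exactly 1 GB/s (an assumption) and using a hypothetical helper name:

```python
def full_read_minutes(capacity_gb: float, rate_gb_per_s: float) -> float:
    """Time to read the whole device once, in minutes."""
    return capacity_gb / rate_gb_per_s / 60


print(f"{full_read_minutes(640, 1.0):.1f} minutes")  # prints "10.7 minutes"
```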