Accelerating with Linux on POWER
and DB2 BLU
Demo Days - Segrate
18 May 2015
Ernesto Beneduce
Client Technical Architect, Big Data & Analytics
IBM Systems
Michele Benedetti
DB2 for LUW and Cloudant Specialist
IBM Software Group
© 2015 IBM Corporation
The modern analytics scenario
IBM infrastructure solutions for Big Data
IBM Solution for Hadoop – Power Systems Edition
High performance, flexible tuning & configuration, wide storage options
Components: S822L servers + JBOD enclosure, InfoSphere BigInsights
http://www-03.ibm.com/systems/power/solutions/bigdataanalytics/hadoop/
IBM Data Engine for Analytics – Power Systems Edition
Customized “appliance-like” infrastructure solution with
storage-dense options
http://www-03.ibm.com/systems/power/solutions/bigdata-analytics/data-engine/
Hadoop on System z
The reliability of the mainframe, the agility of Hadoop
IBM InfoSphere BigInsights for Linux on System z
IBM InfoSphere System z Connector for Hadoop
DB2 z/OS V11 integration with BigInsights
• Invoke JSON queries via JAQL from a DB2 UDF
• Retrieve data from Hadoop via HDFS_READ
http://www-01.ibm.com/software/os/systemz/biginsightsz/
IBM infrastructure solutions for Analytics
IBM Solution for Analytics – Designed for COGNOS/SPSS
High performance, flexible tuning & configuration, wide storage
options
IBM DB2 BLU on POWER Linux & AIX
Optimized hardware configurations for BI and reporting analytics workloads.
IBM Data Engine for NoSQL - Power Systems Edition
Extreme consolidation and significant cost savings for in-memory NoSQL data stores
*DB2 lab is working on DB2 BLU Acceleration on Little Endian Power Linux
IBM Database Management
DBMS & C.
• DB2 for z/OS: RDBMS for mainframe environments
• DB2 for LUW: RDBMS for distributed environments
• DB2 for i: Integrated RDBMS for Power Systems with IBM i
• IMS: Hierarchical DBMS for mainframe environments
• IDS (Informix Dynamic Server): RDBMS for distributed environments
• DB2 Connect: Support for distributed RDBMS environments
• Database tools: Available for DB2 LUW, DB2 z/OS, IMS
High level positioning
DB2 Family
DB2 z/OS: Resilience, Availability, Security, Incumbency, Consolidation Platform, Foothold in key Industries (Banking, Insurance, etc.)
DB2 LUW: XML, Warehousing, SAP, Certain ISVs (Amdocs, Temenos, etc.), OLTP (Note: TPC and SAP Benchmarks), Compression, pSeries synergy, Consolidation platform, Migration target, FileNet, WebSphere, Tivoli
DB2 for LUW value: flexibility
Different architectures to better address specific workloads
• DB2 “Classic” SMP
• DB2 pureScale: “shared-disk” architecture inspired by z/OS; base for PureSystem “PureData for Transactions”
• DB2 BLU Acceleration: in-memory native column store; coexists with the row store (SMP)
• DB2 Parallel for DW: “shared-nothing” architecture, the foundation for “IBM InfoSphere Warehouse” and PureSystem “PureData for Operational Analytics”
Editions: Express-C, Express, Personal, Workgroup, Advanced Workgroup, Enterprise, Advanced Enterprise, plus Database Enterprise Developer Edition
Different editions for different workload volumes, multiplatform (Linux, Unix, Windows), same code base
Different types of workload and different service level requirements are best addressed by adopting the most appropriate underlying architecture for the DBMS.
DB2 BLU Acceleration: Simplification of Analytic Operations
Traditional Warehouse
Database Design and Tuning
1. Decide on partition strategies
2. Select Compression Strategy
3. Create Table
4. Load data
5. Create Auxiliary Performance Structures
• Materialized views
• Create indexes
  – B+ indexes
  – Bitmap indexes
6. Tune memory
7. Tune I/O
8. Add Optimizer hints
9. Statistics collection
Repeat
AFTER
DB2 with BLU Acceleration
1. Create Table
2. Load data
Create, Load, GO!
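The two remaining steps can be sketched as follows. This is a minimal illustration that only builds the statement text and runs nothing against a database. It assumes the DB2 10.5 `ORGANIZE BY COLUMN` clause and the CLP `LOAD` command; the table name, columns, and file path are hypothetical.

```python
def create_table_sql(table, columns):
    """Build a CREATE TABLE statement for a column-organized (BLU) table."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE {table} ({cols}) ORGANIZE BY COLUMN"


def load_command(table, path):
    """Build a DB2 CLP LOAD command for a delimited (DEL) input file."""
    return f"LOAD FROM {path} OF DEL REPLACE INTO {table}"


# Hypothetical table and input file, for illustration only.
statements = [
    create_table_sql("SALES", [("SALE_ID", "INTEGER"), ("AMOUNT", "DECIMAL(12,2)")]),
    load_command("SALES", "/tmp/sales.del"),
]
for statement in statements:
    print(statement)
```

No index, materialized view, or hint statements appear in the sketch, which is the point of the "after" column of the slide.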
IBM DB2 with BLU Acceleration: technical key points
• Next Generation In-Memory: in-memory columnar processing with dynamic movement of data from storage
• Actionable Compression: patented compression that preserves order, so data can be used without decompressing
• Parallel Vector Processing: multi-core and SIMD (Single Instruction Multiple Data) parallelism
• Data Skipping: skips unnecessary processing of irrelevant data
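To see why a compression that preserves order lets predicates run on compressed data, consider the sketch below. It is a simplified stand-in, not IBM's patented algorithm: it assumes a plain sorted dictionary, so that comparing integer codes gives the same answer as comparing the original values. The column contents and function names are hypothetical.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Assign integer codes in sorted order, so code order matches value order."""
    ordered = sorted(set(values))
    return ordered, {value: code for code, value in enumerate(ordered)}


def encode(values, mapping):
    """Replace each value with its dictionary code."""
    return [mapping[value] for value in values]


def count_less_than(codes, ordered, bound):
    """Evaluate `value < bound` directly on the codes, without decoding them."""
    # Every value whose code is below bound_code is smaller than bound.
    bound_code = bisect_left(ordered, bound)
    return sum(1 for code in codes if code < bound_code)


column = ["Milan", "Rome", "Bari", "Rome", "Turin", "Bari"]
ordered, mapping = build_dictionary(column)
codes = encode(column, mapping)
print(codes)
print(count_less_than(codes, ordered, "Rome"))
```

The bound is translated to a code once, and the scan then compares small integers only, which is also the kind of work that SIMD instructions handle well.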
DB2 BLU Acceleration runs Oracle Code
• Oracle compatibility with BLU Acceleration
• Built-in PL/SQL compiler
• Source-level debugging and profiling
Figure: Data Studio (Editor, Debugger, Profiler); DB2 Server (PL/SQL Compiler, SQL PL Compiler, SURE: SQL Unified Runtime Engine); Database (BLU Inside)
DB2 BLU Shadow Tables for Reporting on OLTP
• Faster OLTP – fewer indexes
• Dramatic reduction in indexes on the row table
• Faster Reporting – BLU Acceleration!
• 10X-40X faster.
• Dual representation. Data stored as both row and column. The best of both worlds.
• No application change. The database query compiler decides which format to access. Fully automated.
• Small memory needs.
Figure: the Sales table in row-organized and column-organized form (BLU Inside)
BLU and P8: Optimization of the entire hardware stack
In-Memory Optimized
• Memory latency optimized for
  – Scans
  – Joins
  – Aggregation
• More useful data in memory
  – Data stays compressed
  – Scan friendly caching
• Less to put in memory
  – Columnar access
  – Late materialization
  – Data skipping
CPU Optimized
• CPU acceleration
  – SIMD processing for scans, joins, grouping, arithmetic
• Keeping the CPUs busy
  – Core friendly parallelism
• Less CPU processing
  – Operate on compressed data
  – Late materialization
  – Data skipping
I/O Optimized
• Less to read
  – Columnar I/O
  – Data skipping
  – Late materialization
• Read less often
  – Scan friendly caching
• Efficient I/O
  – Specialized columnar prefetching algorithm
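Data skipping appears in all three columns above. One common way to implement the idea is a small synopsis that stores the minimum and maximum value of each block of a column, so that whole blocks can be ruled out before they are read. The sketch below assumes that approach and is not a description of DB2 internals; the block size, data, and function names are hypothetical.

```python
def build_synopsis(column, block_size):
    """Record (start, end, min, max) for each block of the column."""
    synopsis = []
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        synopsis.append((start, start + len(block), min(block), max(block)))
    return synopsis


def scan_greater_than(column, synopsis, threshold):
    """Return the matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for start, end, low, high in synopsis:
        if high <= threshold:
            continue  # no value in this block can match, so skip it
        blocks_read += 1
        matches.extend(v for v in column[start:end] if v > threshold)
    return matches, blocks_read


amounts = [3, 5, 4, 1, 40, 42, 41, 45, 7, 6, 9, 8]
synopsis = build_synopsis(amounts, block_size=4)
matches, blocks_read = scan_greater_than(amounts, synopsis, 30)
print(matches, blocks_read)
```

With this data only one of the three blocks is read. The benefit depends on how well the values cluster: a column with large and small values mixed in every block would skip nothing.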
Typical analytics scenario
Figure labels: Server DWH and/or Operational DB (DWH db); Service Bus; ETL; DWH Server; Existing Analytics Applications; BI Users
DWH evolution with DB2 BLU
Figure labels: Server DWH and/or Operational DB (DWH db); Service Bus; ETL; Simplified ETL, RDBMS Federation; New DWH Server (DWH DB2 BLU); Existing Analytics Applications (all custom SQL + BI tool with DB2 AWSE support); BI Users
Business Analytics Accelerator on POWER 8
What is it?
A soft-bundled appliance, made up of POWER8 + DB2 with
BLU Acceleration (plus 5 Cognos licenses for non-Cognos
customers).
Speed Matters
It provides high-speed, real-time analytics capabilities
that enable companies to create brand-new analytics models,
with an optimized and efficient information lifecycle
and faster, better business insight.
What about my current data and DWHs?
Most customers already “extract” the data from operational
systems and send it to the DWH repository.
The Business Analytics Accelerator is “load and go”: the
customer just points the data at it, and it does the
turbocharging automatically.
* Registration of valid IBM SWG licenses required prior to server shipment with selected options.
(http://www.ibm.com/systems/power/hardware/solutioneditions/registration.html)
DB2 BLU on POWER Linux
Performance Demo
Description of the test environment
Details on the types of tests
Characteristics and configuration of the DB2 instances
Performance tests: load-injection tools, query execution, and metrics evaluation
Test environment: characteristics
The tests will run on POWER8 hardware, an S822L server, in an LPAR
assigned 8 virtual cores and 32 GB of RAM
The operating system is Red Hat version 7, Little Endian, under the
PowerKVM hypervisor
Purpose of the tests: to demonstrate the better performance of DB2 in
columnar/in-memory mode compared with the “classic” mode, based on
row organization (“row-based”)
DB2 environment: details
Two separate instances (“db2inst1” and “db2iblu”) host the same
data, organized in two distinct 5 GB databases: one in standard
mode, NOT compressed (“DBROW”), the other in columnar mode,
compressed (“DBCOL”)
Both instances and both databases are created and allocated on
disks internal to the S822L server
The data model and the queries were taken from the TPC-H benchmark,
available on the website of the Transaction Processing Performance Council
Preliminary performance tuning (indexing) was carried out on the
“row-based” database, while no tuning at all was performed on the
columnar database
Brief description of the tests
The tests consist of running a reduced set of queries, chosen from those
provided by the TPC-H query set, and measuring their response times
The measurements will be comparative: the same workload will be run
in both environments, columnar and row-based, and the response times
and overall throughput will be compared
In some significant cases, the data access plans produced by the
DB2 optimizer will be shown
The load-injection and measurement tools will be IBM
Data Studio and Apache JMeter
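The comparison of response times and throughput between the two environments can be sketched as below. The timings are hypothetical placeholders invented for the example, not results from this demo, and the query labels are only an assumed selection from the TPC-H query set. In practice the numbers would come from the measurement tools named above.

```python
def speedup(row_seconds, col_seconds):
    """How many times faster the columnar run is than the row-based run."""
    return row_seconds / col_seconds


def throughput(num_queries, total_seconds):
    """Queries completed per second when the queries run back to back."""
    return num_queries / total_seconds


# HYPOTHETICAL timings in seconds: (row-based, columnar). Not measured data.
timings = {"Q1": (12.0, 1.5), "Q6": (8.0, 0.5), "Q14": (10.0, 2.0)}

speedups = {q: speedup(row, col) for q, (row, col) in timings.items()}
row_total = sum(row for row, _ in timings.values())
col_total = sum(col for _, col in timings.values())
row_qps = throughput(len(timings), row_total)
col_qps = throughput(len(timings), col_total)

for query, factor in speedups.items():
    print(f"{query}: {factor:.1f}x")
print(f"row-based: {row_qps:.2f} queries/s, columnar: {col_qps:.2f} queries/s")
```

Reporting both per-query speedup and overall throughput matters because a single very fast query can hide others that gain little.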
BACKUP SLIDES
IBM BDA offering overview
Hong Kong’s first wireless commercial television station
implements social media analytics to increase ratings
Unlock the value of customer sentiment in social media
“While traditional TV ratings research will continue to be important, it must be augmented by social media intelligence.”
-- KC Leung, Senior Manager, Marketing Research and Information Department,
TVB will mine more than three decades of program ratings data to understand the trends of media consumption habits.
• IBM Social Media Analytics (SaaS)
• IBM DB2 with BLU Acceleration
• IBM Cognos BI, IBM DataStage
• IBM Power Systems
• IBM Storwize V7000
World-leading manufacturer of sensor solutions gained
faster insight into markets and customers
Faster insight into critical data
for better business decisions
“We cut report runtimes by up to 98 percent
thanks to IBM DB2 with BLU Acceleration (on
Power Systems) technology – without
changing operations processes or investing in
new hardware or software.”
-- Bernhard Herzog, Team Manager
Information Technology SAP, Balluff.
Achieved 98% faster access to business
data, 50% faster SAP ERP response times,
7x faster access to documents, and near
real-time access to essential information.
• IBM Power Systems
• IBM PowerVM, PowerHA
• IBM DB2 with BLU Acceleration
• SAP Business Warehouse, ERP
• IBM Storage & Services
Where to learn more about Big Data & Analytics
on IBM Power Systems
Start the conversation with your IBM
Representative or Business Partner
Open innovation to
put data to work
across the enterprise
Join Power Systems in social media!
Connect with Power on LinkedIn: bit.ly/poweronlinkedin
Like us on Facebook: bit.ly/poweronfacebook
Watch us on YouTube: bit.ly/poweronyoutube
Follow us on Twitter: #PowerSystems, #OpenPower,
#IBMBigData, #IBMAnalytics, #IBMWatson, #IBMBLU,
#IBMCognos, #IBMSPSS, #IBMStream, #IBMBigInsights,
#BobFriske
www.ibm.com/power