Accelerating with Linux on POWER and DB2 BLU
Demo Days - Segrate, 18 May 2015

Ernesto Beneduce, Client Technical Architect, Big Data & Analytics, IBM Systems, [email protected]
Michele Benedetti, DB2 for LUW and Cloudant Specialist, IBM Software Group, [email protected]

© 2015 IBM Corporation

The modern analytics scenario

IBM infrastructure solutions for Big Data

IBM Solution for Hadoop – Power Systems Edition
• High performance, flexible tuning & configuration, wide storage options
• Components: S822L servers, JBOD enclosure, InfoSphere BigInsights
http://www-03.ibm.com/systems/power/solutions/bigdataanalytics/hadoop/

IBM Data Engine for Analytics – Power Systems Edition
• Customized “appliance-like” infrastructure solution with storage-dense options
http://www-03.ibm.com/systems/power/solutions/bigdata-analytics/data-engine/

Hadoop on System z
The reliability of the mainframe, the agility of Hadoop
• IBM InfoSphere BigInsights for Linux on System z
• IBM InfoSphere System Connector for Hadoop
• DB2 z/OS V11 integration with BigInsights:
  • Invoke JSON queries via JAQL from a DB2 UDF
  • Retrieve data from Hadoop via HDFS_READ
http://www-01.ibm.com/software/os/systemz/biginsightsz/

IBM infrastructure solutions for Analytics

IBM Solution for Analytics – Designed for COGNOS/SPSS
• High performance, flexible tuning & configuration, wide storage options

IBM DB2 BLU on POWER Linux & AIX
• Optimized hardware configurations for BI & reporting analytics workloads

IBM Data Engine for NoSQL - Power Systems Edition
• Extreme consolidation and significant cost savings for in-memory NoSQL data stores

* The DB2 lab is working on DB2 BLU Acceleration on Little Endian Power Linux

IBM Database Management: DBMS and more
• DB2 for z/OS: RDBMS for mainframe environments
• DB2 for LUW: RDBMS for distributed environments
• IMS: hierarchical DBMS for mainframe environments
• IDS (Informix Dynamic Server): RDBMS for distributed environments
• DB2 for i: integrated RDBMS for Power Systems with IBM i
• DB2 Connect: support for distributed RDBMS environments
• Database tools: available for DB2 LUW, DB2 z/OS and IMS

High level positioning: DB2 Family
• DB2 z/OS: Resilience, Availability, Security, Incumbency, Consolidation Platform, Foothold in key Industries (Banking, Insurance, etc.)
• DB2 LUW: XML, Warehousing, SAP, Certain ISVs (AMDOCs, Temenos, etc.), OLTP (Note: TPC and SAP Benchmarks), Compression, pSeries synergy, Consolidation platform, Migration target, Filenet, Websphere, Tivoli

DB2 for LUW value: flexibility

Different architectures to better address specific workloads:
• DB2 “Classic” SMP
• DB2 pureScale: “Shared-Disk” architecture inspired by z/OS. Base for the PureSystem “PureData for Transactions”
• DB2 BLU Acceleration: in-memory native column store. Coexists with the row store (SMP)
• DB2 Parallel for DW: “Shared-Nothing” architecture, the foundation for “IBM InfoSphere Warehouse” and the PureSystem “PureData for Operational Analytics”

Different editions for different workload volumes, multiplatform (Linux, Unix, Windows), same code base:
Express-C, Express, Personal, Workgroup, Advanced Workgroup, Enterprise, Advanced Enterprise, Database Enterprise Developer Edition

Different types of workload and different service level requirements are best addressed by adopting the most appropriate underlying architecture for the DBMS.

DB2 BLU Acceleration: Simplification of Analytic Operations

Traditional warehouse database design and tuning:
1. Decide on partition strategies
2. Select compression strategy
3. Create table
4. Load data
5. Create auxiliary performance structures
   • Materialized views
   • Create indexes
     • B+ indexes
     • Bitmap indexes
6. Tune memory
7. Tune I/O
8. Add optimizer hints
9. Statistics collection
Repeat

AFTER: DB2 with BLU Acceleration
1. Create table
2. Load data
Create, Load, GO!

IBM DB2 with BLU Acceleration: technical keypoints
• Next Generation In-Memory: in-memory columnar processing with dynamic movement of data from storage
• Actionable Compression: patented compression that preserves order, so data can be used without decompressing
• Parallel Vector Processing: multi-core and SIMD (Single Instruction Multiple Data) parallelism
• Data Skipping: skips unnecessary processing of irrelevant data

DB2 BLU Acceleration runs Oracle Code
• Oracle compatibility with BLU Acceleration
• Built-in PL/SQL compiler
• Source-level debugging and profiling
[Diagram labels: Data Studio (Editor, Debugger, Profiler); DB2 Server (PL/SQL Compiler, SQL PL Compiler, SURE (SQL Unified Runtime Engine), Database)]

DB2 BLU Shadow Tables for Reporting on OLTP
• Faster OLTP: fewer indexes. Dramatic reduction in indexes on the row table.
• Faster reporting: BLU Acceleration! 10X-40X faster.
• Dual representation. Data is stored as both row and column. The best of both worlds.
• No application change. The database query compiler decides which format to access. Fully automated.
• Small memory needs.
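
To make two of the technical keypoints listed earlier more concrete (compression that preserves order, and data skipping), here is a minimal Python sketch. It is not DB2 code and it does not claim to show how BLU Acceleration works internally: the function names, the block size and the sample values are all assumptions made for illustration. It only shows why an order-preserving encoding lets a range predicate be evaluated on encoded values, and how a per-block min/max summary lets whole blocks be skipped.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Assign each distinct value an integer code that preserves sort order."""
    return {v: code for code, v in enumerate(sorted(set(values)))}


def encode_column(values, dictionary, block_size):
    """Encode a column and split it into blocks, each with a min/max summary."""
    codes = [dictionary[v] for v in values]
    blocks = []
    for start in range(0, len(codes), block_size):
        chunk = codes[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "codes": chunk})
    return blocks


def count_greater_equal(blocks, dictionary, threshold):
    """Count rows with value >= threshold without decoding any row."""
    ordered = sorted(dictionary)  # distinct values, in the same order as their codes
    # Smallest code whose value is >= threshold (order is preserved by the encoding).
    code_threshold = bisect_left(ordered, threshold)
    matches, scanned = 0, 0
    for block in blocks:
        if block["max"] < code_threshold:
            continue  # data skipping: no row in this block can qualify
        scanned += 1
        matches += sum(1 for c in block["codes"] if c >= code_threshold)
    return matches, scanned


# Hypothetical sample column, stored in three blocks of four values.
sales = [10, 12, 11, 15, 40, 42, 41, 45, 90, 95, 91, 99]
dictionary = build_dictionary(sales)
blocks = encode_column(sales, dictionary, block_size=4)
matches, scanned = count_greater_equal(blocks, dictionary, threshold=90)
print(matches, scanned)
```

In this toy data set the predicate "value >= 90" only needs to look inside one of the three blocks; the other two are ruled out by their summaries alone.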

BLU and P8: Optimization of the entire hardware stack

In-Memory Optimized
• Memory latency optimized for scans, joins and aggregation
• More useful data in memory: data stays compressed, scan-friendly caching
• Less to put in memory: columnar access, late materialization, data skipping

CPU Optimized
• CPU acceleration: SIMD processing for scans, joins, grouping and arithmetic
• Keeping the CPUs busy: core-friendly parallelism
• Less CPU processing: operate on compressed data, late materialization, data skipping

I/O Optimized
• Less to read: columnar I/O, data skipping, late materialization
• Read less often: scan-friendly caching
• Efficient I/O: specialized columnar prefetching algorithm

Typical analytics scenario
[Diagram labels: DWH server and/or operational DB, service bus, ETL, DWH server with DWH db, existing analytics applications server, BI users]

DWH evolution with DB2 BLU
[Diagram labels: DWH server and/or operational DB, service bus, ETL, DWH db, new DWH server with DB2 BLU, existing analytics applications (all custom SQL + BI tool with DB2 AWSE support), simplified ETL, RDBMS federation, BI users]

Business Analytics Accelerator on POWER 8

What is it?
A soft-bundled appliance made up of POWER8 + DB2 with BLU Acceleration (plus 5 Cognos licenses for non-Cognos customers).

Speed matters
It provides high-speed, real-time analytics capabilities that let companies create brand-new analytics models, with an optimized and efficient information lifecycle and faster, better business insight.

What about my current data and DWHs?
Most customers already “extract” the data from operational systems and send it to the DWH repository. The Business Analytics Accelerator is “load and go”: the customer just has to point the data at it, and it does the turbocharging automatically.

* Registration of valid IBM SWG licenses required prior to server shipment with selected options.
(http://www.ibm.com/systems/power/hardware/solutioneditions/registration.html)

DB2 BLU on POWER Linux Performance Demo
• Description of the test environment
• Details on the types of test
• Characteristics and configuration of the DB2 instances
• Performance tests: load injection tools, query execution and evaluation of the metrics

Test environment: characteristics
• The tests will run on POWER8 hardware, an S822L, in an LPAR with 8 virtual cores and 32 GB of RAM assigned to it.
• The operating system is Red Hat version 7, Little Endian, under the POWER KVM hypervisor.
• Goal of the tests: to show that DB2 performs better in columnar/in-memory mode than in “classic” mode, which is based on row organization (“row-based”).

DB2 environment: details
• Two separate instances (“db2inst1” and “db2iblu”) host the same data, organized in two distinct databases of 5 GB each: one in standard mode, NOT compressed (“DBROW”), the other in columnar mode, compressed (“DBCOL”).
• Both instances and both databases were created and allocated on disks internal to the S822L server.
• The data model and the queries were taken from the TPC-H benchmark, available on the web site of the Transaction Processing Performance Council.
• Preliminary performance tuning (indexing) was carried out on the “row-based” database, while no tuning at all was done on the columnar database.

Short description of the tests
• The tests consist of running a reduced set of queries, chosen from the TPC-H query set, and measuring their response times.
• The measurements are comparative: the same workload is run in both environments, columnar and row-based, and response times and overall throughput are compared.
• In some significant cases, the data access plans produced by the DB2 optimizer will be shown.
• The load injection and measurement tools are IBM Data Studio and Apache JMeter.

BACKUP SLIDES

IBM BDA offering overview

Hong Kong’s first wireless commercial television station implements social media analytics to increase ratings
Unlock the value of customer sentiment in social media
“While traditional TV ratings research will continue to be important, it must be augmented by social media intelligence.”
-- KC Leung, Senior Manager, Marketing Research and Information Department
TVB will mine more than three decades of program ratings data to understand the trends in media consumption habits.
• IBM Social Media Analytics (SaaS)
• IBM DB2 with BLU Acceleration
• IBM Cognos BI, IBM DataStage
• IBM Power Systems
• IBM Storwize V7000

World-leading manufacturer of sensor solutions gained faster insight into markets and customers
Faster insight into critical data for better business decisions
“We cut report runtimes by up to 98 percent thanks to IBM DB2 with BLU Acceleration (on Power Systems) technology – without changing operations processes or investing in new hardware or software.”
-- Bernhard Herzog, Team Manager Information Technology SAP, Balluff
Achieved 98% faster access to business data, 50% faster SAP ERP response times, 7x faster access to documents, and near real-time access to essential information.
• IBM Power Systems
• IBM PowerVM, PowerHA
• IBM DB2 with BLU Acceleration
• SAP Business Warehouse, ERP
• IBM Storage & Services

Where to learn more about Big Data & Analytics on IBM Power Systems
Open innovation to put data to work across the enterprise
• Start the conversation with your IBM Representative or Business Partner
• Join Power Systems in social media!
• Connect with Power on LinkedIn: bit.ly/poweronlinkedin
• Like us on Facebook: bit.ly/poweronfacebook
• Watch us on YouTube: bit.ly/poweronyoutube
• Follow us on Twitter: #PowerSystems, #OpenPower, #IBMBigData, #IBMAnalytics, #IBMWatson, #IBMBLU, #IBMCognos, #IBMSPSS, #IBMStream, #IBMBigInsights, #BobFriske
www.ibm.com/power