Hardware-conscious data processing systems
Speaker: Holger Pirk, MIT
Location: 60 Fifth Avenue 150
Date: February 3, 2017, 11:30 a.m.
Host: Subhash Khot
Performance engineering, i.e., the process of tuning the implementation of an algorithm for a given set of hardware, application and data characteristics, can reduce query response times of data processing systems from minutes to milliseconds -- it turns long-running jobs into interactive queries. However, when building such systems, performance is often at odds with other factors such as implementation effort, ease of use and maintainability. Well-designed programming abstractions are essential to allow the creation of systems that are fast, easy to use and maintainable.
In my talk, I demonstrate how existing frameworks for high-performance, data-parallel programming fall short of this goal. I argue that the poor performance of many mainstream data processing systems is due to the lack of an appropriate intermediate abstraction layer, i.e., one that allows the hardware- and data-conscious application of state-of-the-art low-level optimizations.
To address this problem, I introduce Voodoo, a data-parallel intermediate language that is abstract enough to allow effective code generation and optimization but low-level enough to express many common optimizations such as parallelization, loop tiling or memory locality optimizations. I demonstrate how we used Voodoo to build a relational data processing system that outperforms the fastest state-of-the-art in-memory database systems by up to five times. I also demonstrate how Voodoo can be used as a performance engineering framework, allowing the expression of many known optimizations and even enabling the discovery of entirely new optimizations.
Holger is a postdoc in the Databases group at MIT CSAIL. He spent his PhD years in the Database Architectures group at CWI in Amsterdam, resulting in a PhD from the University of Amsterdam in 2015. His research interests lie in the optimization of data processing systems for modern hardware. In particular, he studies the efficient use of CPU features such as speculative execution, SIMD or transactional memory, as well as emerging technologies such as GPGPUs and flash memory, for analytical data processing. In addition to new algorithms and optimizations, Holger develops abstractions that allow the effective use of these low-level techniques in data processing systems.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.