EECS Faculty Candidate Seminar: Dr. Jia Zou
PlinyCompute: Connecting Programming, Computation, and Storage for Big Data Analytics
Dr. Jia Zou, Rice University
11:00am, Thursday, February 28
Min H. Kao room 435
Users want Big Data analytics systems that provide interactive-speed ad-hoc query processing and short training times for machine learning, but existing systems often fall short of these goals. In this talk, I identify two reasons for this. First, such systems are heavily layered, with many separate software components working together: a distributed file system, an in-memory file system, the JVM, and the computational system itself. Communication across layers leads to inefficiencies. Second, it is difficult to automatically optimize computations that reside in opaque user code, such as user-defined functions (UDFs).
I will describe my work aimed at solving these problems. First, I will present a novel declarative programming interface, based on lambda calculus, that forces programmers to expose intent and compiles into a standalone intermediate representation of computations that facilitates relational-style query optimization and automatic data placement. Second, I will describe a novel storage system that avoids the layering overhead by pushing down analytics computations and managing all analytics data on disk and in memory in a monolithic distributed system. Finally, I will describe my ongoing work and future research plans for building a novel D3 big data analytics platform to provide Declarative programming, Deterministic performance, and Dynamic interaction with edge devices, humans in the loop, and environment simulators.
Jia Zou is a Research Scientist in the Department of Computer Science at Rice University. Prior to joining Rice in 2015, she worked at IBM Research China as a Research Staff Member. She received her Ph.D. degree from Tsinghua University, China, in 2008. Her research investigates and builds high-performance, scalable systems for Big Data management and analytics, which has led to an open source system called PlinyCompute and publications in top Big Data management venues, including VLDB and SIGMOD. She mentors undergraduate, graduate, and high school students in their research. She has also served as a TPC member of Cluster 2018 and has reviewed more than 40 papers for journals including IEEE Transactions on Parallel and Distributed Systems (TPDS) and IEEE Transactions on Knowledge and Data Engineering (TKDE).
Thursday, February 28, 2019 at 11:00am to 12:00pm
Min H. Kao Electrical Engineering and Computer Science, 435
1520 Middle Drive, Knoxville, TN 37996