Kexin Rong

I am a Ph.D. student in Computer Science at Stanford University, co-advised by Prof. Peter Bailis and Prof. Philip Levis. I am affiliated with the Stanford DAWN project. I am broadly interested in designing and building systems to enable data analytics at scale, supporting applications such as scientific analysis, infrastructure monitoring, and analytical queries on big-data clusters.

Previously, I received my bachelor's degree in Computer Science from Caltech (2015). I've also spent time at the DMX group at Microsoft Research in Redmond (2019).

Email  /  Google Scholar  /  CV  /  GitHub

Approximate Partition Selection for Big-Data Workloads using Summary Statistics
Kexin Rong, Yao Lu, Peter Bailis, Srikanth Kandula, Philip Levis.
VLDB, 2020

A system that leverages summary statistics to select weighted, partition-level samples to approximate analytical queries on big-data clusters.

Rehashing Kernel Evaluation in High Dimensions
Paris Siminelakis*, Kexin Rong*, Peter Bailis, Moses Charikar, Philip Levis.
ICML, 2019 (Long talk)
[blog] [code] [supplementary]

LSH-based sketching and importance sampling algorithms to accelerate kernel evaluation in high dimensions.

Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science
Kexin Rong, Clara Yoon, Karianne Bergen, Hashem Elezabi, Peter Bailis, Philip Levis, Gregory Beroza.
VLDB, 2018
[blog] [video] [code] [seismology paper]

An unsupervised, end-to-end earthquake detection pipeline based on pairwise similarity search on seismic waveforms.

ASAP: Prioritizing Attention via Time Series Smoothing
Kexin Rong, Peter Bailis.
VLDB, 2017
[Datadog blog] [blog] [demo] [talk] [slides] [code]

An automatic smoothing algorithm for time series visualization that removes short-term fluctuations while preserving large-scale deviations.

MacroBase: Prioritizing Attention in Fast Data
Peter Bailis, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong, Sahaana Suri.
SIGMOD, 2017 (Invited to ACM TODS "Best of SIGMOD 2017" Special Issue.)
[website] [code] [journal paper] [vision paper] [demo paper]

A data analytics engine that highlights and aggregates important and unusual behavior in high-volume fast data streams.

Stanford CS 197: Computer Science Research, Fall 2019 - Teaching Assistant
Stanford CS 161: Design and Analysis of Algorithms, Summer 2018 - Teaching Assistant
Caltech CS 122: Relational Database Implementation, Winter 2015 - Teaching Assistant
Caltech CS 24: Introduction to Computing Systems, Spring 2014 - Teaching Assistant
Caltech CS 1: Introduction to Computer Programming, Fall 2013 - Teaching Assistant
Caltech CS 1: Introduction to Computer Programming, Fall 2012 - Teaching Assistant
