Welcome!

The world is increasingly driven by data, but turning large amounts of raw data into a few actionable insights often requires a team of well-trained engineers. The goals of the D2I lab (part of the Georgia Tech database group) are to 1) design and build systems and tools that simplify data science at scale for non-experts and 2) train world-class researchers.

Our research focuses on building high-performance, user-friendly data systems to enable next-generation AI applications. Current research directions include:

  • Analytics over unstructured data
    • Video Labeling (e.g., SketchQL)
    • Image Understanding (e.g., VCR)
  • Data Infrastructure for AI
    • Data preprocessing and cleaning for ML (e.g., LOTUS, DiffPrep)
    • Checkpointing for DL training (e.g., Inshrinkerator)
    • Vector databases and RAG systems

News

  • [Oct 2024] Our paper CanDE has been accepted to IEEE BigData’24.
  • [Sep 2024] Our paper Inshrinkerator has been accepted to SoCC’24.
  • [Sep 2024] Our paper Lotus won a 🏆 Best Paper Nomination in IISWC’24! Congrats Rajveer!
  • [July 2024] Our paper Lotus has been accepted to IISWC’24.
  • [July 2024] Dristi Shah received a VLDB 2024 Travel Award. Congrats Dristi!
  • [June 2024] Congrats to Peng Li for winning the 🏆 SIGMOD Research Highlight Award at SIGMOD’24!
  • [June 2024] Kexin received a SIGMOD 2024 Distinguished PC Award.
  • [May 2024] SketchQL and VCR demo papers accepted at VLDB’24.
  • [Apr 2024] 🎓 Hantian Zhang defended his thesis! Congrats Hantian!
  • [Apr 2024] Kexin received an Amazon Research Award for optimizing dynamic layout designs in data analytics systems.
  • [Apr 2024] Kexin received an NSF award to reimagine video moment retrieval with hand-drawn sketches.
  • [Mar 2024] 🎓 Renzhi Wu defended his thesis! Congrats Renzhi!
  • [Mar 2024] Our paper OREO has been accepted to ICDE’24.
  • [Mar 2024] Our paper SketchQL has been accepted to SIGMOD’25 (the only paper accepted in the first round without revision).
  • [Dec 2023] Our paper FALCON has been accepted to VLDB’24.
  • [Dec 2023] 🎓 Peng Li defended his thesis! Congrats Peng!
  • [Nov 2023] Hantian Zhang passed his thesis proposal!
  • [Nov 2023] Kexin received an NSF award to build a person-focused open knowledge graph. Learn more about our story here.
  • [Nov 2023] Rajveer Bachkaniwala received an honorable mention for the best poster award (among 86 posters!) at the PRISM annual retreat.
  • [Oct 2023] Thanks Bosch Research for supporting our work!
  • [Oct 2023] Congrats to Hantian Zhang for winning the 🏆 Chih Foundation Graduate Student Research Publication Award!
  • [Aug 2023] Kexin was recognized as a distinguished reviewer for PVLDB Vol. 16.
  • [Aug 2023] Congrats to Peng Li for winning the 🏆 Best Research Paper Award at VLDB’23!!
  • [Oct 2022] Thanks Bosch Research for supporting our work!
  • [Aug 2022] Kexin received the Catherine M. and James E. Allchin Early Career Professorship in the College of Computing.
  • [Jun 2022] Kexin received an 🏆 Honorable Mention for the 2022 SIGMOD Jim Gray Doctoral Dissertation Award.

Join

We are always looking for talented and motivated students who want to help push forward the agenda of democratizing data analytics. If you are a GT PhD student, please email us directly. If you are an undergraduate or master's student, please fill out our research questionnaire.