Assignment 1: Technology Review

DUE MONDAY JANUARY 27 at 11:59PM.

In this assignment, you will survey 3 data management technologies of your choice that are shaping today’s enterprise landscape. You will select these technologies from the categories provided below. Your selections must cover at least 2 out of the 3 groups (A, B, C) and will help inform group formation for the technology presentation assignment.

Please use the assignment template.

List of Technologies

A. SPECIALIZED DATABASES

  • Key-Value Stores (e.g., Amazon DynamoDB, Redis)
  • Wide-column Stores (e.g., Google Bigtable, Cassandra)
  • Document Databases (e.g., MongoDB, CouchDB, Firebase)
  • Graph Databases (e.g., Neo4j, Amazon Neptune)
  • Time-Series Databases (e.g., Monarch, Gorilla, InfluxDB, TimescaleDB)
  • Vector Databases (e.g., Milvus, Pinecone, Weaviate)
  • HTAP Databases (e.g., TiDB, F1 Lightning)
  • Cloud Databases
    • Focus on comparing different options for running databases in the cloud, such as Infrastructure as a Service, Platform as a Service, Database as a Service, and Serverless Databases
  • ERP (enterprise resource planning) systems (e.g., SAP, Microsoft Dynamics 365, Sage)
  • CRM (customer relationship management) products (e.g., Salesforce, Siebel, PeopleSoft)
  • Data Lakes and Data Warehouses (e.g., Snowflake, Delta Lake, Google BigQuery)
  • Security, Privacy and Governance
    • Identity Management (e.g., Okta, RSA SecurID)
    • Data Security Platforms (e.g., Privacera, Immuta)
    • Data Catalog (e.g., Alation, Collibra)

C. DATA MANAGEMENT TECHNOLOGIES

  • Data Visualization:
    • Visualization tools (e.g., Power BI, Tableau)
    • Visualization libraries (e.g., D3.js, Plotly.js, Vega-Lite)
  • Data Cleaning and Integration (e.g., HoloClean, Trifacta, tamr)
  • Data Labeling Platforms (e.g., Snorkel AI, Scale AI)
  • ML for Databases (choose one of the following):
    • Learned indexes: e.g., 1, 2
    • Learned query optimization: e.g., 1, 2
    • Learned database tuning: e.g., 1, 2

Assignment Requirements

Review the technologies in order of your preference. For each technology, include the following discussions:

1) Overview: Provide a high-level summary of the technology.

2) Product Analysis: Survey at least two representative commercial products/systems for the technology. You can also analyze research systems if commercial products are not available. For each product/system, briefly discuss its target use cases, core functionalities, and any relevant architecture or implementation details (5 to 7 sentences for each product).

3) Research: Analyze one research paper from a top-tier venue that is related to the technology. You are welcome to use the reference papers provided in the technology list. Explain the paper’s key contributions (these can usually be found in the introduction of the paper) and their relevance to your chosen technology. Please use the APA format for your citations.

Note: Please refer to this resource page for guidelines on finding high-quality database publications. Blog posts, white papers, and web articles are not acceptable substitutes.

4) Technology Assessment: In a few sentences, evaluate the current maturity of the technology. Discuss major technical limitations and challenges, if any.

Submission

Fill out this survey to indicate your top three choices of technologies.

Submit a PDF with your technology review on Canvas. The submission should not exceed 6 pages (excluding references) using the given template.

Grading

We will use the following criteria for grading:

  • Understanding (4pt): Discussion demonstrates a general understanding of the technology as well as a clear assessment of technology maturity and limitations.
  • Product Survey (8pt): Provides use cases, core functionality, and implementation details for 2 commercial/research products/systems for each topic.
  • Research Survey (4pt): Paper is selected from a top-tier venue and cited in proper APA format. Clear explanation of the paper’s key contributions and of the connection between the paper and the chosen technology.
  • Clarity (4pt): The writing is overall clear and easy to follow for a general audience in the field.

This is an individual assignment and it is worth 5% of your grade.