This project presents a comprehensive food safety inspection analytics platform for restaurants and other food establishments in Chicago and Dallas, developed using publicly available government datasets. The objective is to empower public health officials, analysts, and decision-makers with actionable insights into:
- Inspection outcomes
- Risk categories
- Violation patterns
By leveraging the medallion architecture (bronze → silver → gold) for data organization and analytics, the platform delivers clean, reliable, and structured data pipelines that drive impactful visualizations and decision support.
- Azure Data Factory – Orchestration for data cleaning
- Azure Data Lake – Raw and curated data storage (bronze & silver layers)
- ER Studio – Dimensional data modeling
- Databricks – Data Transformation and staging using PySpark notebooks
- Snowflake – Data warehousing and analytical querying (gold layer)
- Tableau – Dashboards for data visualization and insights
- Microsoft Excel – Initial data exploration
- Inspections of restaurants and other food establishments conducted by the Chicago Department of Public Health’s Food Protection Program
- Results reviewed by State of Illinois Licensed Environmental Health Practitioners (LEHPs)
- Link to Dataset
- Conducted by the Dallas Code Compliance Services Department
- Link to Dataset
Link to Dataset Mapping Document
```
Raw Source Files (CSV)
      ↓
[Bronze Layer] → Ingested raw data (Azure Data Lake)
      ↓
[Silver Layer] → Cleaned, deduplicated Parquet files (Databricks)
      ↓
[Gold Layer] → Transformed facts and dimensions (Snowflake)
```
- Use an Azure Data Factory (ADF) pipeline to:
- Extract raw data from source containers
- Clean and store as Parquet files in the Silver Layer
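The cleaning step can be illustrated with a minimal Python sketch. This is a stand-in, not the actual pipeline logic: the real work is done by ADF activities that write Parquet to the Silver Layer, and the `inspection_id` dedup key and sample column names are assumptions.

```python
def clean_rows(rows, key="inspection_id"):
    """Normalize column names, trim whitespace, and deduplicate on a key.

    Simplified stand-in for the ADF cleaning step; the real pipeline
    persists the result as Parquet files in the Silver Layer.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Standardize headers to lowercase and strip stray whitespace.
        row = {k.strip().lower(): (v or "").strip() for k, v in row.items()}
        if not row.get(key) or row[key] in seen:
            continue  # skip rows missing the key or already loaded
        seen.add(row[key])
        cleaned.append(row)
    return cleaned

# Example: two duplicate inspection records collapse to one.
raw = [
    {"Inspection_ID": " 101 ", "Results": " Pass "},
    {"Inspection_ID": "101", "Results": "Pass"},
    {"Inspection_ID": "102", "Results": "Fail"},
]
print(len(clean_rows(raw)))  # 2
```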
- Open Databricks and run the following notebooks in order:
- Reads the Chicago Parquet files from the Silver Layer
- Applies transformations, standardization, and flattening
- Writes to the `stg_chicago` table in Snowflake
- Applies similar processing to the Dallas data
- Outputs to the `stg_dallas` table in Snowflake
- Combines both staging tables
- Adds a `source` column to identify origin
- Generates a UUID-based `job_id` and a date for `load_dt`
- Loads to the final combined stage table in Snowflake: `stg_final_table`
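The combine step above can be sketched in plain Python. The audit column names (`source`, `job_id`, `load_dt`) come from the text; the union logic itself is a simplified stand-in for the PySpark notebook.

```python
import uuid
from datetime import date

def combine_staging(chicago_rows, dallas_rows):
    """Union the two staging datasets and add audit columns.

    Each run gets one UUID-based job_id; every record is tagged with
    its origin (source) and the load date (load_dt).
    """
    job_id = str(uuid.uuid4())          # one job_id per pipeline run
    load_dt = date.today().isoformat()  # load date for this run
    combined = []
    for source, rows in (("chicago", chicago_rows), ("dallas", dallas_rows)):
        for row in rows:
            combined.append({**row, "source": source,
                             "job_id": job_id, "load_dt": load_dt})
    return combined

final = combine_staging([{"inspection_id": "101"}], [{"inspection_id": "D-9"}])
print(sorted({r["source"] for r in final}))  # ['chicago', 'dallas']
```

Tagging every record with the same per-run `job_id` is what makes a bad load easy to identify and back out later.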
- From the combined stage table, load into Snowflake dimensions and facts:
- Run the `Dynamic_Tables_Query.sql` script to load data into dimensions and facts
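For illustration, the shape of a Snowflake Dynamic Table definition can be rendered like this. The table, warehouse, and column names below are hypothetical, and the actual definitions live in `Dynamic_Tables_Query.sql`; this only shows the DDL pattern.

```python
def dynamic_table_ddl(name, warehouse, select_sql, target_lag="1 hour"):
    """Render a Snowflake CREATE DYNAMIC TABLE statement.

    Dynamic Tables refresh automatically from their defining query,
    which is how staged data flows into dimensions and facts.
    """
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"AS\n{select_sql};"
    )

# Hypothetical dimension fed from the combined stage table.
ddl = dynamic_table_ddl(
    "dim_restaurant", "analytics_wh",
    "SELECT DISTINCT restaurant_name, address, city FROM stg_final_table",
)
print(ddl.splitlines()[0])  # CREATE OR REPLACE DYNAMIC TABLE dim_restaurant
```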
- Violation analysis dashboard
- Inspection analysis dashboard
These dashboards help surface hotspots, compliance patterns, and risk zones to city officials.
- Built with modular, scalable architecture using medallion pattern
- All records tagged with `job_id`, `load_dt`, and `source` for auditability
- Dynamic Tables for dimensions and facts
- Visualization of food inspection results
- Muskan Deepak Raisinghani
- Rachana Keshav
- Rishabh Shah

