🚗 GetAround: Delay Analysis & Pricing Prediction


📌 Project Overview

GetAround is a peer-to-peer car rental platform where late vehicle returns can disrupt subsequent bookings, leading to customer dissatisfaction and cancellations.

This project tackles two key product challenges:

  • Operational optimisation: analysing checkout delays and simulating buffer thresholds
  • Pricing optimisation: deploying a Machine Learning model via a production API

🔗 Live Applications

| Service | Description | URL |
| --- | --- | --- |
| 📊 Delay Dashboard | Product analytics & delay simulation | https://huggingface.co/spaces/Dreipfelt/getaround-dashboard |
| 💰 Pricing Demo | UI for real-time price prediction | https://huggingface.co/spaces/Dreipfelt/Getaround-Pricing |
| 🔌 API | FastAPI prediction service | https://dreipfelt-getaround-api.hf.space |
| 📄 API Docs | Interactive documentation | https://dreipfelt-getaround-api.hf.space/docs |
| 💻 GitHub | Source code repository | https://github.com/Data-Science-Designer-and-Developer/Project_GetAround |

🎯 Business Objectives

Delay Management

  • Measure late returns
  • Simulate threshold strategies
  • Optimise trade-off between blocked rentals and solved issues

Pricing Optimisation

  • Predict rental price from vehicle features
  • Serve model via API
  • Enable real-time decision support

🤖 Machine Learning API

| Property | Value |
| --- | --- |
| Algorithm | XGBoost Regressor |
| Target | rental_price_per_day (€) |
| RMSE | 16.60 |
| MAE | 10.50 |
| R² | 0.738 |
| CV RMSE | 16.86 |
| CV RMSE std | 1.27 |
| Number of features | 13 |
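
For readers less familiar with these metrics, the sketch below shows how RMSE, MAE, and R² are defined, using only the standard library. The function name `regression_metrics` and the prices in the example are made up for illustration; they are not the project's evaluation code or data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences of prices."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"rmse": math.sqrt(ss_res / n), "mae": mae, "r2": r2}

# Made-up daily prices (€), for illustration only.
actual = [100.0, 120.0, 80.0, 150.0]
predicted = [110.0, 115.0, 70.0, 140.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE is the mean of the absolute errors, so an MAE of 10.50 means predictions are off by about €10.50 per day on average. RMSE squares the errors before averaging, so it penalises large misses more heavily, which is why it is higher than the MAE here.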

Features

model_key, mileage, engine_power, fuel, paint_color, car_type,
private_parking_available, has_gps, has_air_conditioning,
automatic_car, has_getaround_connect, has_speed_regulator, winter_tires

🔌 API Endpoint

POST /predict

curl -X POST "https://dreipfelt-getaround-api.hf.space/predict" \
-H "Content-Type: application/json" \
-d '{
  "input": [[
    "Citroën",
    50000,
    120,
    "diesel",
    "black",
    "sedan",
    1,
    1,
    1,
    0,
    1,
    1,
    0
  ]]
}'

Response:

{
  "prediction": [124.52]
}
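
The same request can be sketched in Python with only the standard library. This is a minimal example, assuming the service accepts the JSON body shown in the curl call and that the column order matches the Features list (presumably the order stored in api/feature_names.json). The names `build_payload` and `predict_price` are hypothetical helpers, not part of the repository.

```python
import json
import urllib.request

API_URL = "https://dreipfelt-getaround-api.hf.space/predict"

# Order taken from the Features list; assumed to match api/feature_names.json.
FEATURE_ORDER = [
    "model_key", "mileage", "engine_power", "fuel", "paint_color", "car_type",
    "private_parking_available", "has_gps", "has_air_conditioning",
    "automatic_car", "has_getaround_connect", "has_speed_regulator",
    "winter_tires",
]

def build_payload(car):
    """Turn a feature dict into the {"input": [[...]]} body used by /predict."""
    missing = [name for name in FEATURE_ORDER if name not in car]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return {"input": [[car[name] for name in FEATURE_ORDER]]}

def predict_price(car, url=API_URL, timeout=30):
    """POST one car to the API and return the first predicted price."""
    body = json.dumps(build_payload(car)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["prediction"][0]

# Same car as in the curl example.
example_car = {
    "model_key": "Citroën", "mileage": 50000, "engine_power": 120,
    "fuel": "diesel", "paint_color": "black", "car_type": "sedan",
    "private_parking_available": 1, "has_gps": 1, "has_air_conditioning": 1,
    "automatic_car": 0, "has_getaround_connect": 1, "has_speed_regulator": 1,
    "winter_tires": 0,
}
payload = build_payload(example_car)
# predict_price(example_car) performs the network call; it is left out here
# so the snippet can be run offline.
```

Building the payload from a dict keyed by feature name makes it harder to send values in the wrong position, which a bare list would not catch.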

🗂️ Repository Structure

Project_GetAround/
│
├── api/                          # FastAPI application (model serving)
│   ├── app.py                    # API endpoints (/predict)
│   ├── Dockerfile                # HF deployment config
│   ├── pipeline.pkl              # Trained ML pipeline
│   ├── feature_names.json        # Input feature order
│   ├── model_metrics.json        # Model performance metrics
│   └── requirements.txt
│
├── delay_dashboard/              # Streamlit app (delay analysis)
│   ├── app.py
│   └── requirements.txt
│
├── pricing_demo/                 # Streamlit app (price prediction UI)
│   ├── app.py
│   └── requirements.txt
│
├── notebooks/                    # Data exploration & model training
│   ├── 01_EDA_delays.ipynb
│   └── 02_ML_pricing.ipynb
│
├── .gitignore
├── requirements-dev.txt
└── README.md

🛠️ Tech Stack

| Category | Tools |
| --- | --- |
| Language | Python 3.10 |
| Dashboard | Streamlit, Plotly |
| API | FastAPI, Uvicorn |
| Machine Learning | Scikit-learn, XGBoost |
| Deployment | Hugging Face Spaces |
| Version Control | Git, GitHub |

📅 Project Timeline

| Stage | Description | Estimated Duration |
| --- | --- | --- |
| 1. Data Exploration | EDA on delays and pricing | ~3h |
| 2. Business Analysis | Threshold simulations, revenue impact, visualisations | ~3h |
| 3. Dashboard | Streamlit development and deployment | ~3h |
| 4. Machine Learning | Feature engineering, XGBoost training and evaluation | ~5h |
| 5. FastAPI API | Development of the /predict endpoint, Dockerfile | ~3h |
| 6. HF Spaces Deployment | Configuration, testing, production release | ~2h |
| Total | | ~19h |

⚙️ Local Setup

git clone https://github.com/Data-Science-Designer-and-Developer/Project_GetAround.git
cd Project_GetAround

Run API

cd api
pip install -r requirements.txt
uvicorn app:app --reload

Run dashboards

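# Delay dashboard, from the repository root
cd delay_dashboard
pip install -r requirements.txt
streamlit run app.py
# Pricing demo, from the repository root in a second terminal
cd pricing_demo
pip install -r requirements.txt
streamlit run app.py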

📈 Business Recommendation

1. Delay Management Strategy

The analysis highlights a clear trade-off between operational efficiency and customer experience:

  • Increasing the minimum buffer between rentals reduces conflicts
  • However, it also increases the number of blocked bookings (lost revenue)

👉 Recommendation:

  • Set a default buffer around 60–90 minutes

  • This range provides a strong balance:

    • significant reduction in conflicts
    • limited impact on booking volume

👉 Scope recommendation:

  • Apply stricter thresholds to Connect vehicles
  • Keep more flexibility for Mobile check-ins

Why: Connect vehicles show more predictable behaviour, making stricter rules more effective with lower operational risk.
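
The trade-off described in this section can be illustrated with a small simulation. This is a sketch under stated assumptions, not the dashboard's actual logic: the field names (`gap_minutes`, `previous_delay_minutes`), the function `simulate_threshold`, and the definition of a conflict (the previous return ran later than the planned gap) are all hypothetical, and the rentals are made up.

```python
from dataclasses import dataclass

@dataclass
class Rental:
    checkin_type: str              # "connect" or "mobile"
    gap_minutes: float             # planned gap since the previous rental
    previous_delay_minutes: float  # how late the previous car was returned

def simulate_threshold(rentals, threshold, scope="all"):
    """Count rentals a minimum gap would block and the conflicts it would avoid."""
    blocked = solved = conflicts = 0
    for r in rentals:
        # Assumed definition: a conflict occurs when the previous driver
        # returned the car later than the planned gap allowed.
        is_conflict = r.previous_delay_minutes > r.gap_minutes
        conflicts += int(is_conflict)
        in_scope = scope == "all" or r.checkin_type == scope
        if in_scope and r.gap_minutes < threshold:
            blocked += 1                 # booking would not have been offered
            solved += int(is_conflict)   # ...which avoids this conflict
    return {"threshold": threshold, "blocked": blocked,
            "solved": solved, "conflicts": conflicts}

# Made-up rentals, for illustration only.
rentals = [
    Rental("connect", 30, 45),
    Rental("connect", 120, 10),
    Rental("mobile", 60, 90),
    Rental("mobile", 45, 0),
    Rental("mobile", 240, 300),
]
results = [simulate_threshold(rentals, t) for t in (0, 60, 90)]
connect_only = simulate_threshold(rentals, 90, scope="connect")
for row in results + [connect_only]:
    print(row)
```

Raising the threshold increases both `blocked` and `solved`; restricting the scope to one check-in type lowers both. Comparing these counts across thresholds is one way to frame the choice of a default buffer.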


2. Pricing Strategy

The Machine Learning model enables data-driven pricing decisions:

  • Price is strongly influenced by:

    • vehicle type
    • engine power
    • equipment (GPS, AC, etc.)
  • Significant price variability exists for similar vehicles

👉 Recommendation:

  • Use the model as a price recommendation tool for owners

  • Integrate it directly into the platform to:

    • standardise pricing
    • reduce underpricing / overpricing
    • improve marketplace consistency
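
One way such a recommendation tool could flag listings is sketched below. It compares an owner's listed price with the model's prediction and uses the reported MAE (10.50) as a tolerance band. Both the function `price_recommendation` and the choice of the MAE as tolerance are assumptions made for illustration; the predicted value 124.52 is the one from the sample API response.

```python
def price_recommendation(listed_price, predicted_price, tolerance=10.50):
    """Classify a listed daily price relative to the model's prediction."""
    gap = listed_price - predicted_price
    if gap > tolerance:
        status = "overpriced"
    elif gap < -tolerance:
        status = "underpriced"
    else:
        status = "in range"
    return {
        "status": status,
        "gap": round(gap, 2),                  # listed minus predicted, in €
        "suggested": round(predicted_price, 2),
    }

# Three hypothetical listed prices checked against the sample prediction.
examples = [price_recommendation(p, 124.52) for p in (95.0, 120.0, 150.0)]
for example in examples:
    print(example)
```

A tolerance band avoids nagging owners about differences smaller than the model's own typical error; a production version would likely tune this band rather than reuse the MAE directly.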

3. Product Impact

By combining both solutions, GetAround can:

  • Reduce customer friction and cancellations
  • Improve fleet utilisation
  • Increase owner revenue through better pricing
  • Enable data-driven product decisions

4. Next Steps

To maximise impact, the following improvements are recommended:

  • Introduce dynamic buffer thresholds (adaptive to context: location, demand, vehicle type)
  • Monitor A/B test performance of different delay strategies
  • Continuously retrain the pricing model with fresh data
  • Integrate both tools into a unified internal product dashboard

5. Key Takeaway

This project demonstrates how combining product analytics and machine learning can directly support strategic decisions and improve both user experience and business performance.


🎯 Executive Summary

GetAround is losing customers due to late returns between consecutive bookings. This project addresses two concrete product questions:

  • What minimum buffer time should be enforced between two bookings to reduce conflicts without significantly impacting revenue? → Analysis of 2017 data shows that a 60 to 90-minute buffer significantly reduces problematic cases while limiting the impact on revenue.
  • How can owners optimise their listed prices? → An XGBoost model (R² = 0.74) predicts the daily rental price from vehicle characteristics, with a mean absolute error of approximately €10.50 per day.

Both tools are deployed in production and accessible via the links in the Live Applications section.


🔒 GDPR Compliance

This project is carried out in an educational context using datasets provided by GetAround to Jedha Bootcamp.

Nature of the data: The datasets used (get_around_delay_analysis.xlsx, get_around_pricing_project.csv) are pseudonymised: no names, email addresses, phone numbers, or direct identifiers of drivers or owners are included. Rental identifiers (rental_id, car_id) are technical keys with no link to identifiable individuals.

Legal basis for processing: Internal analysis for service improvement purposes (legitimate interest, Art. 6(1)(f) GDPR). The data is not collected as part of this project but reused for analytical purposes.

Storage: The deployed API does not store any data transmitted through /predict requests. No personal data is retained on the server side.

Data subject rights: GetAround users have rights of access, rectification, and erasure directly with GetAround (the data controller). This project does not act as a data controller.


👤 Author

Frédéric Tellier
CDSD Candidate, Data Scientist
Jedha Bootcamp

LinkedIn: https://www.linkedin.com/in/frédéric-tellier-8a9170283/
Portfolio: https://github.com/Dreipfelt

CDSD Certification Project: Bloc 5 - Data Science Designer & Developer (RNCP35288)

About

Step into our shoes and tackle a real 2017 challenge: finding the perfect buffer time between rentals to prevent late returns without sacrificing revenue. Build data insights, a decision dashboard, and a pricing prediction API to help product leaders choose the optimal threshold and scope with confidence.
