GetAround is a peer-to-peer car rental platform where late vehicle returns can disrupt subsequent bookings, leading to customer dissatisfaction and cancellations.
This project tackles two key product challenges:
- Operational optimisation: analysing checkout delays and simulating buffer thresholds
- Pricing optimisation: deploying a Machine Learning model via a production API
| Service | Description | URL |
|---|---|---|
| Delay Dashboard | Product analytics & delay simulation | https://huggingface.co/spaces/Dreipfelt/getaround-dashboard |
| Pricing Demo | UI for real-time price prediction | https://huggingface.co/spaces/Dreipfelt/Getaround-Pricing |
| API | FastAPI prediction service | https://dreipfelt-getaround-api.hf.space |
| API Docs | Interactive documentation | https://dreipfelt-getaround-api.hf.space/docs |
| GitHub | Source code repository | https://github.com/Data-Science-Designer-and-Developer/Project_GetAround |
- Measure late returns
- Simulate threshold strategies
- Optimise the trade-off between blocked rentals and solved issues
- Predict rental price from vehicle features
- Serve model via API
- Enable real-time decision support
| Property | Value |
|---|---|
| Algorithm | XGBoost Regressor |
| Target | rental_price_per_day (€) |
| RMSE | 16.60 |
| MAE | 10.50 |
| R² | 0.738 |
| CV RMSE | 16.86 |
| CV RMSE std | 1.27 |
| Number of features | 13 |
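As a reminder of what the metrics in this table measure, here is a minimal pure-Python sketch of RMSE, MAE and R². The function name `regression_metrics` and the toy prices are illustrative assumptions; they are not taken from the project notebooks and do not reproduce the reported values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences of prices."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Sum of squared residuals, shared by RMSE and R²
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Total variance of the target around its mean
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Toy daily prices in euros (hypothetical, not project data)
actual = [100.0, 120.0, 80.0, 150.0]
predicted = [110.0, 115.0, 90.0, 140.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE penalises large errors more heavily than MAE, which is why the reported RMSE (16.60) is higher than the reported MAE (10.50).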
model_key, mileage, engine_power, fuel, paint_color, car_type,
private_parking_available, has_gps, has_air_conditioning,
automatic_car, has_getaround_connect, has_speed_regulator, winter_tires
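Because the request body is a positional list, the order of values matters. The sketch below builds the payload from a named dictionary, assuming the API expects the 13 features in exactly the order listed above (the repository stores the authoritative order in `feature_names.json`). The helper `build_payload` and the mapping of the 0/1 flags to feature names are illustrative assumptions.

```python
import json

# Feature order as listed in this README; assumed to match feature_names.json.
FEATURE_ORDER = [
    "model_key", "mileage", "engine_power", "fuel", "paint_color", "car_type",
    "private_parking_available", "has_gps", "has_air_conditioning",
    "automatic_car", "has_getaround_connect", "has_speed_regulator",
    "winter_tires",
]


def build_payload(car: dict) -> dict:
    """Turn a dict of named features into the positional body for /predict."""
    missing = [name for name in FEATURE_ORDER if name not in car]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    # One inner list per vehicle; here a single vehicle is sent
    return {"input": [[car[name] for name in FEATURE_ORDER]]}


# Hypothetical vehicle, mirroring the values of the example request
car = {
    "model_key": "Citroën",
    "mileage": 50000,
    "engine_power": 120,
    "fuel": "diesel",
    "paint_color": "black",
    "car_type": "sedan",
    "private_parking_available": 1,
    "has_gps": 1,
    "has_air_conditioning": 1,
    "automatic_car": 0,
    "has_getaround_connect": 1,
    "has_speed_regulator": 1,
    "winter_tires": 0,
}

payload = build_payload(car)
print(json.dumps(payload, ensure_ascii=False))
```

Building the list from names rather than by hand avoids silently swapping two flags, which the API could not detect since all of them are 0/1 values.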
curl -X POST "https://dreipfelt-getaround-api.hf.space/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "input": [[
      "Citroën",
      50000,
      120,
      "diesel",
      "black",
      "sedan",
      1,
      1,
      1,
      0,
      1,
      1,
      0
    ]]
  }'

Response:

{
  "prediction": [124.52]
}

Project_GetAround/
│
├── api/                      # FastAPI application (model serving)
│   ├── app.py                # API endpoints (/predict)
│   ├── Dockerfile            # HF deployment config
│   ├── pipeline.pkl          # Trained ML pipeline
│   ├── feature_names.json    # Input feature order
│   ├── model_metrics.json    # Model performance metrics
│   └── requirements.txt
│
├── delay_dashboard/          # Streamlit app (delay analysis)
│   ├── app.py
│   └── requirements.txt
│
├── pricing_demo/             # Streamlit app (price prediction UI)
│   ├── app.py
│   └── requirements.txt
│
├── notebooks/                # Data exploration & model training
│   ├── 01_EDA_delays.ipynb
│   └── 02_ML_pricing.ipynb
│
├── .gitignore
├── requirements-dev.txt
└── README.md
| Category | Tools |
|---|---|
| Language | Python 3.10 |
| Dashboard | Streamlit, Plotly |
| API | FastAPI, Uvicorn |
| Machine Learning | Scikit-learn, XGBoost |
| Deployment | Hugging Face Spaces |
| Version Control | Git, GitHub |
| Stage | Description | Estimated Duration |
|---|---|---|
| 1. Data Exploration | EDA on delays and pricing | ~3h |
| 2. Business Analysis | Threshold simulations, revenue impact, visualisations | ~3h |
| 3. Dashboard | Streamlit development and deployment | ~3h |
| 4. Machine Learning | Feature engineering, XGBoost training and evaluation | ~5h |
| 5. FastAPI API | Development of the /predict endpoint, Dockerfile | ~3h |
| 6. HF Spaces Deployment | Configuration, testing, production release | ~2h |
| Total | | ~19h |
git clone https://github.com/Data-Science-Designer-and-Developer/Project_GetAround.git
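cd Project_GetAround

Run the API (from the repository root):

cd api
pip install -r requirements.txt
uvicorn app:app --reload

Run the delay dashboard (from the repository root):

cd delay_dashboard
streamlit run app.py

Run the pricing demo (from the repository root):

cd pricing_demo
streamlit run app.py

The analysis highlights a clear trade-off between operational efficiency and customer experience: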
- Increasing the minimum buffer between rentals reduces conflicts
- However, it also increases the number of blocked bookings (lost revenue)
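The trade-off described here can be illustrated with a small simulation. This is a sketch under stated assumptions, not the notebook's actual code: each pair of consecutive rentals is reduced to a hypothetical `(delay_min, gap_min)` tuple, a conflict is a return delay longer than the planned gap, and a rental counts as blocked when its gap is shorter than the buffer threshold.

```python
def simulate_buffer(pairs, threshold):
    """Count blocked rentals and solved conflicts for one buffer threshold.

    pairs: iterable of (delay_min, gap_min), where delay_min is the checkout
    delay of the previous rental (negative means early) and gap_min is the
    planned time between the two rentals.
    """
    pairs = list(pairs)
    # Rentals that could not have been booked with this minimum buffer
    blocked = sum(1 for _, gap in pairs if gap < threshold)
    # Conflicts: the previous driver returned the car after the next check-in
    conflicts = [(delay, gap) for delay, gap in pairs if delay > gap]
    # Conflicts avoided because the second rental would have been blocked
    solved = sum(1 for _, gap in conflicts if gap < threshold)
    return {
        "threshold": threshold,
        "blocked": blocked,
        "conflicts": len(conflicts),
        "solved": solved,
    }


# Hypothetical pairs of consecutive rentals (minutes), not project data
pairs = [(30, 15), (90, 60), (-10, 30), (0, 120), (200, 180), (45, 240)]

results = [simulate_buffer(pairs, t) for t in (0, 30, 60, 90, 120)]
for row in results:
    print(row)
```

Raising the threshold can only increase both counts, so the choice comes down to where the extra solved conflicts stop justifying the extra blocked rentals.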
Recommendation:
- Set a default buffer of around 60 to 90 minutes
- This range provides a strong balance:
  - significant reduction in conflicts
  - limited impact on booking volume
Scope recommendation:
- Apply stricter thresholds to Connect vehicles
- Keep more flexibility for Mobile check-ins
Why: Connect vehicles show more predictable behaviour, making stricter rules more effective with lower operational risk.
The Machine Learning model enables data-driven pricing decisions:
- Price is strongly influenced by:
  - vehicle type
  - engine power
  - equipment (GPS, AC, etc.)
- Significant price variability exists for similar vehicles
Recommendation:
- Use the model as a price recommendation tool for owners
- Integrate it directly into the platform to:
  - standardise pricing
  - reduce underpricing / overpricing
  - improve marketplace consistency
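One way such a recommendation tool could flag underpricing or overpricing is to compare the owner's listed price with the model's prediction. The sketch below is purely illustrative: the function `price_advice` is hypothetical, and using the reported MAE (10.50 €) as the tolerance is an assumption, not a rule defined by the project.

```python
def price_advice(listed_price, predicted_price, tolerance=10.5):
    """Classify a listed daily price against the model's prediction.

    tolerance is in euros; the default reuses the model's reported MAE
    as an assumed margin within which a price is considered consistent.
    """
    gap = listed_price - predicted_price
    if gap < -tolerance:
        return "underpriced"
    if gap > tolerance:
        return "overpriced"
    return "in range"


# Hypothetical listings compared with the prediction from the example request
predicted = 124.52
advice = {price: price_advice(price, predicted) for price in (100, 130, 140)}
for price, label in advice.items():
    print(f"listed {price} EUR vs predicted {predicted} EUR: {label}")
```

A tolerance tied to the model's typical error avoids nudging owners over differences the model cannot reliably distinguish.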
By combining both solutions, GetAround can:
- Reduce customer friction and cancellations
- Improve fleet utilisation
- Increase owner revenue through better pricing
- Enable data-driven product decisions
To maximise impact, the following improvements are recommended:
- Introduce dynamic buffer thresholds (adaptive to context: location, demand, vehicle type)
- Monitor A/B test performance of different delay strategies
- Continuously retrain the pricing model with fresh data
- Integrate both tools into a unified internal product dashboard
This project demonstrates how combining product analytics and machine learning can directly support strategic decisions and improve both user experience and business performance.
GetAround is losing customers due to late returns between consecutive bookings. This project addresses two concrete product questions:
- What minimum buffer time should be enforced between two bookings to reduce conflicts without significantly impacting revenue? Analysis of 2017 data shows that a 60 to 90-minute buffer significantly reduces problematic cases while limiting the impact on revenue.
- How can owners optimise their listed prices? An XGBoost model (R² = 0.74) predicts the daily price from vehicle characteristics, with a mean absolute error of approximately €10.50 per day.
Both tools are deployed in production and accessible via the links in the services table.
This project is carried out in an educational context using datasets provided by GetAround to Jedha Bootcamp.
Nature of the data: The datasets used (get_around_delay_analysis.xlsx, get_around_pricing_project.csv) are pseudonymised: no names, email addresses, phone numbers, or direct identifiers of drivers or owners are included. Rental identifiers (rental_id, car_id) are technical keys with no link to identifiable individuals.
Legal basis for processing: Internal analysis for service improvement purposes (legitimate interest, Art. 6(1)(f) GDPR). The data is not collected as part of this project but reused for analytical purposes.
Storage: The deployed API does not store any data transmitted through /predict requests. No personal data is retained on the server side.
Data subject rights: GetAround users have rights of access, rectification, and erasure directly with GetAround (the data controller). This project does not act as a data controller.
Frédéric Tellier
CDSD Candidate, Data Scientist
Jedha Bootcamp
LinkedIn: https://www.linkedin.com/in/frédéric-tellier-8a9170283/
Portfolio: https://github.com/Dreipfelt
CDSD Certification Project: Bloc 5 - Data Science Designer & Developer (RNCP35288)