Python Data Mining & Applied Algorithms

This repository contains Python implementations of advanced data mining algorithms, machine learning pipelines, and high-performance recommender systems.

Core Implementations & Performance

Recommender Systems (02_recommender_systems_svd.ipynb): Built scalable user-based collaborative filtering and Singular Value Decomposition (SVD) models.
Performance Optimization: Runs in under 5 seconds on datasets of 100k+ rows by storing ratings in sparse matrices (scipy.sparse.csr_matrix) and using vectorized NumPy operations in place of iterative Python loops.
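As a rough illustration of the sparse, loop-free approach described above, here is a minimal sketch. It is not the notebook's code: the function names user_cf_scores and svd_scores, the cosine similarity measure, and the toy ratings matrix are all assumptions made for this example, and the user-user similarity matrix is kept dense, which only suits a moderate number of users.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import svds


def user_cf_scores(ratings):
    """User-based CF: similarity-weighted mean of other users' ratings.

    Hypothetical helper, assuming rows are users, columns are items,
    and 0 means "not rated".
    """
    ratings = csr_matrix(ratings, dtype=np.float64)

    # L2-normalise each user row so a sparse product yields cosine similarity.
    norms = np.sqrt(np.asarray(ratings.multiply(ratings).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0  # users with no ratings keep an all-zero row
    unit_rows = diags(1.0 / norms) @ ratings

    similarity = (unit_rows @ unit_rows.T).toarray()
    np.fill_diagonal(similarity, 0.0)  # a user does not vote for themselves

    # Indicator matrix of which (user, item) pairs are stored as rated.
    rated = ratings.copy()
    rated.data[:] = 1.0

    # Numerator and denominator of the weighted mean, with no Python loops.
    weighted_sum = (ratings.T @ similarity.T).T
    weight_total = (rated.T @ np.abs(similarity).T).T

    # Items with no similar rater get a score of 0.
    return np.divide(
        weighted_sum,
        weight_total,
        out=np.zeros_like(weighted_sum),
        where=weight_total > 0,
    )


def svd_scores(ratings, k):
    """Rank-k reconstruction of the ratings matrix via truncated SVD."""
    ratings = csr_matrix(ratings, dtype=np.float64)
    u, s, vt = svds(ratings, k=k)  # requires k < min(ratings.shape)
    return (u * s) @ vt


# Toy data (made up for illustration): 3 users x 4 items.
toy_ratings = csr_matrix(
    np.array([[5, 4, 0, 0], [5, 4, 3, 0], [0, 0, 4, 5]], dtype=float)
)
cf_predictions = user_cf_scores(toy_ratings)
svd_predictions = svd_scores(toy_ratings, k=2)
```

Both functions return a dense users-by-items score array. On larger datasets one would typically keep only the top-N scores per user rather than materialising the full array.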
Machine Learning & NLP (03_clustering_nlp_pipelines.ipynb): Implemented classification pipelines that use TF-IDF and pre-trained Word2Vec models for NLP feature extraction. Built clustering models (K-means, Agglomerative) and assessed them with Silhouette coefficients.
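One common way to use Silhouette coefficients with K-means and Agglomerative clustering is to compare candidate cluster counts and keep the best-scoring one. The sketch below assumes that usage; the helper best_k_by_silhouette, the synthetic blob data, and the candidate range are hypothetical and not taken from the notebook.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(features, candidate_ks, make_model):
    """Fit one model per candidate k and return the k with the highest
    mean Silhouette coefficient, plus all scores for inspection."""
    scores = {}
    for k in candidate_ks:
        labels = make_model(k).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for real feature vectors: three well-separated blobs.
features, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

kmeans_k, kmeans_scores = best_k_by_silhouette(
    features,
    range(2, 7),
    lambda k: KMeans(n_clusters=k, n_init=10, random_state=0),
)
agglo_k, agglo_scores = best_k_by_silhouette(
    features,
    range(2, 7),
    lambda k: AgglomerativeClustering(n_clusters=k),
)
```

Passing a model factory keeps the selection loop identical for both algorithms, so their scores can be compared on the same features.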
Data Engineering (01_data_preprocessing_eda.ipynb): Cleaned and transformed raw, high-dimensional data using Pandas for downstream machine learning tasks.

Technology Stack

Language: Python
Libraries: Pandas, NumPy, SciPy, Scikit-learn, Matplotlib

