Conversation
|
Needs tabmat #502 and a new tabmat release. |
There was a problem hiding this comment.
Pull request overview
This pull request adds support for Polars and other dataframe libraries through the narwhals compatibility layer, enabling glum to work with multiple dataframe backends beyond pandas. It also implements backwards compatibility for pickled models from earlier glum versions.
Changes:
- Integrated narwhals library to support pandas, polars, and other dataframe backends
- Migrated from
feature_dtypes_to_categorical_levels_for categorical column tracking - Added backwards compatibility handling in
__setstate__for unpickling models from glum 3.0+ - Updated minimum Python version to 3.10 and bumped minimum dependency versions
Reviewed changes
Copilot reviewed 15 out of 18 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| setup.py | Updated Python version requirement to 3.10+, added narwhals dependency and version constraints for core dependencies |
| pyproject.toml | Updated mypy Python version target to 3.10 |
| pixi.toml | Updated Python version constraints and dependency versions, removed py39 environment |
| conda.recipe/recipe.yaml | Updated dependency versions in conda recipe |
| src/glum/_glm.py | Added narwhals support, implemented pickle compatibility via setstate, migrated to categorical_levels_, added deprecated feature_dtypes_ property |
| src/glum/_glm_cv.py | Changed copy_X parameter type to Optional[bool] |
| src/glum/_utils.py | Converted utility functions to work with narwhals DataFrames instead of pandas-only |
| src/glum/_validation.py | Updated array checking to use narwhals for dataframe detection |
| src/glum/_typing.py | Added IntoDataFrame and ShapedArrayLikeConverted type aliases |
| tests/glm/test_utils.py | Updated tests to use narwhals DataFrames |
| tests/glm/test_pickle_compatibility.py | New test file for pickle backwards compatibility |
| tests/glm/test_golden_master.py | Added polars parametrization for existing tests |
| tests/glm/test_glm_regressor.py | Added polars test cases alongside pandas tests |
| tests/glm/test_glm_base.py | Parametrized tests for both pandas and polars |
| tests/glm/test_formula.py | Added polars support to formula tests (with fixture issue) |
| tests/glm/pickles/*.pkl | Pickle files from glum v3.0 for compatibility testing |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
This is now ready for review. CI is failing because of an unrelated issue and should be fixed by #965 |
|
friendly ping @MarcAntoineSchmidtQC @jtilly |
MarcAntoineSchmidtQC
left a comment
There was a problem hiding this comment.
Looks good to me!
| selection: str = "cyclic", | ||
| random_state=None, | ||
| copy_X: bool = True, | ||
| copy_X: Optional[bool] = None, |
There was a problem hiding this comment.
This should probably be fine, but changing the default value is not backward compatible.
There was a problem hiding this comment.
Hm true. I can change the default to True if you'd like. Hopefully this change does not affect behavior other than potentially lower memory consumption (input will be copied whenever necessary, it's never modified in place).
There was a problem hiding this comment.
No need. I think this change is okay and shouldn't cause backward-incompatible changes. You can merge this whenever you are ready.
Belated follow up for tabmat #388. Solves #896.
Also implements @lbittarello's idea for backwards compatibility wrt. pickles (no guarantees though, let's do it on a best effort basis for the time being) and addresses #932.
Checklist
CHANGELOG.rstentry