Support polars, etc. through narwhals #957

Open
stanmart wants to merge 28 commits into main from narwhals-new

Conversation

@stanmart
Collaborator

@stanmart stanmart commented Nov 18, 2025

Belated follow up for tabmat #388. Solves #896.

Also implements @lbittarello's idea for backwards compatibility wrt. pickles (no guarantees though, let's do it on a best effort basis for the time being) and addresses #932.

Checklist

  • Added a CHANGELOG.rst entry

@stanmart stanmart changed the title Support polars, etc. through narwhals Support polars, etc. through narwhals [WIP] Nov 18, 2025
@stanmart stanmart changed the title Support polars, etc. through narwhals [WIP] Support polars, etc. through narwhals Nov 18, 2025
@stanmart stanmart self-assigned this Nov 25, 2025
@stanmart
Collaborator Author

stanmart commented Jan 8, 2026

Needs tabmat #502 and a new tabmat release.

@stanmart stanmart requested a review from Copilot January 12, 2026 16:30

Copilot AI left a comment

Pull request overview

This pull request adds support for Polars and other dataframe libraries through the narwhals compatibility layer, enabling glum to work with multiple dataframe backends beyond pandas. It also implements backwards compatibility for pickled models from earlier glum versions.

Changes:

  • Integrated narwhals library to support pandas, polars, and other dataframe backends
  • Migrated from feature_dtypes_ to _categorical_levels_ for categorical column tracking
  • Added backwards compatibility handling in __setstate__ for unpickling models from glum 3.0+
  • Updated minimum Python version to 3.10 and bumped minimum dependency versions
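
The `__setstate__` approach mentioned above can be sketched roughly as follows. This is a toy class, not glum's code, and it assumes that the legacy state maps each column name to a tuple of category levels (or `None` for non-categorical columns); the real legacy format may differ.

```python
class Model:
    """Toy stand-in for an estimator; not glum's actual class."""

    def __setstate__(self, state):
        # pickle passes the stored attribute dict to __setstate__ on load.
        # If it comes from an older version, translate the legacy key.
        state = dict(state)
        if "feature_dtypes_" in state and "_categorical_levels_" not in state:
            old = state.pop("feature_dtypes_")
            # Assumed legacy format: {column: tuple of levels or None}.
            state["_categorical_levels_"] = {
                name: list(col_levels)
                for name, col_levels in old.items()
                if col_levels is not None
            }
        self.__dict__.update(state)


# Simulate what unpickling does: create an empty instance, then restore state.
legacy_state = {"feature_dtypes_": {"region": ("north", "south"), "age": None}}
restored = Model.__new__(Model)
restored.__setstate__(legacy_state)
```

Calling `__setstate__` directly mirrors the step `pickle.load` performs after creating the instance, which keeps the sketch independent of any stored pickle file.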

Reviewed changes

Copilot reviewed 15 out of 18 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
| --- | --- |
| setup.py | Updated Python version requirement to 3.10+, added narwhals dependency and version constraints for core dependencies |
| pyproject.toml | Updated mypy Python version target to 3.10 |
| pixi.toml | Updated Python version constraints and dependency versions, removed py39 environment |
| conda.recipe/recipe.yaml | Updated dependency versions in conda recipe |
| src/glum/_glm.py | Added narwhals support, implemented pickle compatibility via __setstate__, migrated to _categorical_levels_, added deprecated feature_dtypes_ property |
| src/glum/_glm_cv.py | Changed copy_X parameter type to Optional[bool] |
| src/glum/_utils.py | Converted utility functions to work with narwhals DataFrames instead of pandas-only |
| src/glum/_validation.py | Updated array checking to use narwhals for dataframe detection |
| src/glum/_typing.py | Added IntoDataFrame and ShapedArrayLikeConverted type aliases |
| tests/glm/test_utils.py | Updated tests to use narwhals DataFrames |
| tests/glm/test_pickle_compatibility.py | New test file for pickle backwards compatibility |
| tests/glm/test_golden_master.py | Added polars parametrization for existing tests |
| tests/glm/test_glm_regressor.py | Added polars test cases alongside pandas tests |
| tests/glm/test_glm_base.py | Parametrized tests for both pandas and polars |
| tests/glm/test_formula.py | Added polars support to formula tests (with fixture issue) |
| tests/glm/pickles/*.pkl | Pickle files from glum v3.0 for compatibility testing |
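
For readers unfamiliar with the pattern, a deprecated attribute such as `feature_dtypes_` is commonly kept alive as a read-only property that warns and forwards to the new storage. The sketch below shows that general pattern only; the class, the warning text, and the returned value are assumptions, not what `src/glum/_glm.py` actually does.

```python
import warnings


class Model:
    """Toy stand-in for an estimator; not glum's actual class."""

    def __init__(self, categorical_levels):
        self._categorical_levels_ = categorical_levels

    @property
    def feature_dtypes_(self):
        # Assumption: the old name simply forwards to the new attribute.
        warnings.warn(
            "feature_dtypes_ is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )
        return dict(self._categorical_levels_)


model = Model({"region": ["north", "south"]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_view = model.feature_dtypes_
```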

stanmart and others added 4 commits January 15, 2026 08:59
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@stanmart stanmart marked this pull request as ready for review January 15, 2026 10:42
@stanmart
Collaborator Author

This is now ready for review. CI is failing because of an unrelated issue, which should be fixed by #965.

@stanmart
Collaborator Author

friendly ping @MarcAntoineSchmidtQC @jtilly

Member

@MarcAntoineSchmidtQC MarcAntoineSchmidtQC left a comment

Looks good to me!

  selection: str = "cyclic",
  random_state=None,
- copy_X: bool = True,
+ copy_X: Optional[bool] = None,

This should probably be fine, but changing the default value is not backward compatible.

Collaborator Author

Hm true. I can change the default to True if you'd like. Hopefully this change does not affect behavior other than potentially lower memory consumption (input will be copied whenever necessary, it's never modified in place).
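
To make the intended semantics concrete, here is a minimal sketch of how a tri-state `copy_X` could be resolved. It assumes that `None` means "copy only when a conversion is needed", `True` always copies, and `False` refuses to convert; the function name `prepare_X` is hypothetical and the behavior for `False` is an assumption rather than something stated in this thread.

```python
from typing import Optional

import numpy as np


def prepare_X(X, copy_X: Optional[bool] = None) -> np.ndarray:
    """Hypothetical helper illustrating a tri-state copy flag."""
    if copy_X:
        # Explicit request: always work on a fresh float64 copy.
        return np.array(X, dtype=np.float64, copy=True)
    # np.asarray returns X itself when it is already a float64 ndarray.
    converted = np.asarray(X, dtype=np.float64)
    if copy_X is False and converted is not X:
        # Assumed behavior: refuse to copy instead of converting silently.
        raise ValueError("X needs conversion, but copy_X=False forbids a copy.")
    return converted


X_float = np.array([[1.0, 2.0]])
X_int = np.array([[1, 2]])
same = prepare_X(X_float)                 # no copy needed
copied = prepare_X(X_float, copy_X=True)  # forced copy
converted = prepare_X(X_int)              # copy made only because dtype differs
```

Under these assumptions, the default `None` never copies an input that is already in the right format, which is where the lower memory consumption would come from.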

No need. I think this change is okay and shouldn't cause backward-incompatible changes. You can merge this whenever you are ready.

Development

Successfully merging this pull request may close these issues.

  • Error uploading previous models
  • Feature names aren't set if fit on polars.DataFrame

3 participants