Skip to content

Comments

Fix Wilkinson formula term sorting to match MATLAB left-to-right hierarchy#396

Open
Pasta-coder wants to merge 4 commits intognu-octave:mainfrom
Pasta-coder:GAM_1
Open

Fix Wilkinson formula term sorting to match MATLAB left-to-right hierarchy#396
Pasta-coder wants to merge 4 commits intognu-octave:mainfrom
Pasta-coder:GAM_1

Conversation

@Pasta-coder
Copy link
Contributor

@Pasta-coder Pasta-coder commented Feb 22, 2026

Fixes #395

When i was integrating parseWilkinsonFormula into the ClassificationGAM class to replace its hardcoded formula parsing, a unit test revealed a discrepancy in how formula terms are sorted compared to MATLAB.

When expanding the crossing operator (*), Octave currently sorts the binary schema matrix in ascending order. Because 0 is less than 1, sortrows places variables with later indices before earlier ones.

for the following MATLAB script

% MATLAB script to verify formula expansion and sorting
x1 = rand(10,1); x2 = rand(10,1); x3 = rand(10,1); Y = rand(10,1);
T = table(x1, x2, x3, Y);

mdl = fitlm(T, 'Y ~ x1 * x2 * x3');

expected_terms = [
0, 0, 0, 0; % Intercept
1, 0, 0, 0; % x1
0, 1, 0, 0; % x2
0, 0, 1, 0; % x3
1, 1, 0, 0; % x1:x2
1, 0, 1, 0; % x1:x3
0, 1, 1, 0; % x2:x3
1, 1, 1, 0 % x1:x2:x3
];

assert(isequal(mdl.Formula.Terms, expected_terms), 'Matrix sorting does not match expected output!');
disp('SUCCESS: The matrix is correctly sorted left-to-right!');

MATLAB output

>> untitled4
SUCCESS: The matrix is correctly sorted left-to-right!

Octave before FIX

octave:43> schema = parseWilkinsonFormula('Y ~ x1 * x2 * x3', 'matrix');
expected_terms = [
0, 0, 0, 0; % Intercept
0, 1, 0, 0; % x1
0, 0, 1, 0; % x2
0, 0, 0, 1; % x3
0, 1, 1, 0; % x1:x2
0, 1, 0, 1; % x1:x3
0, 0, 1, 1; % x2:x3
0, 1, 1, 1 % x1:x2:x3
];
assert (schema.Terms, expected_terms);
disp('SUCCESS: The matrix is correctly sorted left-to-right!');
error: ASSERT errors for: assert (schema.Terms,expected_terms)

Location | Observed | Expected | Reason
(2,2) 0 1 Abs err 1 exceeds tol 0 by 1
(4,2) 1 0 Abs err 1 exceeds tol 0 by 1
(5,2) 0 1 Abs err 1 exceeds tol 0 by 1
(7,2) 1 0 Abs err 1 exceeds tol 0 by 1
(2,4) 1 0 Abs err 1 exceeds tol 0 by 1
(4,4) 0 1 Abs err 1 exceeds tol 0 by 1
(5,4) 1 0 Abs err 1 exceeds tol 0 by 1
(7,4) 0 1 Abs err 1 exceeds tol 0 by 1

after FIX

octave:47> clear classes; rehash;
octave:48> schema = parseWilkinsonFormula('Y ~ x1 * x2 * x3', 'matrix');
expected_terms = [
  0, 0, 0, 0;  % Intercept
  0, 1, 0, 0;  % x1
  0, 0, 1, 0;  % x2
  0, 0, 0, 1;  % x3
  0, 1, 1, 0;  % x1:x2
  0, 1, 0, 1;  % x1:x3
  0, 0, 1, 1;  % x2:x3
  0, 1, 1, 1   % x1:x2:x3
];
assert (schema.Terms, expected_terms);
disp('SUCCESS: The matrix is correctly sorted left-to-right!');
SUCCESS: The matrix is correctly sorted left-to-right!
octave:52> clear classes; rehash;
octave:53> test parseWilkinsonFormula
PASSES 75 out of 75 tests
octave:54> 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wilkinson formula term sorting does not match MATLAB left-to-right hierarchy

1 participant