Conversation
Codecov Report❌ Patch coverage is
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
|
Went a little crazy with this one after figuring out some of the rules needed modification for the Diagonal case where the output is by default the same as the input |
|
Your PR no longer requires formatting changes. Thank you for your contribution! |
|
This was a fun one, as a bunch of rules had to be modified to account for the case where one of the arguments |
lkdvos
left a comment
There was a problem hiding this comment.
Good catch, that does indeed sound like an absolute misery to debug/find and solve!
I left some small suggestions around code organization, mostly to reduce the amount of code and/or improve readability, but otherwise this looks great (I absolutely understand that the complexity might just be unavoidable, again, impressive catch on this!)
I'll go through the remainder afterwards if you like, but I will already leave this here to not get lost in a wall of questions :)
| if A_is_arg | ||
| ΔA = make_zero(A.dval) | ||
| $pb(ΔA, Aval, argval, darg) | ||
| A.dval .= ΔA |
There was a problem hiding this comment.
| A.dval .= ΔA | |
| A.dval .+= ΔA |
I'm guessing this has to be accumulating, but I'm not entirely sure I follow the logic of this branch, why can't we immediately in-place accumulate?
There was a problem hiding this comment.
We can't do it inplace because you'll modify dA which is the same as darg, I think. I think we should not do a + because A is overwritten in the primal step (but this is confusing, I'm open to being convinced otherwise)
There was a problem hiding this comment.
I think I'm following now.
Would it help if instead of allocating a new dA and then doing the copy afterwards, we instead copy the darg if we need to?
Also, does that mean that we could in principle be checking that dA is zero, since nothing is allowed to depend on the value of A after the decomposition?
Or would that be too strict of a check in the case where A is secretly not overwritten by the specific implementation of the decomposition.
There was a problem hiding this comment.
Let me think about this a bit... btw this whole hornet's nest would be much easier to handle if we finally implemented the qr_full!! stuff we had discussed at one point 😛
There was a problem hiding this comment.
I thought Jutho wasn't a big fan of that 🤔
There was a problem hiding this comment.
Then the hornets are sticking around!
| # use A (so that whoever does this is forced to handle caching A | ||
| # appropriately here) | ||
| Aval = nothing | ||
| A_is_arg = !isa(A, Const) && TA <: Diagonal && diagview(A.dval) === D.dval |
There was a problem hiding this comment.
Code-organization-wise, I am not a huge fan of this hardcoded special case for different types.
Why do you need the specific specialization here? Could it not just be A.dval === D.dval?
That being said, we should probably just add an implementation of isalias(A, arg).
Maybe something like Base.mightalias could be a more generic solution, or even a fallback definition, but that function is technically internal, and similar problems hold for checking Base.dataids.
I might actually be okay with depending on that though, that has been quite stable and does seem to me like a good way of going at this. ( and we already secretly use this in TensorOperations! )
There was a problem hiding this comment.
For the _vals methods we can't do A.dval === D.dval because D is a Vector, and A is a Diagonal, whose diag field points to D
There was a problem hiding this comment.
Ah you are right, my bad! I think then I would even more think an isalias(A, arg) = Base.mightalias(A, arg) is the right abstraction (I would leave in the hook since Base.mightalias is internal, and this allows us to overload without piracy)
There was a problem hiding this comment.
Yeah, that sounds good to me. We could do it as part of this PR or separately?
There was a problem hiding this comment.
There are a lot of goofy special cases running around here haha
There was a problem hiding this comment.
It might be reasonable to include it in this PR, as this does seem to be the primary change that is needed to solve the issue? Happy to defer it if you prefer though.
There was a problem hiding this comment.
TBH this PR is complex enough, I might do the isalias as a followup?
| if !(A === arg1 || A === arg2) | ||
| copy!(A, Ac) | ||
| $pb(dA, A, (arg1, arg2), (darg1, darg2)) | ||
| else | ||
| ΔA = zero(A) | ||
| $pb(ΔA, A, (arg1, arg2), (darg1, darg2)) | ||
| dA .= ΔA | ||
| end |
There was a problem hiding this comment.
| if !(A === arg1 || A === arg2) | |
| copy!(A, Ac) | |
| $pb(dA, A, (arg1, arg2), (darg1, darg2)) | |
| else | |
| ΔA = zero(A) | |
| $pb(ΔA, A, (arg1, arg2), (darg1, darg2)) | |
| dA .= ΔA | |
| end | |
| # compute pullback of A and restore input | |
| # order is important - A might alias `args`! | |
| $pb(dA, Ac, (arg1, arg2), (darg1, darg2)) | |
| copy!(A, Ac) | |
| # compute pullbacks and restore other inputs if they don't alias | |
| if A !== arg1 | |
| copy!(arg1, arg1c) | |
| zero!(darg1) | |
| end | |
| if A !== arg2 | |
| copy!(arg2, arg2c) | |
| zero!(darg2) | |
| end | |
Again mostly a readability suggestion. (Note that I might again be missing something about the use of dA vs Delta A here)
There was a problem hiding this comment.
See comment above about Delta A
There was a problem hiding this comment.
suggestion for the bottom part still stands though :)
There was a problem hiding this comment.
NB that if A !== arg, we MUST copy Ac -> A before the pullback
There was a problem hiding this comment.
Pushed a small cleanup here at least...
| elseif A_is_arg2 | ||
| make_zero!(arg.dval[1]) | ||
| end | ||
| A.dval === arg.dval[1] || make_zero!(arg.dval[1]) |
There was a problem hiding this comment.
I think this is actually not correct...
There was a problem hiding this comment.
Or at least seems to be causing test failures now?
|
CUDA fail is unrelated, it's the CUSOLVER memory thing again |
Fix stupid typos Diagonal + eig(h) working SVD working for diag LQ/QR working Polar and orthnull Some Mooncake progress Formatter Working Mooncake eig Remove comment No diag for orthnull Diag tests working for Mooncake Formatter Update ext/MatrixAlgebraKitEnzymeExt/MatrixAlgebraKitEnzymeExt.jl Co-authored-by: Lukas Devos <ldevos98@gmail.com> A bit of Enzyme cleanup A little Mooncake cleanup
No description provided.