fixing polynomial and high-Q behaviour #6
Open
till-schertenleib wants to merge 2 commits into diffpy:main from
Conversation
Yeah, this seems like a pretty messy PR with lots of ipynb files. We may want to take a step back and move things to
In getx we are fitting over the whole Q-range, and I think we want to do the same. One problem I'm running into is that this makes the code choose an rpoly value that smooths S(Q) --> 1 over the whole Q range: the code prioritizes smoothing low-Q ripples (structural signal) over smoothing S(Q) at high Q.
If we don't use a hiQ mask to fit the polynomial, we have to enforce the high-Q prioritization in some other way.
"Speed" problem: when defining poly_deg = (rpoly * qmax) / pi, poly_deg is almost never an integer. Rounding it to an integer would introduce abrupt changes in the PDF at the half-integer values, and we would prefer the PDF to respond smoothly to the rpoly and Qmaxinst parameters. To simulate a polynomial fit at an arbitrary floating-point degree, the correction polynomial is therefore refined twice, at the integer floor and ceiling of poly_deg, and the two fits are then averaged with weights given by the distance of poly_deg from its integer bounds.
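The floor/ceiling blend can be sketched as below. The function name and the plain np.polyfit call are assumptions for illustration; the actual pipeline solves a weighted system instead of an ordinary polyfit:

```python
import numpy as np

def fit_fractional_poly(q, y, poly_deg):
    """Blend two integer-degree fits to mimic a fit at a
    non-integer degree poly_deg (illustrative sketch only)."""
    lo = int(np.floor(poly_deg))
    hi = int(np.ceil(poly_deg))
    # Weight each fit by how close poly_deg sits to that integer degree.
    w_hi = poly_deg - lo          # 0 when poly_deg == lo
    w_lo = 1.0 - w_hi             # 1 when poly_deg == lo
    y_lo = np.polyval(np.polyfit(q, y, lo), q)
    if hi == lo:                  # poly_deg was already an integer
        return y_lo
    y_hi = np.polyval(np.polyfit(q, y, hi), q)
    return w_lo * y_lo + w_hi * y_hi
```

Because w_lo and w_hi vary continuously with poly_deg, the blended curve (and hence the PDF) responds smoothly as rpoly is varied, with no jump at half-integer degrees.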
This makes our code much more expensive, because rpoly (and therefore the weighting of the floor and ceiling fits) changes at each energy. We can't define a single average weighting for all energies, and instead have to fit a polynomial at each energy.
Do we use Qmaxinst or Qmax for defining poly_deg? getx uses the former, so we can do the same. What probably makes sense is to cut F(Q) at a user-defined Qmax for the FT --- or we could cut it at a point where F(Q) is close to 0?
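A minimal sketch of the two cut-off options being discussed, a hard user-defined Qmax versus stopping at the last zero crossing of F(Q) below Qmax. All names here are hypothetical, not the pipeline's API:

```python
import numpy as np

def cut_fq(q, fq, qmax, at_zero_crossing=False):
    """Truncate F(Q) before the FT, either hard at qmax or at the
    last sign change of F(Q) below qmax (illustrative sketch)."""
    mask = q <= qmax
    q_cut, fq_cut = q[mask], fq[mask]
    if at_zero_crossing:
        # Indices where F(Q) changes sign; cut just after the last one.
        sign_change = np.nonzero(np.diff(np.sign(fq_cut)) != 0)[0]
        if sign_change.size:
            last = sign_change[-1] + 1
            q_cut, fq_cut = q_cut[:last], fq_cut[:last]
    return q_cut, fq_cut
```

Cutting near a zero of F(Q) reduces the truncation step at Qmax and therefore the termination ripples in G(r), at the cost of a data-dependent effective Qmax.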
Things to double-check by the math geeks:
_build_S_from_raw(): we solve the polynomial coefficients using the weighted matrices that prioritize the high-Q range, and then evaluate the polynomial on the UNWEIGHTED matrix to apply the correction. In other words, the coefficients solved on the weighted matrix are used to calculate the actual y-values of the polynomial on the unweighted matrix. We have to do that because we don't want the resulting corrected S(Q) to have weirdly amplified signal at high Q. Does this make sense?
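The "solve weighted, evaluate unweighted" pattern can be sketched as follows; the function name and the use of np.vander/np.linalg.lstsq are assumptions about the implementation, not the actual code in _build_S_from_raw():

```python
import numpy as np

def weighted_fit_unweighted_eval(q, y, deg, weights):
    """Solve polynomial coefficients with per-point weights, then
    evaluate the plain (unweighted) polynomial (illustrative sketch)."""
    # Vandermonde design matrix for an ordinary polynomial fit.
    A = np.vander(q, deg + 1)
    # Scale both sides by the weights so high-Q points dominate
    # the least-squares solution for the coefficients.
    W = weights[:, None]
    coeffs, *_ = np.linalg.lstsq(W * A, weights * y, rcond=None)
    # Evaluate on the UNWEIGHTED matrix: the correction subtracted
    # from S(Q) is the plain polynomial, so the weights never
    # amplify the corrected signal at high Q.
    return A @ coeffs
```

The weights only influence which coefficients get chosen; once chosen, the correction is an ordinary polynomial of Q, which is exactly why the evaluation must use the unweighted matrix.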
_build_S_from_raw(): we define the y-data to which the polynomial is fitted as "y = (1.0 - S0[i]) * (fmean_real[i]**2)". We multiply by fmean_real**2 because we want to fit the polynomial to the raw intensities, not S(Q). This is how it was done in the original pipeline, but I'm wondering why we aren't just taking the raw intensities directly. This seems like unnecessary compute.
_reconstruct_Salpha_Snot(): I removed a high-mask anchor in this function; this needs to be double-checked. Also, there is a line marked with "!!!!" that seems to do some kind of manual override. I have no idea what it's doing or why it's there, but I think this is in the part that Tieqiong is fixing.
reconstruct_partials_and_pdfs(): I removed the hiQ masks here and replaced polydeg with rpoly. There are also some hiQ stats computed (probably for monitoring), where I just replaced the high-Q mask with a fixed number... It shouldn't influence the refinement.
RefinementConfig: replaced polydeg with rpoly and removed the hiQ mask stuff. Also adjusted Qmin and Qmax, and added a step size of 0.05 for rpoly. We can decide whether that's reasonable.
_peak_params(): I added rpoly. I think we just add rpoly to some dictionary here and make sure we can refine it in the next function.
run_adpdf_with_refinements(): added the rpoly stuff and removed the mask.
In the cell where we actually run the pipeline ("v3 runner") we set Qmax=50.0 for some reason. This does not affect the polynomial order, which defines Qmax as the maximum Q value in the Q-array, but it does affect how the Lorch functions in the S_to_Gr functions are defined.
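For reference, the standard Lorch damping window depends directly on Qmax, which is why Qmax=50.0 changes G(r) even though it leaves the polynomial order alone. A sketch with assumed names (not the pipeline's actual S_to_Gr code):

```python
import numpy as np

def lorch_window(q, qmax):
    """Standard Lorch damping term sin(pi q / qmax) / (pi q / qmax),
    applied to F(Q) before the Fourier transform to suppress
    termination ripples (illustrative sketch)."""
    x = np.pi * q / qmax
    w = np.ones_like(q)          # limit value 1 at q = 0
    nz = x != 0
    w[nz] = np.sin(x[nz]) / x[nz]
    return w
```

The window equals 1 at Q=0 and falls to 0 at Q=Qmax, so choosing Qmax=50.0 damps the measured range far less than a Qmax matched to the data's end point would.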
We might have to tune the objective functions. I had to lower the "smooth_lambda" value in the config file because otherwise the code would sometimes just flatten out the whole S(Q) to reach a perfect score. This is something we need to discuss.
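A toy objective showing why this happens, assuming (this is a guess at the structure, not the pipeline's actual objective) that smooth_lambda weights a squared-difference smoothness penalty: that penalty is exactly zero for a constant curve, so with enough weight the optimizer prefers a flat S(Q) over one that matches the data.

```python
import numpy as np

def objective(S, S_target, smooth_lambda):
    """Toy data-misfit + smoothness objective (illustrative sketch).
    A flat S has zero smoothness penalty, so a large smooth_lambda
    rewards flattening S(Q) regardless of the data."""
    residual = np.sum((S - S_target) ** 2)
    smoothness = np.sum(np.diff(S) ** 2)
    return residual + smooth_lambda * smoothness
```

With a large smooth_lambda, a constant curve scores better than the true (wiggly) S even though it fits the data worse, which matches the flattening behavior described above.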