-
Notifications
You must be signed in to change notification settings - Fork 4
CID mode segfaults during TBR after cpp-search merge #228
Description
Summary
After merging cpp-search into feature/cid-consensus (commit 7cce43652), CID consensus search segfaults (exit code 139) during the TBR phase of driven_search(). Reproducible with as few as 2 input trees and 12 tips.
Reproducer
library(TreeSearch)
library(TreeTools)
data(inapplicable.phyData)
dat <- inapplicable.phyData[["Vinther2008"]]
trees <- MaximizeParsimony(dat, maxReplicates = 2, maxSeconds = 3, verbosity = 0)
CIDConsensus(trees, maxReplicates = 1, maxSeconds = 5, verbosity = 2, nThreads = 1)
# Output:
# Replicate 1/1
# wag_rand+NNI tree score: -10.261 [0 ms]
# [exit code 139]Root cause (partial)
Two issues were found and fixed in commit fc19747d7:
-
RcppExports.cpp truncation: Git auto-merge inserted
ts_wagner_bias_benchandts_test_strategy_trackerinside thets_cid_consensusfunction body (betweenRcpp::wrap()and the missingreturn/END_RCPP/}). Fixed by restoring the closing lines. -
NNI warmup segfault:
nni_first = true(new default from cpp-search) causednni_search()to run on CID datasets.nni_search()callsscore_tree()which for CID returns the CID score without populating Fitch state arrays, then usesfitch_incremental_downpass()on uninitialisedprelim/local_costarrays. Fixed by adding&& (ds.scoring_mode != ScoringMode::CID)to thenni_wagnerguard ints_driven.cpp.
Remaining crash
After both fixes, the segfault persists. The crash occurs after Wagner tree construction succeeds (a score is printed) but before TBR reports its score. The TBR entry point calls full_rescore(tree, ds) which does:
tree.reset_states(ds)— clears arrays, reloads tip statesscore_tree(tree, ds)— for CID, callscid_score()→compute_splits_cid()fitch_score(tree, ds)— populates Fitch state arrays for MRP screening
The crash is in one of these three steps. cid_score() accesses cd.cand_tip_bits, cd.cand_buf, and cd.lap_scratch, all allocated by prepare_cid_data(). The MRP DataSet from build_mrp_dataset() is structurally valid (Wagner + NNI succeed on it).
Investigation notes
- The CID tests passed on the previous merge base (
f97824eaa), so this is a regression from the cpp-search merge. - Only 2 cpp-search commits touched
ts_tbr.cppsince the last merge:ae5431fa9(removed dead function) and62658709d(constraint post-hoc validation) — neither should affect CID. - Attempted to add
Rprintfdebug prints but Windows sed/awk writes literal newlines instead of\nin C strings, and the R session has TreeSearch DLL loaded causing Rscript crashes. A debug build with working prints would pinpoint the exact failing line. - The file has CRLF line endings (Windows worktree) vs LF on the branch — this caused
diffto show the entire file as changed but is cosmetic.
Suggested next steps
- Build a debug version with
Rprintfmarkers aroundreset_states,cid_score, andfitch_scoreinsidefull_rescore()(ints_tbr.cppline 36) to pinpoint which call segfaults. - Check whether
compute_splits_cid()is accessing out-of-bounds memory (it usestree.postorder,tree.left[ni],tree.right[ni]— all should be valid post-Wagner). - Check
lap_scratchsizing:ensure(cd.max_splits)sizes for max input-tree splits, but the candidate tree may produce more splits than any input tree.