2 changes: 2 additions & 0 deletions NEWS.md
@@ -40,6 +40,8 @@

 5. Non-equi joins combining an equality condition with two inequality conditions on the same column (e.g., `on = .(id == id, val >= lo, val <= hi)`) no longer error, [#7641](https://github.com/Rdatatable/data.table/issues/7641). The internal `chmatchdup` remapping of duplicate `rightcols` was overwriting the original column indices, causing downstream code to reference non-existent columns. Thanks @tarun-t for the report and fix, and @aitap for the diagnosis.

+6. By-reference sub-assignments of strings to factor columns now _actually_ match the levels in UTF-8 when required and now don't result in invalid factors being created, [#7648](https://github.com/Rdatatable/data.table/issues/7648), amending a previous incomplete fix to [#6886](https://github.com/Rdatatable/data.table/issues/6886) in v1.17.2. Thanks @BASS-JN for the report and @aitap for the fix.
+
 ### Notes

 1. {data.table} now depends on R 3.5.0 (2018).
6 changes: 4 additions & 2 deletions inst/tests/tests.Rraw
@@ -20671,9 +20671,11 @@ DT = data.table(factor(rep("\uf8", 3)))
 # identical() to V1's only level but stored in a different CHARSXP
 samelevel = iconv(levels(DT$V1), from = "UTF-8", to = "latin1")
 DT[1, V1 := samelevel]
-test(2311.1, nlevels(DT$V1), 1L) # used to be 2
+# used to fail to look up the new level, resulting in an invalid factor, #7648
+test(2311.1, as.integer(DT$V1), rep(1L, 3))
+test(2311.2, nlevels(DT$V1), 1L) # used to be 2
 DT[1, V1 := factor("a", levels = c("a", samelevel))]
-test(2311.2, nlevels(DT$V1), 2L) # used to be 3
+test(2311.3, nlevels(DT$V1), 2L) # used to be 3

 # avoid translateChar*() in OpenMP threads, #6883
 DF = list(rep(iconv("\uf8", from = "UTF-8", to = "latin1"), 2e6))
4 changes: 2 additions & 2 deletions src/assign.c
@@ -806,9 +806,9 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con
 newSourceD[i] = val==NA_INTEGER ? NA_INTEGER : -hash_lookup(marks, sourceLevelsD[val-1], 0); // retains NA factor levels here via TL(NA_STRING); e.g. ordered factor
 }
 } else {
-const SEXP *sourceD = STRING_PTR_RO(source);
 for (int i=0; i<nSource; ++i) { // convert source integers to refer to target levels
-const SEXP val = sourceD[i];
+// for character input, sourceLevelsD corresponds to the source vector pre-converted to UTF-8
+const SEXP val = sourceLevelsD[i];
 newSourceD[i] = val==NA_STRING ? NA_INTEGER : -hash_lookup(marks, val, 0);
 }
 }