Introduce datatable.unique.names policy for duplicate handling in setnames() #4044 by venom1204 · Pull Request #7647 · Rdatatable/data.table

venom1204 · 2026-02-25T08:55:39Z

closes #4044
This PR introduces a configurable policy for handling duplicate column names created by setnames().

Changes introduced:

Added a new global option datatable.unique.names (default: "off") to preserve backward compatibility.

Supported policies:

"off" – Allow duplicates silently (current behavior).
"warn" – Issue a warning when duplicates are created.
"error" – Stop execution if duplicates would be created.
"rename" – Automatically enforce uniqueness using make.unique().

Added a centralized helper process_name_policy() in utils.R to handle duplicate detection and enforcement.

Integrated the policy check into setnames() before reference updates to ensure keys and indices are not corrupted in "error" or "rename" modes.

Added validation for invalid option values with a warning and safe fallback to "off".
The default behavior remains unchanged, and performance is preserved in the "off" fast path.
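
For readers who want to see the four policies side by side, here is a minimal base-R sketch of the idea. It is not the PR's code: `apply_name_policy()` is a hypothetical stand-in for `process_name_policy()`, it uses `toString()` where the PR uses data.table's internal `brackify()`, and the message wording and the treatment of `NULL` as "off" are assumptions.

```r
# Hypothetical stand-in for the PR's process_name_policy(); base R only.
apply_name_policy = function(names_vec, policy = getOption("datatable.unique.names", "off")) {
  if (is.null(policy)) return(names_vec)  # assumption: NULL is treated like "off"
  valid = c("off", "warn", "error", "rename")
  if (!is.character(policy) || length(policy) != 1L || is.na(policy) || !policy %in% valid) {
    # invalid option value: warn and fall back to the permissive default
    warning("Invalid datatable.unique.names value; falling back to \"off\".", call. = FALSE)
    policy = "off"
  }
  # fast path: nothing to do when the policy is off or there are no duplicates
  if (policy == "off" || !anyDuplicated(names_vec)) return(names_vec)
  dups = unique(names_vec[duplicated(names_vec)])
  msg = paste0("Duplicate column names created: ", toString(dups))
  switch(policy,
    warn   = { warning(msg, call. = FALSE); names_vec },  # keep duplicates, but say so
    error  = stop(msg, call. = FALSE),                    # refuse before anything is modified
    rename = make.unique(names_vec)                       # e.g. "a", "b", "a" becomes "a", "b", "a.1"
  )
}

nm = c("Sepal.Length", "Sepal.Width", "Sepal.Length", "Petal.Width", "Species")
renamed = apply_name_policy(nm, "rename")
```

Running the check on a plain character vector before any by-reference update is what lets the "error" and "rename" modes leave the table untouched or consistently renamed.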

Hi @ben-schwen, when you have time, could you please take a look?
Thanks.

github-actions · 2026-02-25T09:17:46Z

No obvious timing issues in HEAD=issuenight

Generated via commit 88085d1

Download link for the artifact containing the test results: ↓ atime-results.zip

Task	Duration
R setup and installing dependencies	5 minutes and 44 seconds
Installing different package versions	11 minutes and 46 seconds
Running and plotting the test cases	4 minutes and 18 seconds

ben-schwen · 2026-03-04T21:06:20Z

R/data.table.R

    if (!length(new)) return(invisible(x)) # no changes
    if (length(i) != length(new)) internal_error("length(i)!=length(new)") # nocov
  }
-  # update the key if the column name being change is in the key


Please don't delete comments.

ben-schwen · 2026-03-04T21:07:48Z

man/data.table-options.Rd

    \item{\code{datatable.enlist}}{Experimental feature. Default is \code{NULL}. If set to a function
      (e.g., \code{list}), the \code{j} expression can return a \code{list}, which will then
      be "enlisted" into columns in the result.}
+    \item{\code{datatable.unique.names}}{A character string, default \code{"off"}. 


It should somehow state that this currently only applies to setnames() and not to other functions that could create duplicate names, like merge, cbind, ...

ben-schwen · 2026-03-04T21:08:02Z

inst/tests/tests.Rraw

+options(datatable.unique.names = "error")
+test(2366.3, setnames(copy(DT), "Petal.Length", "Sepal.Length"), error = "Duplicate column names created")
+options(datatable.unique.names = "rename")
+test(2366.4, names(setnames(copy(DT), "Petal.Length", "Sepal.Length")), c("Sepal.Length", "Sepal.Width", "Sepal.Length.1", "Petal.Width", "Species"))


missing newline

ben-schwen · 2026-03-04T21:10:12Z

R/utils.R

+
+  if (anyDuplicated(names_vec)) {
+    dups = unique(names_vec[duplicated(names_vec)])
+    msg = sprintf("Duplicate column names created: %s. This may cause ambiguity.", brackify(dups))


Is this problematic for a table like dt = data.table('%s'=1, b=2)?
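
A base-R sketch of the concern, under stated assumptions: in the quoted hunk the duplicate names are passed to `sprintf()` as an argument, which is safe on its own. The risk only appears if the resulting `msg` is later handed to something that interprets it as a format string again. Whether the PR does that is not visible in the quoted lines, so the second `sprintf()` call below is purely illustrative.

```r
dups = "%s"  # a column literally named "%s"

# Safe: the name is data here, not part of the format string.
msg = sprintf("Duplicate column names created: %s.", dups)

# Risky: treating msg itself as a format makes "%s" a placeholder with no argument.
second_pass = tryCatch(sprintf(msg), error = function(e) e)

# Two ways to stay safe: signal msg verbatim, or escape "%" before re-formatting.
verbatim = tryCatch(warning(msg, call. = FALSE), warning = function(w) conditionMessage(w))
escaped  = sprintf(gsub("%", "%%", msg, fixed = TRUE))
```

Here `second_pass` is an error condition, while `verbatim` and `escaped` both carry the message with the literal `%s` intact.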

ben-schwen · 2026-03-04T21:11:07Z

R/utils.R

+process_name_policy = function(names_vec) {
+  policy = getOption("datatable.unique.names", "off")
+
+  if (is.null(policy) || policy == "off") return(names_vec)


On second thought, NULL might be a better default to mean "off".

    # onLoad.R
    datatable.unique.names = NULL # NULL means off
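
A small base-R sketch of how the two defaults interact with `getOption()`. It only shows the lookup behavior and is not taken from the PR. With a `NULL` default, the `is.null(policy)` branch in the quoted helper is what detects "off", whereas supplying `"off"` as the `getOption()` fallback means the lookup never returns `NULL`.

```r
old = options(datatable.unique.names = NULL)  # setting an option to NULL removes it

with_fallback    = getOption("datatable.unique.names", "off")  # fallback is used when unset
without_fallback = getOption("datatable.unique.names")         # NULL when unset

options(old)  # restore whatever was set before
```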

ben-schwen · 2026-03-04T21:12:25Z

inst/tests/tests.Rraw

+#4044
+DT = as.data.table(iris)
+options(datatable.unique.names = "off")
+test(2366.1, names(setnames(copy(DT), "Petal.Length", "Sepal.Length")), c("Sepal.Length", "Sepal.Width", "Sepal.Length", "Petal.Width", "Species"))


We now have the nice options parameter in test(), so it would be good to use it!
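
The benefit of a per-test options argument is that the setting is scoped to one test instead of leaking into later ones, as bare `options()` calls between tests can. The base-R sketch below illustrates that scoping idea only. `with_scoped_option()` is a hypothetical helper and says nothing about how `test()` implements its parameter.

```r
# Hypothetical helper: set options, evaluate expr, then restore the old values.
with_scoped_option = function(opts, expr) {
  old = options(opts)                 # options() returns the previous values
  on.exit(options(old), add = TRUE)   # restored even if expr signals an error
  force(expr)                         # expr is evaluated lazily, after the options are set
}

before = getOption("datatable.unique.names")
inside = with_scoped_option(
  list(datatable.unique.names = "error"),
  getOption("datatable.unique.names")
)
after = getOption("datatable.unique.names")
```

After the call, `after` matches `before`, so a following test sees the same option state as if this one had never run.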

ben-schwen · 2026-03-04T21:14:56Z

There are also some linter issues that need to be taken care of.

We also have check_duplicate_names in utils.R, so it might be worth checking whether we can unify both into a single function...

venom1204 added 2 commits February 16, 2026 01:37

changes applied

ca5b8c9

added news and doc

8699316

venom1204 requested a review from MichaelChirico as a code owner February 25, 2026 08:55

Merge branch 'master' into issuenight

88085d1

ben-schwen reviewed Mar 4, 2026

venom1204 wants to merge 3 commits into master from issuenight
