unique(DT) when there are no dups could be much faster

(The new default of using all columns brings this to the fore.)

```
DT = data.table(A=1:3, B=4:6)
DT
   A B
1: 1 4
2: 2 5
3: 3 6
debug(duplicated.data.table)
debug(unique.data.table)
unique(DT)

/duplicated.R#22:
Browse[3]> o
integer(0)
attr(,"starts")
[1] 1 2 3
attr(,"maxgrpn")
[1] 1
```
At this point maxgrpn is 1, so it already knows that DT is unique and could return it, or a shallow copy, straight away. But it doesn't. It carries on: the all-FALSE duplicated vector is turned into 1:nrow(DT), and then every column is subset by that 1:nrow(DT).
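A rough sketch of the early return, for illustration only. unique_sketch is a hypothetical helper, not the actual unique.data.table code. It assumes the internal (unexported) forderv accepts by, sort and retGrp as written here, which may differ between data.table versions, and it uses copy() as a stand-in for a shallow copy.

```r
library(data.table)

# Hypothetical helper, not part of data.table: skip the subset step when the
# grouping result already shows that every group contains exactly one row.
unique_sketch <- function(x) {
  # ASSUMPTION: forderv is internal; its arguments may change across versions.
  o <- data.table:::forderv(x, by = names(x), sort = FALSE, retGrp = TRUE)
  if (attr(o, "maxgrpn") == 1L) {
    # No duplicates: return early. copy() is a deep copy, used here only as a
    # placeholder for the shallow copy suggested in this issue.
    return(copy(x))
  }
  unique(x)  # duplicates exist: fall back to the regular path
}

DT  <- data.table(A = 1:3, B = 4:6)
res <- unique_sketch(DT)

DT_dup  <- data.table(A = c(1L, 1L, 2L), B = c(4L, 4L, 5L))
res_dup <- unique_sketch(DT_dup)
```

With the DT above, the early-return branch is taken; with DT_dup, the call falls through to unique().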

We should also time forderv to check that it short-circuits correctly once the first few columns have resolved all ties. In this example forderv should not touch B at all, because A alone is enough to establish uniqueness.
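One way such a timing check might look, as a sketch only. It again assumes the internal forderv signature used below, and the table size is arbitrary. If the short-circuit works, grouping by both columns should cost roughly the same as grouping by A alone; no expected timings are claimed here.

```r
library(data.table)

n  <- 1000000L
DT <- data.table(A = sample(n), B = sample(n))  # A alone is already unique

# ASSUMPTION: forderv is internal; arguments may differ across versions.
t_A  <- system.time(
  oA  <- data.table:::forderv(DT, by = "A", sort = FALSE, retGrp = TRUE)
)
t_AB <- system.time(
  oAB <- data.table:::forderv(DT, by = c("A", "B"), sort = FALSE, retGrp = TRUE)
)

# Elapsed seconds for each call; similar values would suggest B was skipped.
print(c(A = t_A[["elapsed"]], AB = t_AB[["elapsed"]]))
```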
