Slow .SD[.N] compared to last(.SD) with groupby #4809

@matthewgson

I noticed a significant speed difference between
DT[, .SD[.N], by=col] and
DT[, last(.SD), by=col]

for relatively large data (10 * 1.8M rows).

However, DT[, .SD[1], by=col] and DT[, first(.SD), by=col] showed no difference in performance.
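A minimal sketch of how the comparison could be reproduced is given here, since the original timings were posted only as screenshots. The table size, the column names x and y, and the number of groups are assumptions chosen for illustration and are much smaller than the data in the report, so absolute timings will differ.

```r
library(data.table)

set.seed(1)
n <- 1e5L  # assumption: far smaller than the 1.8M rows mentioned above

# Hypothetical data: one integer grouping column and two value columns
DT <- data.table(
  col = sample(n %/% 10L, n, replace = TRUE),  # roughly 10 rows per group
  x   = rnorm(n),
  y   = runif(n)
)

# Time each of the four expressions once; results are kept for comparison
timings <- list(
  sd_N  = system.time(r1 <- DT[, .SD[.N],    by = col]),
  last  = system.time(r2 <- DT[, last(.SD),  by = col]),
  sd_1  = system.time(r3 <- DT[, .SD[1],     by = col]),
  first = system.time(r4 <- DT[, first(.SD), by = col])
)
print(sapply(timings, `[[`, "elapsed"))

# Both spellings should return the same rows; only the speed is in question
stopifnot(isTRUE(all.equal(r1, r2)), isTRUE(all.equal(r3, r4)))
```

Adding verbose = TRUE to any of these calls, for example DT[, .SD[.N], by = col, verbose = TRUE], makes data.table print whether the j expression was optimized by GForce, which may help explain any gap between the two forms.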

Labels: GForce (issues relating to optimized grouping calculations)
