Consider zero-length vector for := no recycling #3386

@renkun-ken

I believe it is a reasonable change that := no longer recycles length>1 RHS vectors (#3310).

However, the following common usage patterns may break.

For example, suppose we have a data.table of quarterly prices for each symbol and year. The following creates a column holding the fourth-quarter price of each year:

dt[, last_quarter_price := price[quarter == 4L], by = .(symbol, year)]

For stocks that were delisted, there may be no fourth-quarter data in their final year, so price[quarter == 4L] can return a zero-length numeric vector. With the newer, stricter recycling behavior, this results in an error.

To handle this, I have to change the code to the following:

dt[, last_quarter_price := price[quarter == 4L][1L], by = .(symbol, year)]
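
To illustrate the workaround, here is a minimal sketch on made-up data. The table contents and the symbols "A" and "B" are assumptions for illustration only; symbol "B" stands in for a delisted stock with no fourth-quarter row.

```r
library(data.table)

# Hypothetical data: symbol "B" has no fourth quarter in 2018.
dt <- data.table(
  symbol  = c("A", "A", "A", "A", "B", "B"),
  year    = 2018L,
  quarter = c(1L, 2L, 3L, 4L, 1L, 2L),
  price   = c(10, 11, 12, 13, 20, 19)
)

# For group "B", price[quarter == 4L] is zero-length. Indexing it with [1L]
# yields a single NA_real_, and := still recycles length-one values.
dt[, last_quarter_price := price[quarter == 4L][1L], by = .(symbol, year)]

dt$last_quarter_price
# 13 13 13 13 NA NA
```

Because the RHS is always length one here, this form should not depend on how := treats zero-length vectors.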

A similar use case is as follows:

dt[, first_price := first(price[volume > 0]), by = symbol]

For some symbols, price[volume > 0] may be a zero-length numeric vector, and first(<zero-length vector>) also returns a zero-length vector. In this case first_price should be NA, so I have to change the code to the following:

dt[, first_price := price[volume > 0][1L], by = symbol]
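
The difference between the two forms can be checked in isolation. In this small sketch, the variable `empty` is a hypothetical stand-in for a price[volume > 0] that matched no rows.

```r
library(data.table)

# Hypothetical stand-in for price[volume > 0] when no row has volume > 0.
empty <- numeric(0)

length(first(empty))  # 0: first() passes a zero-length vector through
length(empty[1L])     # 1: an out-of-range index yields one element
is.na(empty[1L])      # TRUE: that element is NA
```

So first() keeps the RHS at length zero, while [1L] always produces a length-one value that := can recycle.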

In both cases, the length of the resulting vector must be zero or one. Length-one results are recycled as before, but the zero-length cases now break. I'm not sure whether it makes sense for := <zero-length vector> to fill in a missing value automatically; otherwise, I need to rework all such cases so that they get NA as before.
