Replace `dplyr` grouping with fast data.table implementation, with optional strand grouping

## Description
We need to replace the **`dplyr`**-based code for grouping and mutation with a faster **`data.table`** version.  
The goal is to replicate the behavior of the following `dplyr` pipeline:

```r
library(dplyr)
library(data.table)

temp_eventdata_grouped <- temp_eventdata %>% 
  # Number each start_cor_id group and record its size
  group_by(start_cor_id) %>% 
  mutate(start_cor_group_id = cur_group_id(),
         start_cor_group_count = n()) %>% 
  ungroup() %>% 
  # Same for end_cor_id
  group_by(end_cor_id) %>% 
  mutate(end_cor_group_id = cur_group_id(),
         end_cor_group_count = n()) %>%
  # Conversion drops the dplyr grouping
  as.data.table()
```

### Additional functionality
- Add **optional support** for grouping by **DNA strand** (`strand` column) alongside `start_cor_id` and `end_cor_id` when specified by the user.
- When strand information is used, each group must be defined by the combination of ID and strand, not by ID alone.

## Requirements
- Rewrite the above pipeline entirely using `data.table` for better performance.
- Keep the output structure compatible with downstream code (i.e., a `data.table` with the same new columns).
- Add an argument like `use_strand = TRUE/FALSE` to toggle whether the DNA strand should be included in grouping keys.
- Preserve the same column names:
  - `start_cor_group_id`
  - `start_cor_group_count`
  - `end_cor_group_id`
  - `end_cor_group_count`
- Avoid unnecessary copying of the data.
- When `use_strand = TRUE` but the `strand` column is missing, do one of the following:
  - Fall back gracefully (ignore strand), or
  - Raise an explicit error with a clear message.
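
A minimal sketch of how these requirements might fit together. The function name follows the deliverable file name but is otherwise an assumption, as are the choice of an explicit error for a missing `strand` column and the check for the two ID columns. Note that `.GRP` with `by` numbers groups in order of first appearance, so the ids may be labeled differently from `cur_group_id()` even though the groups themselves are the same.

```r
library(data.table)

# Hypothetical helper: name, defaults, and error behavior are assumptions.
group_eventdata_datatable <- function(dt, use_strand = FALSE) {
  # as.data.table() copies; pass a data.table to get in-place updates.
  if (!is.data.table(dt)) dt <- as.data.table(dt)

  # Explicit-error variant of the missing-column requirement.
  required <- c("start_cor_id", "end_cor_id", if (use_strand) "strand")
  missing_cols <- setdiff(required, names(dt))
  if (length(missing_cols) > 0L) {
    stop("Missing required column(s): ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }

  # Grouping keys: ID alone, or ID plus strand.
  start_keys <- c("start_cor_id", if (use_strand) "strand")
  end_keys   <- c("end_cor_id",   if (use_strand) "strand")

  # := adds the columns by reference, so no copy of the table is made here.
  dt[, c("start_cor_group_id", "start_cor_group_count") := .(.GRP, .N),
     by = start_keys]
  dt[, c("end_cor_group_id", "end_cor_group_count") := .(.GRP, .N),
     by = end_keys]

  dt[]
}
```

Because `:=` modifies a `data.table` input in place, a caller who needs the original untouched could pass `copy(temp_eventdata)` instead.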

## Deliverables
- `group_eventdata_datatable.R` (main function or utility script)
- Updated documentation/comments explaining the difference between strand-aware and strand-agnostic grouping
- Unit tests to check correctness of both `use_strand = TRUE` and `use_strand = FALSE` modes
- Benchmarks comparing runtime between `dplyr` and `data.table` versions (optional)
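
For the unit tests, one possible approach is to compare groupings rather than raw id values, since two implementations can number the same groups differently. The sketch below uses base `stopifnot()` for brevity; `same_partition()` is a hypothetical helper, and the toy table is made up for illustration.

```r
library(data.table)

# Hypothetical helper: TRUE when two id vectors define the same groups,
# regardless of how the groups are labeled.
same_partition <- function(a, b) {
  length(a) == length(b) && all(match(a, a) == match(b, b))
}

dt <- data.table(
  start_cor_id = c("s2", "s1", "s2", "s1"),
  strand       = c("+",  "+",  "-",  "+")
)

# Strand-agnostic ids: one group per start_cor_id.
dt[, id_plain := .GRP, by = start_cor_id]
# Strand-aware ids: one group per (start_cor_id, strand) combination.
dt[, id_strand := .GRP, by = .(start_cor_id, strand)]

# Rows 1 and 3 share an id without strand, but are split once strand is used.
stopifnot(same_partition(dt$id_plain,  c(10, 20, 10, 20)))
stopifnot(same_partition(dt$id_strand, c(1, 2, 3, 2)))
stopifnot(!same_partition(dt$id_plain, dt$id_strand))
```

The same helper could be used to check a `data.table` result against the `dplyr` reference output column by column.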

### Notes
- When `use_strand = TRUE`, grouping keys will be:
  - `start_cor_id + strand` for `start_cor_group_id`
  - `end_cor_id + strand` for `end_cor_group_id`
- When `use_strand = FALSE`, grouping uses only `start_cor_id` (for the start groups) and `end_cor_id` (for the end groups).
- `data.table` usage should leverage `.GRP` and `.N` efficiently:
  - `.GRP` gives the current group number
  - `.N` gives the size of the current group
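
A small illustration of `.GRP` and `.N` on a made-up table. One detail worth checking against downstream code: with `by`, `.GRP` numbers groups in order of first appearance, whereas `dplyr::cur_group_id()` numbers them in sorted key order. If the sorted numbering matters, a dense rank via `data.table::frank()` is one possible substitute; whether that is needed here is an assumption.

```r
library(data.table)

dt <- data.table(start_cor_id = c("b", "a", "b", "c"))

# .GRP: group index in order of first appearance ("b" is seen first).
dt[, grp_first_seen := .GRP, by = start_cor_id]   # 1, 2, 1, 3

# .N: number of rows in the current group.
dt[, grp_count := .N, by = start_cor_id]          # 2, 1, 2, 1

# Dense rank over the sorted key, mirroring sorted group numbering.
dt[, grp_sorted := frank(start_cor_id, ties.method = "dense")]  # 2, 1, 2, 3

print(dt)
```

Both numberings describe the same groups; only the labels differ.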

### Priorities
- **High**: Correctness of grouping logic, especially with optional `strand` handling.
- **High**: Full replacement of `dplyr` with `data.table` to improve speed.
- **Medium**: Defensive coding for missing or incorrectly typed `strand` column.
- **Low**: Benchmarks and performance comparison.
