Need to have a faster implementation of getting deviances for splicing and stability events/features

## Main
In the current pipeline, we fit an intercept-only `glm` with a `binomial` family, modelling m1 as success and m2 as failure. Here is the code we have been using so far.
```r
# Fit an intercept-only binomial GLM for one event and return its deviance.
# Relies on `temp_X`, `m1_indexes`, and `m2_indexes` from the enclosing scope.
.glm_fit <- function(nz_indx, nz_val) {

  # Rebuild the dense count vector from its non-zero entries
  temp_m1_m2 <- rep(0, ncol(temp_X))
  temp_m1_m2[nz_indx] <- nz_val

  fit <- tryCatch({
    glm(cbind(temp_m1_m2[m1_indexes], temp_m1_m2[m2_indexes]) ~ 1, family = "binomial")
  }, error = function(e) {
    message("Error in glm: ", e$message, "; possibly due to non-convergence, using zero as the deviance for this case.")
    return(NULL)
  })
  if (is.null(fit)) {
    return(0)  # Return 0 as the deviance if glm failed
  }
  return(fit$deviance)
}
```
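
Because the model has only an intercept, the deviance should have a closed form, which is what a faster implementation would compute instead of running the iterative fit. As a sketch, assume $y_i$ is the m1 count and $n_i$ is m1 + m2 for cell $i$ (these symbols are introduced here for illustration and do not appear in the code above):

$$
\hat{p} = \frac{\sum_i y_i}{\sum_i n_i},
\qquad
D = 2 \sum_i \left[ y_i \log\frac{y_i}{n_i \hat{p}} + (n_i - y_i) \log\frac{n_i - y_i}{n_i (1 - \hat{p})} \right]
$$

with the convention $0 \log 0 = 0$, so cells with $n_i = 0$ contribute nothing. This should agree with `fit$deviance` up to the convergence tolerance of `glm`, but that is worth confirming in the benchmark rather than assuming.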

## To do
1. Since we only extract `fit$deviance`, the first step is to implement the deviance calculation in a C++ script and benchmark it against the previous results.
2. Then, create an R script and a matching C++ function using `Rcpp` to make getting the highly variable events easier. I suggest a function for splicing and stability.
3. (Optional) If the implementation is successful, we can try to build our own native feature selection for gene expression instead of using Seurat's `FindVariableFeatures` with the `vst` method.
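
As a possible starting point for item 1, here is a minimal plain C++ sketch of the closed-form deviance for an intercept-only binomial model. The function name `binomial_null_deviance` and the helper `dev_term` are hypothetical, and the sketch assumes dense m1 and m2 vectors of equal length. It returns 0 when every count is zero or when all counts fall on one side. That is a choice made for this sketch, not the original fallback, which returns 0 only when `glm` throws an error.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// x * log(x / mu), using the convention 0 * log(0) = 0.
static double dev_term(double x, double mu) {
    return x > 0.0 ? x * std::log(x / mu) : 0.0;
}

// Hypothetical helper (not part of the existing pipeline):
// deviance of an intercept-only binomial model where m1[i] are the
// successes and m2[i] the failures observed in cell i.
double binomial_null_deviance(const std::vector<double>& m1,
                              const std::vector<double>& m2) {
    if (m1.size() != m2.size()) {
        throw std::invalid_argument("m1 and m2 must have the same length");
    }

    // Pooled success probability: the MLE for an intercept-only model.
    double sum_y = 0.0, sum_n = 0.0;
    for (std::size_t i = 0; i < m1.size(); ++i) {
        sum_y += m1[i];
        sum_n += m1[i] + m2[i];
    }

    // Assumption of this sketch: with no counts, or with every count on
    // one side, all terms vanish, so report a deviance of 0.
    if (sum_n <= 0.0 || sum_y <= 0.0 || sum_y >= sum_n) {
        return 0.0;
    }
    const double p = sum_y / sum_n;

    double dev = 0.0;
    for (std::size_t i = 0; i < m1.size(); ++i) {
        const double y = m1[i];
        const double f = m2[i];
        const double n = y + f;
        if (n <= 0.0) continue;  // empty cells contribute nothing
        dev += dev_term(y, n * p) + dev_term(f, n * (1.0 - p));
    }
    return 2.0 * dev;
}
```

For example, `binomial_null_deviance({1.0, 3.0}, {3.0, 1.0})` pools to a probability of 0.5 and sums two non-zero cell terms. To call this from R, the same body could be exposed through `Rcpp` with a `// [[Rcpp::export]]` attribute and `Rcpp::NumericVector` arguments; the benchmark against `fit$deviance` would show whether the two agree on real data.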

Thanks!
