Skip to content

paumoal/CostCurves-R_package

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 

Repository files navigation

#                                CostCurves R-Package

***
__Author:__ Paulina Morillo

__e-mail:__ paumoal@inf.upv.es

******************************************************************

##Description

Tools for visualization and comparison of cost curves and the choice of threshold. Given predictions and class labels.
ROC curve can be plotted and TPR (True positives rate),  FPR (FP False positives rate) and AUCC (area under the curve costs) can also be computed.

Available in github: 
https://github.com/paumoal/CostCurves-R_package.git

***

##Installation

For linux download: "CostCurves_0.1.0.tar"


```{r}
#install.packages("./CostCurves_0.1.0.tar", repos = NULL)
```


For windows download: "CostCurves_0.1.0"
```{r}
#install.packages("./CostCurves_0.1.0")
```

***
##Get Started
### Enable Library
```{r}
library(CostCurves)
```
###Load data
```{r}
data("predictions")
print(predictions)

classes<-predictions$classes
A<- predictions$A
B<- predictions$B
C<- predictions$B
```
###Examples


####__Cost Lines__

```{r}

#names legend
CostLines(list(A, B), list(classes, classes), col=c("blue", "red"), lty=c(1, 2), namesClassifiers = c("A","B"))

#Loss by cost and LegendOFF
CostLines(list(A, B, rnorm(10)), list(classes), uniquec=TRUE, loss2skew = TRUE, lty=5)

```

####__Test Optimal treshold choice method__

```{r}
#Loss by cost
R<-TestOptimal(list(A, B), list(classes), uniquec=TRUE, pointsOFF = FALSE)

#Loss by skew
R<-TestOptimal(list(A, B), list(classes), uniquec=TRUE,	loss2skew = TRUE,  pointsOFF=FALSE)

```

####__Train Optimal treshold choice method__

```{r}
#Loss by cost
R<-TrainOptimal(list(A, B), list(classes), list(A, B), list(classes),  pointsOFF=FALSE, uniquect = TRUE, uniquecT = TRUE, loss2skew = TRUE, refuseT = TRUE)

#Loss by skew
R<-TrainOptimal(list(A, B), list(classes),	list(B, A), list(classes), uniquect = TRUE,	namesClassifiers=c("A", "B"), namesTests=c("B","A"), uniquecT = TRUE)

```

####__Brier Curves (score driven treshold choice method)__

```{r}
#Loss by cost
R<-BrierCurves(list(A, B), list(classes),	uniquec = TRUE, pointsOFF=FALSE)

#Loss by skew
R<-BrierCurves(list(A, B), list((1-classes), classes), loss2skew = TRUE, gridOFF = FALSE, pointsOFF=FALSE, main=NULL)

```

####__Kendall Curves (kendall treshold choice method)__

```{r}

#Loss by cost
R<-KendallCurves(list(A, B), list(classes), uniquec=TRUE,  pointsOFF=FALSE, main="Kendall Curves")

#Loss by skew
R<-KendallCurves(list(A, B), list(classes), uniquec=TRUE, loss2skew = TRUE, pointsOFF=FALSE)

```

####__Rate Driven Curves (Rate Driven treshold choice method)__

```{r}
#Loss by cost
R<-RateDrivenCurves(list(A, B, rnorm(10)),list(classes),  pointsOFF=FALSE, uniquec=TRUE)

#Loss by skew
R<-RateDrivenCurves(list((1-C), B, rnorm(10)), list(classes),  pointsOFF=FALSE, uniquec=TRUE,	loss2skew = TRUE)

```

####__Full Options to plotting Cost Curves (CostCurves function)__

```{r}
R<-CostCurves(list(A, B), list(classes), cost_lines = FALSE, uniquec = TRUE,	train_optimal = TRUE,	predictionsT =  list(B, A), classesT = list(classes), uniquecT = TRUE, rate_driven = TRUE)

R<-CostCurves(list(A, B), list(classes), cost_lines = FALSE, loss2skew = TRUE, uniquec = TRUE, train_optimal = TRUE, predictionsT =  list(B), classesT = list(classes), uniqueTrain = TRUE, kendall_curves = TRUE, rate_driven = TRUE)

```

####__ROC Curves__

```{r}
#Example1
R<-RocCurves(list((1-A), A, B), list(classes, classes, (1-classes)), gridOFF=FALSE, pointsOFF=FALSE)

#Example2
R<-RocCurves(list(B, A), list(classes), gridOFF=FALSE,
	positiveY = TRUE, uniquec = TRUE, ylab="TP", xlab="FP")


```
***

##Description of features

###Functions to plot Cost Curves
* CostLines
* TestOptimal
* TrainOptimal
* BrierCurves
* KendallCurves
* RateDrivenCurves
* CostCurves
* RocCurves

###Graphics Options settings
Parameter          | Description
-------------------|------------------------
hold               | The view is maintained to plot a new curve above the current curve.
plotOFF            | Allow turn off plot.
gridOFF            | Allow turn off grid.
pointsOFF          | Allow turn off point marks.
legendOFF          | Allow turn off legend.
main               | It is used to specify a different title to default title.
xlab               | It is used to specify a different x label to default x label.
ylab               | It is used to specify a different y label to default y label.
namesClassifiers   | An array with names of each classifier.
lwd                | Line width. (array or single value).
lty                | Line type. (array or single value).
col                | Line color. (array or list of arrays).
pch                | Point type. (array or single value).
cex                | Size point. (array or single value).
xPosLegend         | x coordinate to be used to position the legend.
yPosLegend         | y coordinate to be used to position the legend.
cexL               | Size Lengend.

###General Options settings
Enabled for all graphical functions:

Parameter          | Description
-------------------|------------------------
predictions        | A list of predictions or scores arrays.
classes            | A list of classes arrays, each array represent binary classes.
loss2Skew          | If it is TRUE, loss by Skew is plotted otherwise loss by cost is plotted.
uniquec            | If it is TRUE, the same array classes is used for each array in a list predictions.

###Specific Options settings
* Using only in CostCurves function:

Parameter          | Description
-------------------|------------------------
cost-lines         | To plot cost curves.
test_optimal       | To plot cost curves using test optimal treshold choice method.
train_optimal   | To plot cost curves using train optimal treshold choice method. 
predictionsT       | A list of classes arrays, each array represent predictions or scores of train classifier. -Only for train optimal option-
classesT           | A list of classes arrays, each array represent binary classes train. -Only for train optimal option-
uniquecT           | If it is TRUE, the same array classes is used for each array in a list predictions train. -Only for train optimal option-
uniqueTrain     | If it is TRUE, the same array of predictionsT and classesT is used for each array in a list predictions. -Only for train optimal- option_
score_driven       | To plot cost curves  using score driven treshold choice method.
rate_driven        | To plot cost curves using rate driven treshold choice method.
kendall_curves     | To plot cost curves using kendall driven treshold choice method.

* Using only in TrainOptimal function:

Parameter               | Description
------------------------|------------------------
predictions_train   | A list of predictions arrays, each array represent scores of a specific classifier using train.
classes_train        | A list of classes arrays, each array represent binary classes of train.
predictions_test        | A list of predictions arrays, each array represent scores test.
classes_test            | A list of classes arrays, each array represent binary classes test.
uniquecT                | If it is TRUE, the same array classes is used for each array in a list predictions train.
uniquect                | If it is TRUE, the same array classes is used for each array in a list predictions test.
refuseT	                | It is possible to use the same train classifier for every test.

***

##References:

* Drummond, C., & Holte, R. C. (2006). Cost curves: An improved method for visualizing classifier performance.

* Fawcett, T. (2006). An introduction to ROC analysis. Pattern recognition letters, 27(8), 861-874.

* Ferri, C., Hernandez-orallo, J., & Flach, P. A. (2011). Brier curves: a new cost-based visualisation of classifier performance. In Proceedings of the 28th International Conference on Machine Learning (ICML-11) (pp. 585-592).

* Hernandez-Orallo, J., Flach, P., & Ferri, C. (2013). ROC curves in cost space. Machine learning, 93(1), 71-91.

***

About

Tools for visualization and comparison of cost curves and the choice of threshold. Given predictions and class labels . ROC curve can be plotted and TPR (True positives rate), FPR (FP False positives rate) and AUCC (area under the curve costs) can also be computed.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages