Huge (High-Dimensional Undirected Graph Estimation) is a general project for sparse graphical model estimation and inference in high dimensions. The core algorithm is implemented in C++ with Rcpp for portable high performance linear algebra.
This repository provides two package variants:
- R package: `huge` (native R interface, available on CRAN)
- Python package: `pyhuge` (native Python interface with shared C++ core)
Both variants share the same C++ core and target the same modeling pipeline, including graph estimation, model selection, and inferential analysis.
- R version (`huge`): see the sections below for prerequisites and installation.
- Python version (`pyhuge`): see Python Package (pyhuge).
Huge uses OpenMP to enable faster matrix multiplication, so the compiler used to build huge must have OpenMP enabled.
On Windows and Linux, recent versions of GCC fully support OpenMP.
On macOS, things are a little trickier because the default llvm shipped with macOS does not support OpenMP. The fix is straightforward: install an llvm build with full OpenMP support and direct R to use it.
First, install llvm with OpenMP support by typing
brew install llvm
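Then append the following lines to ~/.R/Makevars so that R compiles packages with the OpenMP-enabled llvm. The paths below assume the compilers are installed as /usr/local/bin/clang-omp and /usr/local/bin/clang-omp++; adjust them to match your installation.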
CC = /usr/local/bin/clang-omp
CXX = /usr/local/bin/clang-omp++
CXX98 = /usr/local/bin/clang-omp++
CXX11 = /usr/local/bin/clang-omp++
CXX14 = /usr/local/bin/clang-omp++
CXX17 = /usr/local/bin/clang-omp++
OBJC = /usr/local/bin/clang-omp
OBJCXX = /usr/local/bin/clang-omp++
To install the development version from GitHub, you first need the devtools package, which is available from CRAN. Start R and type
install.packages("devtools")
Then load the devtools package and install huge
library(devtools)
install_github("Gatech-Flash/huge")
library(huge)
Windows users: if you encounter an Rtools version issue, (1) make sure the latest Rtools is installed, and (2) try the following code
assignInNamespace("version_info", c(devtools:::version_info, list("3.5" = list(version_min = "3.3.0", version_max = "99.99.99", path = "bin"))), "devtools")
In most cases you can simply install huge from CRAN and load it in an R console:
install.packages("huge")
library(huge)
This repository includes a native Python package under python-package/.
It shares the same C++ core as the R package for portable high performance.
- `python-package/README.md`
- `python-package/docs/`
- `python-package/examples/`
git clone https://github.com/Gatech-Flash/huge.git
cd huge/python-package
pip install -e .
python -c "import pyhuge; print(pyhuge.test())"
Optional extras:
pip install -e ".[viz]" # matplotlib + networkx
pip install -e ".[dev]" # tests + docs + release tooling
- Docs site: https://tourzhao.github.io/huge/
- Python tests workflow: `.github/workflows/python-wrapper-tests.yml`
- Python docs workflow: `.github/workflows/python-package-docs.yml`
- Python release workflow: `.github/workflows/python-package-release.yml`
The Python API covers:
- Estimation: `huge`, `huge_mb`, `huge_glasso`, `huge_ct`, `huge_tiger`
- Selection/preprocessing: `huge_select`, `huge_npn`
- Simulation/inference/ROC: `huge_generator`, `huge_inference`, `huge_roc`
- Utility/plots: `huge_summary`, `huge_select_summary`, `huge_plot_*`, `huge_plot_network`
#generate data
L = huge.generator(n = 50, d = 12, graph = "hub", g = 4)
#graph path estimation using glasso
est = huge(L$data, method = "glasso")
plot(est)
#inference of Gaussian graphical model at 0.05 significance level
T = est$icov[[10]]
inf = huge.inference(L$data, T, L$theta)
print(inf$error) # print out type-I error
For the detailed implementation of the experiments, see benchmark/benchmark.R.
We compared huge with two other packages, QUIC and clime, on a hub graph with n = 200 and d = 200. In this benchmark, huge glasso is the fastest method (1.12 s), ahead of the CRAN 1.2.7 release of huge (1.80 s), QUIC (7.50 s), and clime (416.77 s); huge tiger (1.88 s) is on par with the CRAN release. We also report the objective value (likelihood) attained by each method.
| Method | CPU time (s) |
|---|---|
| Huge glasso | 1.12 |
| Huge tiger | 1.88 |
| Huge (CRAN 1.2.7) | 1.80 |
| QUIC | 7.50 |
| Clime | 416.77 |

| Method | Objective value |
|---|---|
| Huge glasso | -125.96 |
| Huge tiger | -125.47 |
| QUIC | -90.58 |
| Clime | -136.96 |
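
For readers who want to see what a likelihood-based objective value looks like in code, here is a minimal sketch in Python with NumPy. It assumes the common penalized Gaussian log-likelihood form, log det(Theta) - tr(S Theta) - lam * ||Theta||_1 on the off-diagonal entries; the exact objective and scaling used in benchmark/benchmark.R are not stated here, so the numbers in the table need not match this formula. The function name `gaussian_objective` is hypothetical and is not part of huge or pyhuge.

```python
import numpy as np


def gaussian_objective(S, Theta, lam=0.0):
    """Assumed objective: log det(Theta) - tr(S Theta) - lam * off-diagonal L1 norm.

    S     : sample covariance (or correlation) matrix, shape (d, d)
    Theta : estimated precision matrix, shape (d, d)
    lam   : non-negative penalty weight (0 gives the plain log-likelihood term)
    """
    S = np.asarray(S, dtype=float)
    Theta = np.asarray(Theta, dtype=float)
    if S.shape != Theta.shape or S.ndim != 2 or S.shape[0] != S.shape[1]:
        raise ValueError("S and Theta must be square matrices of the same shape")

    # slogdet is numerically safer than log(det(...)) in high dimensions.
    sign, logdet = np.linalg.slogdet(Theta)
    if sign <= 0:
        # A non-positive determinant rules out positive definiteness
        # (this is a necessary check only, not a full test).
        raise ValueError("Theta has a non-positive determinant")

    # L1 norm of the off-diagonal entries only.
    off_diag_l1 = np.abs(Theta).sum() - np.abs(np.diag(Theta)).sum()
    return logdet - np.trace(S @ Theta) - lam * off_diag_l1


# With S = I and Theta = I in 3 dimensions: log det = 0, trace = 3.
print(gaussian_objective(np.eye(3), np.eye(3)))  # -3.0
```

Larger values indicate a better fit under this assumed form, which is why the objective is useful next to the timing comparison: a fast method that reaches a much lower value has not solved the same problem as well.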
Under the Gaussian graphical model, huge controls the type I error well: in every setting reported below, the empirical type I error stays below the nominal significance level.
| Graph | Significance level | Type I error |
|---|---|---|
| band | 0.05 | 0.0175 |
| band | 0.10 | 0.0391 |
| hub | 0.05 | 0.0347 |
| hub | 0.10 | 0.0669 |
| scale-free | 0.05 | 0.0485 |
| scale-free | 0.10 | 0.0854 |
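
To make the reported quantity concrete, the sketch below computes an empirical type I error in Python with NumPy. It assumes the conventional definition: among node pairs that are not connected in the true graph, the fraction for which the test rejects the null hypothesis of no edge. Whether huge uses exactly this normalization is an assumption, and the function name `empirical_type1_error` is hypothetical and not part of huge or pyhuge.

```python
import numpy as np


def empirical_type1_error(true_adj, rejected):
    """Fraction of true non-edges that were (wrongly) declared edges.

    true_adj : symmetric boolean matrix, True where the true graph has an edge
    rejected : symmetric boolean matrix, True where the test rejected "no edge"
    Only the strict upper triangle is used, so each pair is counted once.
    """
    true_adj = np.asarray(true_adj, dtype=bool)
    rejected = np.asarray(rejected, dtype=bool)
    if (true_adj.shape != rejected.shape or true_adj.ndim != 2
            or true_adj.shape[0] != true_adj.shape[1]):
        raise ValueError("inputs must be square matrices of the same shape")

    upper = np.triu_indices(true_adj.shape[0], k=1)
    null_pairs = ~true_adj[upper]          # pairs with no true edge
    n_null = int(null_pairs.sum())
    if n_null == 0:
        return float("nan")                # no true nulls, error undefined

    false_rejections = int((rejected[upper] & null_pairs).sum())
    return false_rejections / n_null


# Toy example: 4 nodes, true edges (0,1) and (2,3), so 4 of the 6 pairs are null.
true_adj = np.zeros((4, 4), dtype=bool)
for i, j in [(0, 1), (2, 3)]:
    true_adj[i, j] = true_adj[j, i] = True

# The test finds both true edges and wrongly rejects the null pair (0,2).
rejected = true_adj.copy()
rejected[0, 2] = rejected[2, 0] = True

print(empirical_type1_error(true_adj, rejected))  # 0.25
```

A value at or below the nominal significance level, as in the table above, is what well-controlled type I error looks like under this definition.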