xc308/Highly_Multivariate_Large_Scale

Highly Multivariate Large-scale Spatial Stochastic Processes -- A Cross-Markov Random Field Approach

This repo contains the code for the manuscript "Highly Multivariate Large-scale Spatial Stochastic Processes -- A Cross-Markov Random Field Approach" by Xiaoqing Chen, Peter Diggle, James V. Zidek, and Gavin Shaddick.

We propose a cross-MRF model class, consisting of a mixed spatial graphical model framework and a cross-MRF theory to address various challenges of highly multivariate large-scale spatial data collectively within one unified framework.

The core contribution of the cross-MRF theory is that it realises doubly conditional independence (CI) among both p components and n spatial locations, see here.

We achieved:

  • utmost sparsity in the joint precision matrix
  • lowest generation order of the joint precision matrix
  • asymmetric cross-correlation in the joint covariance matrix
  • scientific interpretability
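The link between conditional independence and sparsity in the precision matrix can be illustrated with a toy example. The sketch below is not the repo's cross-MRF construction: it assumes a single field on a 1D chain of 5 sites with a tridiagonal precision matrix (the helper names `chain_precision`, `invert`, and `zero_percentage` are hypothetical). It shows that an exact zero in the precision matrix, which encodes conditional independence of two sites given the rest, coexists with a fully dense covariance matrix.

```python
from fractions import Fraction


def chain_precision(n):
    """Tridiagonal precision matrix (2 on the diagonal, -1 next to it) for n sites on a line."""
    return [[Fraction(2) if i == j else Fraction(-1) if abs(i - j) == 1 else Fraction(0)
             for j in range(n)] for i in range(n)]


def invert(mat):
    """Gauss-Jordan inverse in exact rational arithmetic (toy sizes only)."""
    n = len(mat)
    # Augment with the identity, then reduce the left block to the identity.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zero_percentage(mat):
    """Percentage of entries that are exactly zero."""
    entries = [v for row in mat for v in row]
    return 100.0 * sum(v == 0 for v in entries) / len(entries)


Q = chain_precision(5)       # sparse precision matrix
Sigma = invert(Q)            # corresponding covariance matrix

print(zero_percentage(Q))      # 48.0
print(zero_percentage(Sigma))  # 0.0
print(Q[0][4], Sigma[0][4])    # 0 1/6
```

Sites 0 and 4 have a zero precision entry but a non-zero covariance of 1/6: they are marginally correlated yet conditionally independent given the sites between them. Exact rational arithmetic is used here only so that "exactly zero" is unambiguous in a small example.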

[Comparative study results] [Models comparison] [Asymmetry and sparsity]

Script Contents

  • 000: auto-correlation matrix and cross-correlation plots

  • Figure folder: Sigma, Sigma_inv plots, p = 10, CI among p only; CAMS denoising plot

  • 032c: Tst9c, 1D SG and SG_inv construction, Matern, CI among p only (non-cross-MRF), SpN + Reg, thres = 1e-3, reg_num = 1e-9 (Test construction joint Sigma and Sigma_inv; p = 6, n = 40; p = 10, n = 400, 600, 800; exact zero percentage CI among p only)

  • 032c_NEW: modified the while loop to remove the positive-definiteness (pd) test at every construction step. (System time for constructing joint Sigma and Sigma_inv; p = 10, n = 600)

  • 032d: 1D simulation plot functions for non-cross-MRF, CI among p only; (Plot joint Sigma and Sigma_inv; p = 6, n = 40)

  • 032e: 1D SG, SG_inv construction, CI among p only (p = 10, n = 40, 400, 600, 800; elapsed system wall time, CI among p only)

  • 032f: SG, SG_inv plots (p = 10)

  • 034b: SG_inv construction, sparse percentage comparison among cross-MRF and non-cross-MRF for Tri-Wave and Wendland; (Percentage of exact-zero entries; elapsed system wall time cross-MRF)

  • 034b_NEW: modified the while loop to remove the positive-definiteness (pd) test at every construction step. (Test construction of joint Sigma and Sigma_inv; p = 10, n = 600)

  • 034c: Tst10c, 1D SG_inv construction, cross-MRF, with SpNorm + Reg for b function, b can be chosen; (Joint Sigma_inv plots, p = 6)

  • 034e: CI among n only, Mardia 1988; p = 10, n = 400, 600, 800, 1000 (Exact-zero percentage; the fully-connected graph reaches the memory limit; System time for constructing joint Sigma and Sigma_inv; p = 10, n = 600)

  • 037: 100 randomly evaluated Sigma_inv generation microbenchmark; (Sigma and Sigma_inv generation time)

  • 046b: generate 1D true processes and noisy data, Tri-Wave and Wendland

  • 046c: generate 2D true processes and noisy data, Tri-Wave and Wendland - consistent relationship between sparsity in uni-SG_inv and joint SG_inv

  • 047b: optimization, Tri-Wave, Tst10c (cross-MRF)

  • 047c: optimization, Wendland, Tst10c (cross-MRF)

  • 048b: co-krig, Tri-Wave, 1-fold CV results

  • 048d: co-krig, Wendland, 1-fold CV results

  • 049: neg_logL function of non-cross-MRF, TST9d

  • 049b: optimization using 049, Tri-Wave, Wendland

  • 055: 2D inference (neg_logL_2D, optim) for 6 fields in Fig12, Tri-Wave (converged), Wendland (converged)

  • 056: 2D cokrig (pure denoising)

  • 057: Data processing, generate df_Res_log_16_sorted, sorted by Lon (asc), then by Lat (desc); 4 Lon strips

  • 059: TST12 GPU version

  • 060: GPU parallel + optim on 1 CPU

  • 061: GPU parallel + optim on 4 CPUs

  • 062: pure optim parallel on 51 CPUs, no GPU parallelisation

  • 063: CAMS data processing

  • 064a: CAMS data with 060, GPU parallel + optim on 1 CPU, Lon_Strip_1

  • 064b: CAMS data with 060, GPU parallel + optim on 1 CPU, Lon_Strip_4

  • 065a: CAMS one complete construction time for SG and SG_inv, with GPU off-loading, df_Lon_Strip_1_Sort_new.rds; (real-world data illustration)

  • 065b: CAMS data denoising

  • 065c: Plot of CAMS 5

  • 066a: CAMS one complete construction time for SG and SG_inv, solo CPU, df_Lon_Strip_1_Sort_new.rds; (real-world data illustration)
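Several scripts above report the percentage of exact-zero entries for Wendland-based constructions. As a rough illustration of where exact zeros can come from, the sketch below compares a compactly supported Wendland function with an exponential function on n = 40 equally spaced 1D sites. It is not the repo's SG_inv construction: the specific Wendland form, the range `scale = 5.0`, and the helper names are assumptions, and the `thres` argument only loosely mirrors the `thres = 1e-3` mentioned for script 032c.

```python
import math


def wendland(r):
    """Wendland C2 function (1 - r)^4_+ (4r + 1); exactly zero for r >= 1."""
    return (1.0 - r) ** 4 * (4.0 * r + 1.0) if r < 1.0 else 0.0


def exponential(r):
    """Exponential function exp(-r); strictly positive for every distance."""
    return math.exp(-r)


def kernel_matrix(sites, kernel, scale):
    """Matrix of kernel values at scaled pairwise distances between 1D sites."""
    return [[kernel(abs(s - t) / scale) for t in sites] for s in sites]


def zero_percentage(mat, thres=0.0):
    """Percentage of entries with absolute value <= thres (thres = 0 counts exact zeros)."""
    entries = [v for row in mat for v in row]
    return 100.0 * sum(abs(v) <= thres for v in entries) / len(entries)


sites = list(range(40))                       # n = 40 sites at unit spacing
W = kernel_matrix(sites, wendland, scale=5.0)
E = kernel_matrix(sites, exponential, scale=5.0)

print(zero_percentage(W))              # 78.75
print(zero_percentage(E))              # 0.0
print(zero_percentage(E, thres=1e-3))  # 1.875
```

With compact support the exact zeros are structural: every pair of sites at least `scale` apart contributes a zero without any thresholding. The exponential matrix has no exact zeros, and a threshold of 1e-3 only picks up the most distant pairs.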

Acknowledgements

  • Iain Steison recommended using optimParallel() for parallel L-BFGS-B optimization on the CPU.
  • David Llewellyn-Jones helped set up the HPC resource and answered lots of elementary questions regarding Baskerville HPC.
  • Ryan Chan reminded XC that traditional R code will not automatically utilize GPU resources even when run on HPC.
