Python bindings for the SparseHC distance matrix clustering algorithm

mdimura/sparsehc-dm

sparsehc-dm is a Python wrapper for the SparseHC distance matrix clustering algorithm, integrated with STXXL for on-disk sorting. SparseHC is a memory-efficient hierarchical agglomerative clustering implementation. Its memory complexity is close to linear, which enables clustering of ~900000 structures/points with 32 GB of RAM.
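To put the memory figure in perspective, here is a back-of-the-envelope sketch of how much space the pairwise distances would take if they were all held in one dense array. The 8-byte (float64) element size is an assumption made for illustration; this README does not state how sparsehc-dm stores distances internally.

```python
def condensed_pairs(n: int) -> int:
    """Number of unordered pairs (i < j) among n points."""
    return n * (n - 1) // 2


def dense_bytes(n: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold every pairwise distance at once.

    bytes_per_value=8 assumes float64 storage (an illustrative assumption).
    """
    return condensed_pairs(n) * bytes_per_value


n = 900_000
pairs = condensed_pairs(n)
size_tb = dense_bytes(n) / 1e12  # decimal terabytes

print(f"{pairs:,} pairs, about {size_tb:.2f} TB as float64")
```

Under that assumption the distances alone are roughly a hundred times larger than 32 GB of RAM, which is why they are streamed to an on-disk sorter rather than kept in memory.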

Usage example:

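import mdtraj as md
from sparsehc_dm import sparsehc_dm

traj_filename='traj.nc'
top_filename='top.pdb'

# Load the trajectory together with its topology
traj=md.load(traj_filename,top=top_filename)

# Fill the input matrix with one entry per unordered pair of frames (i < j)
m=sparsehc_dm.InMatrix()
N=traj.n_frames
for i in range(0,N-1):
  # RMSD of every frame to frame i
  rmsds=md.rmsd(traj, traj, i)
  for j in range(i+1,N):
    m.push(i,j,float(rmsds[j]))

# Cluster using the "complete" linkage option
Z=sparsehc_dm.linkage(m,"complete")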
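The loop above pushes each unordered pair of frames exactly once, with i < j. The same traversal can be tried out without mdtraj or sparsehc-dm installed. In the sketch below, `ListSink` and `fill_upper_triangle` are hypothetical helpers, not part of sparsehc_dm; `ListSink` only mimics the `push(i, j, distance)` call shown above, and the absolute difference of 1-D points stands in for the RMSD.

```python
from itertools import combinations


class ListSink:
    """Hypothetical stand-in that records pushes; not part of sparsehc_dm."""

    def __init__(self):
        self.entries = []

    def push(self, i, j, d):
        self.entries.append((i, j, float(d)))


def fill_upper_triangle(sink, points):
    """Push every pair (i, j) with i < j once, in row-major order."""
    for (i, a), (j, b) in combinations(enumerate(points), 2):
        # Toy distance between 1-D points; the example above uses RMSD instead
        sink.push(i, j, abs(a - b))
    return sink


sink = fill_upper_triangle(ListSink(), [0.0, 1.0, 3.0, 6.0])
print(len(sink.entries))  # 4 points give 4 * 3 / 2 = 6 pairs
```

Any object with a compatible `push` method can be passed as `sink`, which makes it easy to check the pair count and ordering on a small input before running the full computation.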

Installation

Prerequisites: the Boost Graph and STXXL libraries

sudo apt-get install libboost-graph-dev libstxxl-dev libstxxl1

Building:

git clone https://github.com/Burning-Daylight/sparsehc-dm.git sparsehc-dm
cd sparsehc-dm
mkdir build
cd build
cmake ..
make
sudo make install
