jcheminform/CrystalCaps

 
 


CGN-e3: Capsule Graph Networks for Accurate and Interpretable Crystalline Materials Property Predictions

This repository presents the implementation of Equivariant Capsule Graph Networks (CGN-e3), a novel architecture that integrates capsule networks with graph neural networks for representing crystalline materials. The CGN-e3 model processes atomic graphs by encoding neighbor distances via radial basis functions and angles via spherical harmonics, and by aggregating messages via Clebsch–Gordan tensor products, which makes it equivariant under 3D rotations, reflections, and translations.
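The edge encoding described above can be illustrated with a small, framework-free sketch. It is not the repository's implementation: the function names `gaussian_rbf` and `edge_features`, the number of basis functions, the cutoff, and the width `gamma` are all assumptions chosen for illustration.

```python
import math

def gaussian_rbf(distance, num_basis=8, cutoff=5.0, gamma=None):
    """Expand a scalar distance into Gaussian radial basis functions.

    Centers are spaced evenly on [0, cutoff]. num_basis, cutoff and gamma
    are illustrative values, not settings taken from this repository.
    """
    spacing = cutoff / (num_basis - 1)
    centers = [k * spacing for k in range(num_basis)]
    if gamma is None:
        # Width tied to the center spacing (an assumption).
        gamma = 1.0 / (2.0 * spacing ** 2)
    return [math.exp(-gamma * (distance - c) ** 2) for c in centers]

def edge_features(pos_i, pos_j):
    """Split the relative vector pos_j - pos_i into its length and unit direction."""
    rel = [b - a for a, b in zip(pos_i, pos_j)]
    length = math.sqrt(sum(x * x for x in rel))
    if length == 0.0:
        raise ValueError("atoms coincide; direction is undefined")
    direction = [x / length for x in rel]
    return length, direction

# Example: a neighbor 3 angstrom away from the central atom.
length, direction = edge_features([0.0, 0.0, 0.0], [1.0, 2.0, 2.0])
rbf = gaussian_rbf(length)
```

Because only the difference of positions enters, the features are unchanged by translating the whole crystal. The unit direction is the quantity that would be passed to a spherical-harmonics routine; up to normalization and ordering conventions, the order-1 real spherical harmonics are just its three components.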


The model processes a crystal’s atomic graph through a sequence of equivariant graph convolutions, capsule routing layers, and an attention mechanism to predict material properties. First, atomic numbers are embedded via one-hot encoding and interatomic distances are encoded by a radial basis function (RBF) expansion. In each equivariant convolutional layer, the relative vector between each central atom and each of its neighbors is computed and decomposed into its length and direction. The length is expanded into a vector of Gaussian RBFs and the direction is expanded in spherical harmonics. A learnable multilayer perceptron (MLP) is applied to the RBF vector to produce radial filter coefficients, which are multiplied with the spherical harmonics and combined with the neighbor’s feature tensor through a tensor product. The Clebsch–Gordan tensor product couples the spherical-harmonic order with the order of the neighbor features to produce the output order. The summed messages are then assembled back into irreducible feature vectors for each atom. Each layer output is passed through suitable nonlinearities (SiLU on scalars and a gated sigmoid on vectors), followed by an equivariant normalization layer for rescaling.

After the convolutions, each atom carries a tuple of scalar and vector features encoding local geometry and chemistry. We treat these as primary capsules at the node level. These primary node capsules are then aggregated into a smaller set of graph capsules via an attention-and-routing mechanism: an attention mechanism weights nodes to balance different graph sizes, and dynamic routing then iteratively refines the coupling coefficients between each node’s capsule and each graph-level capsule. Finally, to produce the property prediction, the graph capsule outputs are further routed with attention into final output capsules corresponding to the target properties.
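The routing step can be made concrete with a minimal sketch of routing-by-agreement in plain Python. This follows the generic dynamic routing scheme for capsule networks, not necessarily the exact variant in `CapsuleNetworkLayers.py`. The names `squash`, `softmax`, and `dynamic_routing`, and the way per-node attention weights scale each contribution, are assumptions made for illustration.

```python
import math

def squash(vec, eps=1e-9):
    """Capsule nonlinearity: keep the direction, map the norm into [0, 1)."""
    sq_norm = sum(x * x for x in vec)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in vec]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(votes, node_weights=None, num_iters=3):
    """Route node capsules to graph capsules.

    votes[i][j] is the vector that node capsule i predicts for graph capsule j.
    node_weights are optional per-node attention weights (assumed to sum to 1);
    uniform weights are used when none are given. Requires num_iters >= 1.
    Returns (graph_capsules, coupling), where coupling[i][j] is the coupling
    coefficient that produced the returned graph capsules.
    """
    n_nodes, n_caps, dim = len(votes), len(votes[0]), len(votes[0][0])
    if node_weights is None:
        node_weights = [1.0 / n_nodes] * n_nodes
    logits = [[0.0] * n_caps for _ in range(n_nodes)]
    for _ in range(num_iters):
        # Each node distributes its vote across the graph capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_caps):
            s = [0.0] * dim
            for i in range(n_nodes):
                w = node_weights[i] * coupling[i][j]
                for k in range(dim):
                    s[k] += w * votes[i][j][k]
            outputs.append(squash(s))
        # Raise the logit where a node's vote agrees with the capsule output.
        for i in range(n_nodes):
            for j in range(n_caps):
                logits[i][j] += sum(a * b for a, b in zip(votes[i][j], outputs[j]))
    return outputs, coupling

# Two nodes agree on graph capsule 0 and cancel each other on capsule 1.
votes = [[[1.0, 0.0], [0.0, 0.1]],
         [[1.0, 0.0], [0.0, -0.1]]]
graph_capsules, coupling = dynamic_routing(votes)
```

In this toy input the coupling coefficients shift toward capsule 0, where the votes agree. The node weights sum to one regardless of how many atoms the graph contains, which is one simple way to keep large and small graphs on a comparable scale.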


Saliency maps for interpretation

image

Quick Start

# Clone the repository
git clone https://github.com/Eddah-Sure/CrystalCaps.git
cd CrystalCaps

# Install dependencies
pip install -r requirements.txt

# Run training (from repo root)
python -m src.crystalcaps.Train

Requirements

  • Python 3.8+
  • PyTorch 1.12+
  • PyTorch Geometric
  • e3nn
  • NumPy
  • Pandas
  • Scikit-learn

Installation

# Install PyTorch with CUDA support (recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install all other dependencies
pip install -r requirements.txt

For CPU-only installation:

pip install -r requirements.txt

Data Preparation

Dataset Structure

The model requires three main files for each dataset:

  1. targets.csv: Contains target values (e.g., formation energy, band gap)

    • Must include the columns mpid (material ID) and the target property from the Materials Project database
  2. graph_data.npz: Contains crystal graph attributes

    • Generated using the provided graph coordinator (see Graph Coordinator)
  3. config.json: Defines node feature vectors

    • Contains atomic numbers and their corresponding feature vectors; ideally, generate this file with the graph coordinator.

Data Sources

  • Material IDs are provided in the data/ directory
  • Use the graph coordinator in Materials Project/Graph coordinator.py to generate graph files
  • API key required: obtain a Materials Project API key from the Materials Project website

Directory Layout

dataset/
├── targets.csv          # Target properties
├── graph_data.npz       # Graph representations
└── config.json          # Node feature definitions
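Before training, it can help to confirm that a dataset directory matches this layout. The helper below is a hypothetical sketch that is not part of the repository: `check_dataset` and `REQUIRED_FILES` are illustrative names. It only checks that the three files exist and that targets.csv has the mpid column and the chosen target column; it does not inspect the contents of graph_data.npz or config.json.

```python
import csv
from pathlib import Path

# File names taken from the layout above.
REQUIRED_FILES = ("targets.csv", "graph_data.npz", "config.json")

def check_dataset(dataset_dir, target_property, id_column="mpid"):
    """Return a list of problems found in a dataset directory (empty if none)."""
    root = Path(dataset_dir)
    problems = [f"missing file: {name}" for name in REQUIRED_FILES
                if not (root / name).is_file()]
    targets = root / "targets.csv"
    if targets.is_file():
        # Only the header row is needed to verify the column names.
        with targets.open(newline="") as handle:
            header = next(csv.reader(handle), [])
        for column in (id_column, target_property):
            if column not in header:
                problems.append(f"targets.csv lacks column: {column}")
    return problems
```

For example, `check_dataset("dataset", "formation_energy_per_atom")` returns an empty list when all three files are present and targets.csv carries both columns.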

Example

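# Entry point of the training script; train_and_evaluate_crystalcaps is
# assumed to be defined or imported in the same module.
if __name__ == '__main__':
    # Paths to the three dataset files described above
    data_dir = "data/"
    targets_csv = "data/targets.csv"
    config_json = "data/config.json"
    graph_npz = "data/graph_data.npz"
    # Column of targets.csv to predict
    target_property = "formation_energy_per_atom"

    results_crystalcaps = train_and_evaluate_crystalcaps(
        data_dir, targets_csv, config_json, graph_npz, target_property)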

Hyperparameters

| Parameter | Default | Description |
|---|---|---|
| batch_size | 64 | Training batch size |
| hidden_channels | 128 | Hidden layer dimensions |
| num_conv_layers | 2 | Number of graph convolution layers |
| primary_caps | 8 | Number of primary capsules |
| primary_dim | 16 | Primary capsule dimensions |
| secondary_caps | 6 | Number of secondary capsules |
| secondary_dim | 16 | Secondary capsule dimensions |
| dropout_rate | 0.1 | Dropout probability |
| early_stopping_patience | 20 | Early stopping patience |
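One way to keep these defaults in a single place is a small configuration object. The sketch below is hypothetical: `CrystalCapsConfig` is not a class from this repository, and how Train.py actually reads its hyperparameters is not shown here. Only the field names and default values come from the table above.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CrystalCapsConfig:
    """Hypothetical container mirroring the documented hyperparameter defaults."""
    batch_size: int = 64
    hidden_channels: int = 128
    num_conv_layers: int = 2
    primary_caps: int = 8
    primary_dim: int = 16
    secondary_caps: int = 6
    secondary_dim: int = 16
    dropout_rate: float = 0.1
    early_stopping_patience: int = 20

    def __post_init__(self):
        # Basic sanity checks; the accepted ranges are assumptions.
        if not 0.0 <= self.dropout_rate < 1.0:
            raise ValueError("dropout_rate must lie in [0, 1)")
        for name in ("batch_size", "hidden_channels", "num_conv_layers",
                     "primary_caps", "primary_dim", "secondary_caps",
                     "secondary_dim", "early_stopping_patience"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")

# Start from the defaults and override a single value for an experiment.
default_config = CrystalCapsConfig()
deeper_config = replace(default_config, num_conv_layers=3)
```

Making the dataclass frozen means a run's settings cannot be changed by accident after they are created; `dataclasses.replace` returns a modified copy and re-runs the checks.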

Project Structure

CrystalCaps/
├── src/
│   ├── crystalcaps/
│   │   ├── CapsuleNetworkLayers.py      # Capsule network components
│   │   ├── GNNBase.py                  # Equivariant GNN layers
│   │   ├── Model.py                    # CGNe3 model definition
│   │   ├── Train.py                    # Training pipeline
│   │   └── data/                       # Data utilities
│   │       ├── dataset.py
│   │       └── graph.py
│   └── graph_coordinator.py            # Graph construction utility
├── data/                               # Dataset files
│   ├── e_form.csv
│   └── Metalclasses.csv
├── figures/                            # Documentation figures
├── examples/                           # Example notebooks/scripts
├── docs/                               # Full documentation
├── requirements.txt                    # Dependencies
├── README.md                           # This file                    
└── LICENSE                            # License

Authorship

This work was primarily written by Eddah K. Sure, advised by Prof. Wu Xing and Prof. Qian Quan.

Citation

If you use this code in research, please cite us as:

@article{sure2025,
  title={Capsule Graph Networks for Accurate and Interpretable Crystalline Materials Property Prediction},
  author={Sure, Eddah K. and Xing, Wu and Quan, Qian},
  year={2025}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
