Training MatterGen on Custom Crystal Structures for Bandgap

I want to train MatterGen on my own dataset so that it generates new crystal structures conditioned on a target bandgap. To prepare the data, I created a CSV file in which each row contains the name of the compound (or its folder), the full text of the CIF file (not just a path), and the corresponding DFT-computed bandgap. My goal is for the model to learn structure–property relationships and generate novel materials with desired electronic properties.

My first question: is this data format sufficient to train MatterGen effectively, or are additional preprocessing steps or formatting requirements needed so that the model can learn and generate materials conditioned on the bandgap?

My second question: once my custom dataset is ready, how do I write the command to train the model? The command I have so far is `mattergen-train data_module=mp_20 ~trainer.logger`
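Before converting or training, it may help to sanity-check the CSV described above. The following is a minimal sketch using only the Python standard library. It is not part of MatterGen, and the column names `name`, `cif`, and `band_gap` are hypothetical placeholders for whatever the file actually uses. It only checks that each CIF string contains the standard cell and fractional-coordinate tags and that the bandgap parses as a non-negative number. It does not verify that the structures are physically valid or that the layout matches what MatterGen's data pipeline expects.

```python
import csv
import io

# Standard CIF data names that a complete crystal structure normally carries.
REQUIRED_CIF_TAGS = (
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
    "_atom_site_fract_x",
)


def check_rows(csv_text, name_col="name", cif_col="cif", gap_col="band_gap"):
    """Return (row_number, name, problem) tuples for rows that look broken.

    The default column names are assumptions for illustration only.
    Row numbers count data rows, starting at 1 (the header is not counted).
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        name = row.get(name_col) or ""
        cif = row.get(cif_col) or ""

        # Plain substring check: cheap, but it does not parse the CIF.
        missing = [tag for tag in REQUIRED_CIF_TAGS if tag not in cif]
        if missing:
            problems.append((row_number, name, "CIF missing " + ", ".join(missing)))

        try:
            gap = float(row.get(gap_col) or "")
        except ValueError:
            problems.append((row_number, name, "bandgap is not a number"))
            continue
        # gap != gap is True only for NaN.
        if gap != gap or gap < 0:
            problems.append((row_number, name, "bandgap is NaN or negative"))
    return problems
```

Calling `check_rows(open("my_data.csv", newline="").read())` (with `my_data.csv` standing in for the real file) returns an empty list when every row passes these basic checks. A full parse of each CIF with a crystallography library would be a stronger test than the substring check used here.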

<img width="480" alt="Image" src="https://github.com/user-attachments/assets/2297bf9f-1f5e-48a0-ac46-2e762c4503d9" />


Training MatterGen on Custom Crystal Structures for Bandgap #128
