WMD-group/Solar_oxides_data

DOI

Data repository for the publication: Data-driven discovery of photoactive quaternary oxides using first-principles machine learning

Background

The high-throughput workflow combines machine learning, data-driven models and first-principles calculations. The overall aim is to filter a search space of 1 million quaternary oxide compositions and identify those that fall within a stated stability window, have a bandgap in the range 1.0 to 2.5 eV, and are composed of earth-abundant elements.

Contents

Notebooks

Steps 1 and 2: Machine learning

  • Train a Gradient Boosting Regressor (GBR) model to predict bandgap from composition
  • Filter newly generated compositions using the GBR model
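
As a rough illustration of Steps 1 and 2, the sketch below trains a scikit-learn `GradientBoostingRegressor` and then keeps only candidates whose predicted bandgap falls in the 1.0 to 2.5 eV window. It is not the code from the notebooks: the feature matrix, the target values and the hyperparameters are all synthetic stand-ins chosen for the example, whereas the real workflow derives its features from chemical compositions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 500 "compositions", 8 numeric features each.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 8))
# Made-up "bandgap" in eV that depends on two features plus a little noise.
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0.0, 0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are illustrative, not those used in the paper.
model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
)
model.fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))

# Step 2: score "newly generated" candidates and keep those inside the window.
X_new = rng.uniform(0.0, 1.0, size=(1000, 8))
predicted_gap = model.predict(X_new)
in_window = (predicted_gap >= 1.0) & (predicted_gap <= 2.5)
candidates = X_new[in_window]
print(f"{in_window.sum()} of {len(X_new)} candidates pass the bandgap filter")
```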

Step 3: Data-driven filters

  • Rank compositions by sustainability
  • Assign structures
  • Apply oxidation state probability filter
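
The ranking step in the list above can be pictured with a small, library-free sketch. Everything here is a placeholder: the element labels `A` to `D`, the per-element scores and the scoring rule (an atomic-fraction-weighted mean) are assumptions made for illustration, and the notebooks may use a different sustainability metric.

```python
# Placeholder per-element scores (lower = more sustainable). Illustrative only.
ELEMENT_SCORE = {"A": 1.0, "B": 4.0, "C": 2.5, "D": 9.0, "O": 0.5}


def composition_score(composition):
    """Return the atomic-fraction-weighted mean of the per-element scores."""
    total_atoms = sum(composition.values())
    return sum(
        ELEMENT_SCORE[element] * count / total_atoms
        for element, count in composition.items()
    )


# Hypothetical quaternary oxide compositions, given as element -> atom count.
candidates = [
    {"A": 1, "B": 1, "C": 1, "O": 4},
    {"A": 2, "C": 1, "D": 1, "O": 6},
    {"B": 1, "C": 2, "D": 1, "O": 5},
]

# Most sustainable (lowest score) first.
ranked = sorted(candidates, key=composition_score)
for composition in ranked:
    print(composition, round(composition_score(composition), 3))
```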

Steps 4 and 5: Thermodynamic stability and electronic properties

  • Thermodynamic stability calculations with high-throughput Density Functional Theory (DFT)
  • Bandgap calculation with hybrid DFT

Data

DOI

The required data can be downloaded separately from the above Zenodo DOI link. Untar the archive directly into this directory, which creates a sub-directory named data. The first notebook also requires a dataset from the Computational Materials Repository (CMR).
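
For readers who prefer to unpack the archive from Python rather than the command line, a minimal sketch using the standard-library `tarfile` module follows. The helper name `extract_archive` is made up for this example, and the archive filename must be whatever the Zenodo download is actually called.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, target_dir="."):
    """Extract a tar archive into target_dir and return the path of data/."""
    target = Path(target_dir)
    with tarfile.open(archive_path) as archive:
        # Newer Python versions provide extraction filters that reject
        # unsafe members (absolute paths, links escaping the target, ...).
        if hasattr(tarfile, "data_filter"):
            archive.extractall(target, filter="data")
        else:
            archive.extractall(target)
    data_dir = target / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"expected {data_dir} after extraction")
    return data_dir
```

Calling `extract_archive(path_to_downloaded_archive)` from the repository root should leave the files under `data/`, which is where the notebooks expect them.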

Dependencies

The notebooks use several Python packages, which can be installed with pip:

pip install pymongo pymatgen matminer scikit-learn smact pandas atomate fireworks
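
Before opening the notebooks it can be handy to confirm that everything is importable. The snippet below is a small convenience sketch, not part of the repository. It relies only on the standard library, and the one mapping worth noting is that the `scikit-learn` distribution is imported as `sklearn`.

```python
from importlib.util import find_spec

# pip distribution name -> importable module name
REQUIRED = {
    "pymongo": "pymongo",
    "pymatgen": "pymatgen",
    "matminer": "matminer",
    "scikit-learn": "sklearn",
    "smact": "smact",
    "pandas": "pandas",
    "atomate": "atomate",
    "fireworks": "fireworks",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages: pip install", " ".join(missing))
else:
    print("All required packages are importable.")
```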

Caveats

  • Some notebooks connect to the Materials Project through its API. Freshly downloaded data may therefore not exactly match the data used for the work in the original paper.
  • The GBR model is built from scratch. Because randomness is deliberately introduced in the training process, the predicted bandgap for a given composition will vary slightly each time a new model is built.
  • Many different libraries are used and I am not an expert in all of them: some of the code is probably far from elegant!
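
The run-to-run variation mentioned in the second caveat can be reproduced, and pinned down, with a fixed seed. The sketch below assumes the randomness comes from row subsampling (`subsample < 1.0`) in scikit-learn's `GradientBoostingRegressor`. The notebooks may introduce randomness differently, and the data here is synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (assumption), not the bandgap dataset.
rng = np.random.default_rng(42)
X = rng.uniform(size=(300, 5))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.2, size=300)
X_query = rng.uniform(size=(10, 5))


def fit_and_predict(seed):
    """Train a fresh model with the given seed and predict on X_query."""
    # subsample < 1.0 enables stochastic gradient boosting, so the seed matters.
    model = GradientBoostingRegressor(subsample=0.8, random_state=seed)
    model.fit(X, y)
    return model.predict(X_query)


same_a = fit_and_predict(seed=0)
same_b = fit_and_predict(seed=0)
other = fit_and_predict(seed=1)

print("same seed gives identical predictions:", np.allclose(same_a, same_b))
print("max difference between seeds:", np.abs(same_a - other).max())
```

Passing a fixed `random_state` makes a rebuilt model reproducible on the same data and library versions. Leaving it unset reproduces the slight variation described above.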
