mluerig/nymphalid-phenomics

nymphalid-phenomics


About

This repository contains the analytical pipeline and code to reproduce the results of the paper "Aposematic color patterns are the dominant axis of phenotypic diversification in Nymphalid butterflies" (Lürig et al. 2025). The pipeline combines image acquisition (mostly from online sources), segmentation, feature extraction, and phylogenetically informed statistical analysis to quantify aposematic color patterns in over one third of all species of Nymphalidae.

Lürig, M.D. et al. (2025) - bioRxiv preprint (doi: 10.1101/2025.09.30.678834)

Quickstart

These steps reproduce the figures and results only; for the image analysis pipeline, see below.

  1. Download the repo:
  2. Run the analysis scripts:

Image analysis

Overview

For the paper, we executed the full pipeline, which included: 1) image acquisition from GBIF, iDigBio, and our own imaging (also see data_raw/README.md); 2) segmentation (see script 01); 3) image encoding (see script 02); 4) data cleaning; 5) wing surface classification; 6) literature review; and 7) statistical analysis (see scripts 03 and 04). The code for data cleaning and classifier training is not provided here, because these steps were iterative and partly manual (e.g., using interactive plots). A detailed description of the complete procedure can be found in the methods section of the manuscript.


Installation

  1. Create the Python environment:

    mamba env create -f environment.yml -n nymphalidae1
    conda activate nymphalidae1
  2. Install packages (in this order):

    UNICOM (image encoder):

    pip install --no-cache-dir "unicom @ git+https://github.com/deepglint/unicom.git@4d84a3b496a47bcad68467d71c5ca787b0366042"

    PyTorch (choose the wheel matching your CUDA version; you may need the --force-reinstall flag):

    pip install --index-url https://download.pytorch.org/whl/cu126 torch torchvision
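
After installation, it can help to confirm that the packages are visible in the active environment and that PyTorch can see a GPU. The following is a minimal sketch, not part of the repository: the import names `torch`, `torchvision`, and `unicom` are assumed from the pip commands above, and the helper functions `find_missing` and `cuda_status` are hypothetical.

```python
import importlib.util

# Import names assumed from the packages installed above (an assumption,
# not taken from the repository).
REQUIRED = ("torch", "torchvision", "unicom")


def find_missing(modules=REQUIRED):
    """Return the top-level module names that cannot be located."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def cuda_status():
    """Return (torch version, CUDA available), or None if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check also runs without torch

    return torch.__version__, torch.cuda.is_available()


if __name__ == "__main__":
    print("missing packages:", find_missing() or "none")
    print("torch version / CUDA available:", cuda_status())
```

If CUDA is reported as unavailable on a machine with a GPU, the installed wheel probably does not match the local CUDA version, which is the case the --force-reinstall note above refers to.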

Download data

The following data are available at https://doi.org/10.5281/zenodo.17214905: a sample of raw images to demonstrate the segmentation step, all segmentation masks for the feature extraction step, and all primary tabular data and metadata.

Structure:

  • data_raw/
    • images_sample/ - Sample of raw images (from GBIF)
    • segmentation_masks_clean/ - Segmentation masks (produced by pipeline)
    • segmentation_masks_moths/ - Segmentation masks (moth dataset)
    • tables/ - Embeddings and features (produced by pipeline)
  • data/
    • data_primary/ - Primary tabular and meta-data (labels, feature-key, etc.)
    • data_secondary/ - Derived from primary with make_data script (LD-scores, similarity, etc.)
    • analyses_secondary/ - Regressions, phylogenetic modelling, etc.
  • scripts/ - Scripts to reproduce all results
  • figures/ - Figures from the manuscript
  • tables/ - Tables from the manuscript

Analysis

Download the archive, unpack it, and run the scripts in order:

In Python (script 01 for segmentation and script 02 for image encoding; see Overview):

In R:

  • 03_make_data.R - assembles specimen-level and species-level tables for analysis.
  • 04_analysis.R - runs all statistical models and generates figures and tables for the paper.
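
The two R scripts can also be run in sequence from a small Python driver. This is a sketch only: it assumes that the scripts live in scripts/ (as in the structure above), that they expect the repository root as working directory, and that `Rscript` is on the PATH. The helpers `build_commands` and `run_all` are hypothetical and not part of the repository.

```python
import shutil
import subprocess
from pathlib import Path

# Script names as listed above; 03 must finish before 04 starts.
R_SCRIPTS = ["03_make_data.R", "04_analysis.R"]


def build_commands(scripts_dir="scripts"):
    """Return one Rscript command per script, relative to the repo root."""
    return [["Rscript", str(Path(scripts_dir) / name)] for name in R_SCRIPTS]


def run_all(repo_root="."):
    """Run the R scripts in order, stopping at the first failure."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    for cmd in build_commands():
        # Assumption: the scripts resolve their paths from the repository root.
        subprocess.run(cmd, cwd=repo_root, check=True)


# Usage (not executed here): run_all("path/to/unpacked/archive")
```

Using `check=True` stops the sequence as soon as one script fails, so 04_analysis.R never runs on incomplete tables.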

Citation

@ARTICLE{Luerig2025-rp,
  title    = "Aposematic color patterns are the dominant axis of phenotypic
              diversification in Nymphalid butterflies",
  author   = "Luerig, Moritz D and Shirai, Leila T and Mota, Luisa and Willmott,
              Keith R and Freitas, Andre V L and Porto, Arthur",
  journal  = "bioRxiv",
  pages    = "2025.09.30.678834",
  month    =  oct,
  year     =  2025,
  doi      = "10.1101/2025.09.30.678834",
  language = "en"
}
