WebLucy - Cheminformatics teaching tool for exploring molecular connection matrices

steinbeck/weblucy


WebLucy

A web-based cheminformatics teaching tool for exploring molecular connection matrices and understanding structure elucidation algorithms.

Try WebLucy Online

Overview

WebLucy is an interactive web app that teaches students how molecular structures can be generated systematically from a molecular formula. It visualizes bond insertion into a connection matrix step by step, demonstrating the depth-first search algorithm used in structure elucidation.

The algorithm is based on the original LUCY software for semi-automatic structure elucidation [1]. The bond generation algorithm is very basic: it performs no canonicalization, so the same structure is generated repeatedly under different atom numberings, which makes it unsuitable for larger molecules. When NMR-derived constraints restrict the search, however, it works reasonably well.

For production-level structure generation from molecular formulas, consider using SURGE (Structure Generator), a highly efficient and comprehensive tool for exhaustive structure enumeration with advanced canonicalization and symmetry handling.

Features

Core Functionality

  • Molecular Formula Input: Enter any molecular formula (e.g., C5H12, C2H6O, C6H6)
  • Connection Matrix Visualization: See the N×N matrix of heavy atoms with color-coded bond orders
  • Editable Hydrogen Distribution: Freely adjust how hydrogens are distributed among heavy atoms
  • Step-by-Step Algorithm: Walk through the bond insertion process with Forward/Back buttons
  • Connectivity Validation: Only connected molecules are accepted as valid structures
  • 2D Structure Depiction: Valid molecules are displayed using the Cheminformatics Microservice [2]
  • Structure Counter: Track how many valid structures have been found (shown in panel heading)

Auto-Step Mode

  • Continuous Exploration: Auto-step continues through all bond configurations until exhausted
  • Adjustable Speed: Choose from Slow (500ms), Medium (250ms), Fast (100ms), or Very Fast (50ms)
  • Stop Anytime: Click the Stop button to pause auto-stepping at any point

Bond Order Options

  • Ascending Order (1→2→3): Try single bonds first, then double, then triple (default)
  • Descending Order (3→2→1): Try triple bonds first, then double, then single
  • Switching between the two sequences demonstrates how the bond order sequence changes the order in which structures are discovered

Silent Generation

  • Batch Generation: Generate all valid structures for the current formula
  • Unique H-Distributions: Automatically enumerates all canonical hydrogen distributions
  • SMILES Export: Downloads all structures as a zipped SMILES file
  • Includes Duplicates: Intentionally preserves duplicates to demonstrate the need for canonicalization
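The enumeration of hydrogen distributions can be illustrated with a minimal Python sketch. This is not WebLucy's actual code. It assumes that "canonical" means hydrogen counts are non-increasing within each group of same-element atoms, so that permutations among equivalent atoms are not listed twice. The function name `h_distributions` and the small `MAX_VALENCE` table (standard valences for C, N and O only) are hypothetical and chosen for illustration.

```python
from itertools import combinations_with_replacement, product

# Assumption: standard valences; only C, N and O are covered in this sketch.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}


def h_distributions(atoms, total_h):
    """Yield hydrogen counts per heavy atom that sum to total_h.

    atoms: heavy-atom symbols grouped by element, e.g. ['C', 'C', 'O'].
    Within each element group the counts are non-increasing, which is the
    assumed meaning of "canonical" here.
    """
    # Collapse the atom list into [symbol, count] groups.
    groups = []
    for symbol in atoms:
        if groups and groups[-1][0] == symbol:
            groups[-1][1] += 1
        else:
            groups.append([symbol, 1])

    # For each group, list all non-increasing tuples of hydrogen counts.
    per_group = []
    for symbol, count in groups:
        options = range(MAX_VALENCE[symbol], -1, -1)  # e.g. 4, 3, 2, 1, 0
        per_group.append(list(combinations_with_replacement(options, count)))

    # Combine the groups and keep only distributions with the right total.
    for combo in product(*per_group):
        flat = [h for group in combo for h in group]
        if sum(flat) == total_h:
            yield flat


for distribution in h_distributions(["C", "C", "O"], 6):
    print(distribution)
```

Not every distribution leads to a connected molecule (a carbon carrying four hydrogens cannot bond to anything else), so distributions like these would still need to pass through the bond insertion search and its connectivity check.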

How It Works

  1. Enter a molecular formula - The app parses the formula and creates a connection matrix for heavy atoms (non-hydrogen atoms)

  2. Adjust hydrogen distribution - Edit how many hydrogens are attached to each atom. The sum must match the formula (validated when you click Forward or Auto Step).

  3. Select options - Choose auto-step speed and bond order sequence (ascending or descending)

  4. Explore bond configurations - Use Forward/Back buttons or Auto Step to traverse the depth-first search:

    • The algorithm traverses the lower triangle of the matrix column by column
    • At each position, it tries bond orders in the selected sequence
    • Valence constraints are checked at each step
    • When the matrix is complete, connectivity is verified
  5. View valid structures - When a valid, connected structure is found, it's displayed as a 2D diagram with a structure count

  6. Silent Generation - Alternatively, click "Silent Generation" to enumerate all valid structures and download them as a zipped SMILES file
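Steps 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch rather than the app's own implementation: the function names `parse_formula`, `heavy_atoms` and `hydrogens_match` are hypothetical, and the parser assumes simple formulas without brackets or charges.

```python
import re


def parse_formula(formula):
    """Parse a simple formula such as 'C2H6O' into a dict of element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def heavy_atoms(counts):
    """Return one symbol per heavy atom; each entry is a matrix row/column."""
    return [symbol for symbol, k in counts.items()
            if symbol != "H" for _ in range(k)]


def hydrogens_match(counts, h_per_atom):
    """Check that the per-atom hydrogen counts sum to the formula's H count."""
    return sum(h_per_atom) == counts.get("H", 0)


counts = parse_formula("C2H6O")
atoms = heavy_atoms(counts)
n = len(atoms)
matrix = [[0] * n for _ in range(n)]   # empty N x N connection matrix

print(atoms)                               # ['C', 'C', 'O']
print(hydrogens_match(counts, [3, 2, 1]))  # True
print(hydrogens_match(counts, [3, 3, 1]))  # False
```

Here `[3, 2, 1]` corresponds to the hydrogen distribution of ethanol (CH3, CH2, OH), while `[3, 3, 1]` is rejected because it places seven hydrogens where the formula has six.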

Running Locally

Simply open index.html in a web browser:

open index.html

Or serve via a local HTTP server:

python3 -m http.server 8000

Then open http://localhost:8000

Dependencies

Algorithm

The bond insertion algorithm performs a depth-first traversal of possible bond configurations.

Pseudo-code

FUNCTION GenerateStructures(matrix, position):
    IF position is beyond last cell THEN
        IF all valences satisfied AND molecule is connected THEN
            OUTPUT valid structure
        END IF
        RETURN
    END IF

    FOR each bond_order IN sequence (1,2,3,0) or (3,2,1,0):
        IF bond_order respects valence constraints THEN
            SET matrix[position] = bond_order
            GenerateStructures(matrix, next_position)    // Recurse
            SET matrix[position] = 0                     // Backtrack
        END IF
    END FOR
END FUNCTION

FUNCTION IsConnected(matrix):
    visited = {atom_1}
    stack = [atom_1]
    WHILE stack not empty:
        atom = stack.pop()
        FOR each neighbor bonded to atom:
            IF neighbor not in visited THEN
                visited.add(neighbor)
                stack.push(neighbor)
            END IF
        END FOR
    END WHILE
    RETURN |visited| == N
END FUNCTION
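The pseudo-code above can be turned into a small runnable Python sketch. It is not the app's actual code, and it makes a few assumptions: atoms are indexed from 0, the input is a list of free valences (the valence of each heavy atom minus its attached hydrogens), and the names `generate_structures`, `is_connected` and `free_valences` are hypothetical.

```python
def is_connected(matrix):
    """Depth-first search from atom 0; True if every atom is reachable."""
    n = len(matrix)
    visited = {0}
    stack = [0]
    while stack:
        atom = stack.pop()
        for neighbor in range(n):
            if matrix[atom][neighbor] > 0 and neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    return len(visited) == n


def generate_structures(free_valences, bond_orders=(1, 2, 3, 0)):
    """Return all connected connection matrices that use up every free valence."""
    n = len(free_valences)
    matrix = [[0] * n for _ in range(n)]
    remaining = list(free_valences)
    # Lower triangle, column by column (0-based indices).
    positions = [(row, col) for col in range(n) for row in range(col + 1, n)]
    results = []

    def recurse(index):
        if index == len(positions):
            if all(r == 0 for r in remaining) and is_connected(matrix):
                results.append([row[:] for row in matrix])
            return
        row, col = positions[index]
        for order in bond_orders:
            # Valence check: both atoms must have enough free valence left.
            if order <= remaining[row] and order <= remaining[col]:
                matrix[row][col] = matrix[col][row] = order
                remaining[row] -= order
                remaining[col] -= order
                recurse(index + 1)
                # Backtrack.
                remaining[row] += order
                remaining[col] += order
                matrix[row][col] = matrix[col][row] = 0

    recurse(0)
    return results


# Propane, C3H8, with hydrogen distribution 3/2/3: free valences 1, 2, 1.
print(generate_structures([1, 2, 1]))
# [[[0, 1, 0], [1, 0, 1], [0, 1, 0]]]
```

Passing `bond_orders=(3, 2, 1, 0)` gives the descending sequence. Four atoms with one free valence each, `generate_structures([1, 1, 1, 1])`, return an empty list: every matrix that satisfies the valences consists of two separate fragments and is rejected by the connectivity check.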

Traversal Order

The algorithm fills the lower triangle of the N×N connection matrix column by column:

Position sequence: (2,1) → (3,1) → ... → (N,1) → (3,2) → (4,2) → ... → (N,N-1)
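As a quick illustration, the position sequence can be produced by two nested loops. The function name `lower_triangle_positions` is hypothetical; indices are 1-based to match the notation above.

```python
def lower_triangle_positions(n):
    """Yield 1-based (row, col) cells of the lower triangle, column by column."""
    for col in range(1, n):
        for row in range(col + 1, n + 1):
            yield (row, col)


print(list(lower_triangle_positions(4)))
# [(2, 1), (3, 1), (4, 1), (3, 2), (4, 2), (4, 3)]
```

For N heavy atoms this gives N(N-1)/2 positions, which is the depth of the search tree.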

License

MIT License

Author

Christoph Steinbeck

References

[1] Steinbeck C. LUCY—A Program for Structure Elucidation from NMR Correlation Experiments. Angew. Chem. Int. Ed. Engl. 1996, 35(17), 1984-1986. DOI: 10.1002/anie.199619841

[2] Rajan K, Chandrasekhar V, Sharma N, Kanakam SRS, Steinbeck C. Cheminformatics Microservice: unifying access to open cheminformatics toolkits. J Cheminform 2023, 15, 107. DOI: 10.1186/s13321-023-00762-4

Acknowledgments
