A web-based cheminformatics teaching tool for exploring molecular connection matrices and understanding structure elucidation algorithms.
WebLucy is an interactive webapp designed to teach students how molecular structures can be systematically generated from a molecular formula. It visualizes the process of bond insertion in a connection matrix, demonstrating the depth-first search algorithm used in structure elucidation.
The algorithm is based on the original LUCY (semiautomatische Strukturaufklärung, "semi-automatic structure elucidation") software [1]. The bond generation algorithm is extremely basic: it performs no canonicalization, so it produces the same structure repeatedly under different atom orderings, which makes it unsuitable for generating larger molecules. In the context of NMR-generated constraints, however, it works reasonably well.
For production-level structure generation from molecular formulas, consider using SURGE (Structure Generator), a highly efficient and comprehensive tool for exhaustive structure enumeration with advanced canonicalization and symmetry handling.
- Molecular Formula Input: Enter any molecular formula (e.g., C5H12, C2H6O, C6H6)
- Connection Matrix Visualization: See the NxN matrix of heavy atoms with color-coded bond orders
- Editable Hydrogen Distribution: Freely adjust how hydrogens are distributed among heavy atoms
- Step-by-Step Algorithm: Walk through the bond insertion process with Forward/Back buttons
- Connectivity Validation: Only connected molecules are accepted as valid structures
- 2D Structure Depiction: Valid molecules are displayed using the Cheminformatics Microservice [2]
- Structure Counter: Track how many valid structures have been found (shown in panel heading)
- Continuous Exploration: Auto-step continues through all bond configurations until exhausted
- Adjustable Speed: Choose from Slow (500ms), Medium (250ms), Fast (100ms), or Very Fast (50ms)
- Stop Anytime: Click the Stop button to pause auto-stepping at any point
- Ascending Order (1→2→3): Try single bonds first, then double, then triple (default)
- Descending Order (3→2→1): Try triple bonds first, then double, then single
- Useful for demonstrating how the bond-order sequence affects the order in which structures are discovered
- Batch Generation: Generate all valid structures for the current formula
- Unique H-Distributions: Automatically enumerates all canonical hydrogen distributions
- SMILES Export: Downloads all structures as a zipped SMILES file
- Includes Duplicates: Intentionally preserves duplicates to demonstrate the need for canonicalization
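To illustrate why the exported duplicates call for canonicalization, here is a minimal Python sketch. It is not taken from WebLucy or SURGE: the function name `canonical_key` and the brute-force approach (trying every atom ordering and keeping the smallest encoding) are assumptions chosen for clarity, and the factorial cost makes it usable only for very small molecules.

```python
from itertools import permutations

def canonical_key(atoms, matrix):
    """Hypothetical helper: smallest encoding of the molecule over all atom orderings.

    atoms  -- list of (element, hydrogen_count) tuples, one per heavy atom
    matrix -- symmetric connection matrix of bond orders
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        labels = tuple(atoms[p] for p in perm)
        # Read the lower triangle of the permuted matrix row by row.
        bonds = tuple(matrix[perm[i]][perm[j]] for i in range(n) for j in range(i))
        key = (labels, bonds)
        if best is None or key < best:
            best = key
    return best

# Propane written with two different atom numberings.
atoms_a = [("C", 3), ("C", 2), ("C", 3)]       # middle carbon is atom 1
chain_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
atoms_b = [("C", 2), ("C", 3), ("C", 3)]       # middle carbon is atom 0
chain_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]

same = canonical_key(atoms_a, chain_a) == canonical_key(atoms_b, chain_b)
print(same)
```

The two matrices differ cell by cell, yet they map to the same key, which is how a canonicalizing generator would recognize them as one structure.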
- Enter a molecular formula: The app parses the formula and creates a connection matrix for heavy atoms (non-hydrogen atoms)
- Adjust hydrogen distribution: Edit how many hydrogens are attached to each atom. The sum must match the formula (validated when you click Forward or Auto Step).
- Select options: Choose auto-step speed and bond order sequence (ascending or descending)
- Explore bond configurations: Use Forward/Back buttons or Auto Step to traverse the depth-first search:
  - The algorithm traverses the lower triangle of the matrix column by column
  - At each position, it tries bond orders in the selected sequence
  - Valence constraints are checked at each step
  - When the matrix is complete, connectivity is verified
- View valid structures: When a valid, connected structure is found, it's displayed as a 2D diagram with a structure count
- Silent Generation: Or click "Silent Generation" to enumerate all valid structures and download them as SMILES
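The first two steps above (parsing the formula and checking the hydrogen distribution) can be sketched in Python as follows. This is an illustration under assumptions, not WebLucy's actual code: the function names `parse_formula`, `heavy_atoms`, and `hydrogens_match` are hypothetical, and only flat formulas such as C2H6O (no brackets or charges) are handled.

```python
import re
from collections import Counter

TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def parse_formula(formula):
    """Parse a flat molecular formula such as 'C2H6O' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = Counter()
    for element, number in TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts

def heavy_atoms(counts):
    """Expand the counts into one entry per non-hydrogen atom."""
    return [el for el, n in counts.items() if el != "H" for _ in range(n)]

def hydrogens_match(counts, distribution):
    """Check that the per-atom hydrogen counts add up to the formula's H count."""
    return sum(distribution) == counts.get("H", 0)

counts = parse_formula("C2H6O")
atoms = heavy_atoms(counts)                  # one matrix row/column per entry
print(atoms, hydrogens_match(counts, [3, 2, 1]))
```

The length of `atoms` gives the size N of the connection matrix, and a distribution such as `[3, 2, 1]` (the ethanol pattern for C2H6O) passes the sum check, whereas `[3, 3, 1]` would not.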
Simply open index.html in a web browser:

    open index.html

Or serve via a local HTTP server:

    python3 -m http.server 8000

Then open http://localhost:8000
- Kekule.js - For molecule representation and SMILES generation
- JSZip - For creating downloadable zip files
- Cheminformatics Microservice [2] - For 2D structure depiction
The bond insertion algorithm performs a depth-first traversal of possible bond configurations.

    FUNCTION GenerateStructures(matrix, position):
        IF position is beyond last cell THEN
            IF all valences satisfied AND molecule is connected THEN
                OUTPUT valid structure
            END IF
            RETURN
        END IF
        FOR each bond_order IN sequence (1,2,3,0) or (3,2,1,0):
            IF bond_order respects valence constraints THEN
                SET matrix[position] = bond_order
                GenerateStructures(matrix, next_position)  // Recurse
                SET matrix[position] = 0  // Backtrack
            END IF
        END FOR
    END FUNCTION

    FUNCTION IsConnected(matrix):
        visited = {atom_1}
        stack = [atom_1]
        WHILE stack not empty:
            atom = stack.pop()
            FOR each neighbor bonded to atom:
                IF neighbor not in visited THEN
                    visited.add(neighbor)
                    stack.push(neighbor)
                END IF
            END FOR
        END WHILE
        RETURN |visited| == N
    END FUNCTION

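The pseudocode above can be turned into a small runnable Python sketch. It follows the same depth-first scheme but is not WebLucy's JavaScript implementation: the `VALENCE` table, the function name `generate_structures`, and the bookkeeping of remaining free valences are assumptions made for this example.

```python
# Assumed standard valences; extend as needed for other elements.
VALENCE = {"C": 4, "N": 3, "O": 2}

def generate_structures(elements, hydrogens, bond_orders=(1, 2, 3, 0)):
    """Return every connected, valence-complete connection matrix (duplicates included)."""
    n = len(elements)
    # Lower-triangle cells (row > col), visited column by column.
    cells = [(r, c) for c in range(n) for r in range(c + 1, n)]
    # Valences still open after attaching the given hydrogens.
    free = [VALENCE[e] - h for e, h in zip(elements, hydrogens)]
    matrix = [[0] * n for _ in range(n)]
    results = []

    def is_connected():
        visited, stack = {0}, [0]
        while stack:
            atom = stack.pop()
            for nb in range(n):
                if matrix[atom][nb] > 0 and nb not in visited:
                    visited.add(nb)
                    stack.append(nb)
        return len(visited) == n

    def recurse(idx):
        if idx == len(cells):
            if all(f == 0 for f in free) and is_connected():
                results.append([row[:] for row in matrix])
            return
        r, c = cells[idx]
        for order in bond_orders:
            if order <= free[r] and order <= free[c]:   # valence check
                matrix[r][c] = matrix[c][r] = order
                free[r] -= order
                free[c] -= order
                recurse(idx + 1)
                free[r] += order                        # backtrack
                free[c] += order
                matrix[r][c] = matrix[c][r] = 0

    recurse(0)
    return results

# C4H10 with hydrogens distributed as CH3, CH3, CH2, CH2.
found = generate_structures(["C", "C", "C", "C"], [3, 3, 2, 2])
print(len(found))
```

For this input the search finds two matrices, both of which describe n-butane with the two CH2 atoms attached to different CH3 atoms. A third valence-complete assignment (a CH3-CH3 single bond plus a CH2=CH2 double bond) is rejected by the connectivity check.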
The algorithm fills the lower triangle of the N×N connection matrix column by column:
Position sequence: (2,1) → (3,1) → ... → (N,1) → (3,2) → (4,2) → ... → (N,N-1)
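The position sequence above can be generated with a short Python helper. The function name `lower_triangle_positions` is hypothetical; the indices are 1-based to match the notation used here.

```python
def lower_triangle_positions(n):
    """1-based (row, col) cells below the diagonal, listed column by column."""
    return [(row, col) for col in range(1, n) for row in range(col + 1, n + 1)]

positions = lower_triangle_positions(4)
print(positions)
```

For N heavy atoms the list contains N(N-1)/2 cells, which is the depth of the search tree explored by the bond insertion algorithm.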
MIT License
Christoph Steinbeck
[1] Steinbeck C. LUCY—A Program for Structure Elucidation from NMR Correlation Experiments. Angew. Chem. Int. Ed. Engl. 1996, 35(17), 1984-1986. DOI: 10.1002/anie.199619841
[2] Rajan K, Chandrasekhar V, Sharma N, Kanakam SRS, Steinbeck C. Cheminformatics Microservice: unifying access to open cheminformatics toolkits. J Cheminform 2023, 15, 107. DOI: 10.1186/s13321-023-00762-4
- Kekule.js by Partridge Jiang
- JSZip by Stuart Knightley
- Cheminformatics Microservice (Web App) by the Steinbeck Lab