⚡️ Speed up method DiGraph._condensation_lil by 80% #107

Open
codeflash-ai[bot] wants to merge 1 commit into main from
codeflash/optimize-DiGraph._condensation_lil-mkp362si

Conversation

codeflash-ai bot commented Jan 22, 2026

📄 80% (0.80x) speedup for DiGraph._condensation_lil in quantecon/_graph_tools.py

⏱️ Runtime: 9.55 milliseconds → 5.30 milliseconds (best of 110 runs)

📝 Explanation and details

The optimized code achieves an 80% speedup by replacing Python-level iteration with vectorized NumPy operations when processing sparse CSR matrices.

Key Optimizations:

  1. Vectorized CSR Matrix Traversal: The original _csr_matrix_indices() uses nested Python for loops to yield matrix indices one-by-one. The optimized version uses np.repeat() and np.diff() to directly extract all row/column index pairs in vectorized operations, eliminating Python loop overhead entirely.

  2. Batch Processing in _condensation_lil(): Instead of iterating through edges one-at-a-time and setting sparse matrix entries individually, the optimized version:

    • Extracts all edges at once using vectorized operations
    • Maps nodes to their SCC labels using NumPy fancy indexing
    • Filters edges within the same SCC using boolean masking
    • Uses np.unique() to deduplicate edges between SCCs in one operation
    • Sets all condensation edges at once via fancy indexing
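The vectorized traversal in step 1 can be sketched as follows. This is an illustrative reconstruction from the description above, not the exact code from the PR; the function name is hypothetical.

```python
import numpy as np
from scipy import sparse

def csr_indices_vectorized(S):
    """Return (rows, cols) arrays of the nonzero positions of a CSR matrix S,
    replacing the nested Python loops of the generator version."""
    counts = np.diff(S.indptr)                       # nonzeros per row
    rows = np.repeat(np.arange(S.shape[0]), counts)  # row index repeated per nonzero
    cols = S.indices                                 # CSR already stores column indices
    return rows, cols

A = sparse.csr_matrix(np.array([[0, 1, 0],
                                [2, 0, 3],
                                [0, 0, 0]]))
rows, cols = csr_indices_vectorized(A)
# nonzeros are at (0, 1), (1, 0), (1, 2), in row-major order
```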

Why This is Faster:

  • Eliminates Python Loop Overhead: Python loops are expensive compared to compiled NumPy operations. The original code calls Python's yield statement for each edge, while the optimized version processes all edges in bulk.
  • Memory Locality: Vectorized operations have better cache performance and allow NumPy's optimized C code to run without Python interpreter overhead.
  • Reduced Function Calls: Setting sparse matrix entries in bulk is much faster than individual assignments in a loop.
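Putting the steps together, the batch condensation described above can be sketched roughly as below. The function name and signature are hypothetical (the real method is `DiGraph._condensation_lil`, which computes the SCC labels itself); this sketch takes them as inputs to stay self-contained.

```python
import numpy as np
from scipy import sparse

def condensation_lil_sketch(S_csr, scc_labels, num_sccs):
    """Build a condensation digraph as a LIL matrix from a CSR adjacency
    matrix and an array mapping each node to its SCC label."""
    # Extract all edges at once (vectorized CSR traversal)
    counts = np.diff(S_csr.indptr)
    src = np.repeat(np.arange(S_csr.shape[0]), counts)
    dst = S_csr.indices
    # Map endpoints to SCC labels via fancy indexing
    src_scc = scc_labels[src]
    dst_scc = scc_labels[dst]
    # Drop edges inside a single SCC with a boolean mask
    mask = src_scc != dst_scc
    # Deduplicate edges between SCCs in one np.unique call
    edges = np.unique(np.stack([src_scc[mask], dst_scc[mask]], axis=1), axis=0)
    C = sparse.lil_matrix((num_sccs, num_sccs), dtype=bool)
    if edges.size:
        # Set all condensation edges at once via fancy indexing
        C[edges[:, 0], edges[:, 1]] = True
    return C

adj = sparse.csr_matrix(np.array([[0, 1, 0],
                                  [0, 0, 1],
                                  [0, 0, 0]]))
labels = np.array([0, 1, 2])   # chain: every node is its own SCC
C = condensation_lil_sketch(adj, labels, 3)
# C contains exactly the edges (0, 1) and (1, 2)
```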

Performance Characteristics:

The test results show the optimization is particularly effective for:

  • Large sparse graphs (500-1000 nodes): 137-281% speedup, as vectorization benefits scale with input size
  • Graphs with many edges between SCCs: Dense condensations benefit most from batch deduplication via np.unique()
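The scaling claim can be checked with a quick micro-benchmark comparing the loop-based and vectorized index extraction on a larger sparse matrix. Numbers vary by machine; this is illustrative only, and the function names are hypothetical.

```python
import timeit
import numpy as np
from scipy import sparse

def loop_indices(S):
    # Original style: nested Python loops over CSR structure
    out = []
    for i in range(S.shape[0]):
        for j in range(S.indptr[i], S.indptr[i + 1]):
            out.append((i, S.indices[j]))
    return out

def vectorized_indices(S):
    # Optimized style: np.repeat over np.diff of indptr
    rows = np.repeat(np.arange(S.shape[0]), np.diff(S.indptr))
    return rows, S.indices

S = sparse.random(1000, 1000, density=0.01, format='csr', random_state=0)
t_loop = timeit.timeit(lambda: loop_indices(S), number=20)
t_vec = timeit.timeit(lambda: vectorized_indices(S), number=20)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```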

For very small graphs (<10 nodes), the optimization shows 12-57% slowdown due to NumPy array creation overhead dominating the computation. However, since graph algorithms typically target larger datasets where this optimization shines, the trade-off is worthwhile for production workloads.

Impact on Workloads:

This optimization benefits any code analyzing graph structure properties (strongly connected components, condensation graphs) on moderate-to-large directed graphs, which is common in network analysis, Markov chain analysis, and related computational economics applications.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 46 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests
import numpy as np
# imports
import pytest  # used for our unit tests
from quantecon._graph_tools import DiGraph
from scipy import sparse

def _csr_matrix_indices(S):
    """
    Generate the indices of nonzero entries of a csr_matrix S

    """
    m, n = S.shape

    for i in range(m):
        for j in range(S.indptr[i], S.indptr[i+1]):
            row_index, col_index = i, S.indices[j]
            yield row_index, col_index

def test_invalid_non_square_input_raises_value_error():
    # Confirm __init__ enforces square adjacency matrix
    # Provide a 2x3 array which should raise ValueError
    adj = np.zeros((2, 3))
    with pytest.raises(ValueError):
        DiGraph(adj)

def test_csr_index_generator_matches_nonzero_positions():
    # Unit test for the helper _csr_matrix_indices to ensure it yields the
    # correct nonzero index pairs in row-major order for csr matrices.
    arr = np.array([[0, 1, 0],
                    [2, 0, 3],
                    [0, 0, 0]])
    S = sparse.csr_matrix(arr)
    # Collect indices from helper
    helper_indices = list(_csr_matrix_indices(S))
    # Collect indices using dense scan for a deterministic comparison
    dense_indices = [(i, j) for i in range(arr.shape[0]) for j in range(arr.shape[1]) if arr[i, j] != 0]
    assert helper_indices == dense_indices
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import numpy as np
import pytest
from quantecon._graph_tools import DiGraph
from scipy import sparse

def test_condensation_lil_basic_single_scc():
    """
    Test condensation digraph for a graph with a single strongly connected component.
    The condensation should have shape (1, 1) with no edges.
    """
    # Create a simple graph with 3 nodes forming a single SCC (triangle)
    adj_matrix = np.array([
        [0, 1, 0],
        [0, 0, 1],
        [1, 0, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 99.3μs -> 118μs (16.0% slower)

def test_condensation_lil_basic_two_sccs():
    """
    Test condensation digraph for a graph with two strongly connected components.
    This tests the basic edge case of having multiple SCCs with edges between them.
    """
    # Create a graph with 4 nodes and 2 SCCs
    # SCC 1: nodes 0, 1 (with cycle 0->1->0)
    # SCC 2: nodes 2, 3 (with cycle 2->3->2)
    # Edge from SCC 1 to SCC 2: 1->2
    adj_matrix = np.array([
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 111μs -> 260μs (57.2% slower)

def test_condensation_lil_basic_no_edges():
    """
    Test condensation digraph for a graph with no edges (each node is its own SCC).
    """
    # Create a graph with 3 isolated nodes
    adj_matrix = np.array([
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 85.1μs -> 102μs (17.2% slower)

def test_condensation_lil_basic_single_node():
    """
    Test condensation digraph for a graph with a single node.
    """
    # Create a graph with 1 node and no self-loop
    adj_matrix = np.array([[0]])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 87.2μs -> 103μs (16.1% slower)

def test_condensation_lil_basic_single_node_self_loop():
    """
    Test condensation digraph for a graph with a single node with a self-loop.
    """
    # Create a graph with 1 node and a self-loop
    adj_matrix = np.array([[1]])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 86.9μs -> 102μs (15.2% slower)

def test_condensation_lil_basic_linear_chain():
    """
    Test condensation digraph for a linear chain graph (1->2->3->4).
    Each node is its own SCC, and the condensation should be a linear chain.
    """
    # Create a linear chain: 0->1->2->3
    adj_matrix = np.array([
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 119μs -> 271μs (55.8% slower)

def test_condensation_lil_edge_complete_graph():
    """
    Test condensation digraph for a complete graph (all nodes connected to all others).
    In a complete directed graph, all nodes form a single SCC.
    """
    # Create a complete directed graph with 4 nodes
    adj_matrix = np.ones((4, 4))
    np.fill_diagonal(adj_matrix, 0)  # Remove self-loops for clarity
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 100.0μs -> 114μs (12.7% slower)

def test_condensation_lil_edge_two_node_cycle():
    """
    Test condensation digraph for a simple 2-node cycle.
    Nodes 0 and 1 form a single SCC.
    """
    # Create a 2-node cycle: 0->1->0
    adj_matrix = np.array([
        [0, 1],
        [1, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 95.9μs -> 114μs (16.1% slower)

def test_condensation_lil_edge_multiple_edges_between_sccs():
    """
    Test condensation digraph with multiple edges between the same pair of SCCs.
    The condensation should still have only one edge between those SCCs.
    """
    # Create a graph with multiple edges from one SCC to another
    # SCC 1: nodes 0, 1 (cycle)
    # SCC 2: nodes 2, 3 (cycle)
    # Multiple edges from SCC 1 to SCC 2
    adj_matrix = np.array([
        [0, 1, 0, 0],
        [1, 0, 1, 1],  # Multiple edges to SCC 2
        [0, 0, 0, 1],
        [0, 0, 1, 0]
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 116μs -> 257μs (54.8% slower)

def test_condensation_lil_edge_empty_graph():
    """
    Test condensation digraph for an empty graph (no edges).
    """
    # Create a graph with 5 nodes and no edges
    adj_matrix = np.zeros((5, 5))
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 84.6μs -> 102μs (17.3% slower)

def test_condensation_lil_edge_sparse_adjacency_matrix():
    """
    Test that _condensation_lil correctly handles sparse adjacency matrices.
    The function should work with the sparse representation internally.
    """
    # Create a sparse adjacency matrix explicitly
    adj_matrix = sparse.lil_matrix((5, 5))
    # Add edges: 0->1, 1->2, 2->3, 3->4
    adj_matrix[0, 1] = 1
    adj_matrix[1, 2] = 1
    adj_matrix[2, 3] = 1
    adj_matrix[3, 4] = 1
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 102μs -> 236μs (56.5% slower)

def test_condensation_lil_edge_weighted_graph():
    """
    Test that _condensation_lil works correctly with weighted graphs.
    The weights should not affect the structure of the SCCs.
    """
    # Create a weighted adjacency matrix
    adj_matrix = np.array([
        [0, 2.5, 0],
        [0, 0, 3.7],
        [1.2, 0, 0]
    ])
    graph = DiGraph(adj_matrix, weighted=True)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 48.3μs -> 67.2μs (28.0% slower)

def test_condensation_lil_edge_self_loops_ignored():
    """
    Test that self-loops are correctly ignored in the condensation digraph.
    Self-loops within an SCC should not create edges in the condensation.
    """
    # Create a graph with self-loops
    adj_matrix = np.array([
        [1, 1, 0],  # Node 0 has a self-loop and edge to node 1
        [1, 1, 1],  # Node 1 has a self-loop, edge to 0, and edge to 2
        [0, 0, 1]   # Node 2 has a self-loop
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 110μs -> 252μs (56.3% slower)

def test_condensation_lil_edge_complex_scc_structure():
    """
    Test condensation digraph with a more complex SCC structure.
    Multiple SCCs with various connections.
    """
    # Graph structure:
    # SCC 1: nodes 0, 1 (0->1->0)
    # SCC 2: nodes 2, 3, 4 (2->3->4->2)
    # SCC 3: node 5 (isolated)
    # Edges: SCC1->SCC2, SCC2->SCC3
    adj_matrix = np.array([
        [0, 1, 0, 0, 0, 0],  # 0 -> 1
        [1, 0, 1, 0, 0, 0],  # 1 -> 0, 1 -> 2
        [0, 0, 0, 1, 0, 0],  # 2 -> 3
        [0, 0, 0, 0, 1, 0],  # 3 -> 4
        [0, 0, 1, 0, 0, 1],  # 4 -> 2, 4 -> 5
        [0, 0, 0, 0, 0, 0]   # 5 (isolated)
    ])
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 116μs -> 266μs (56.5% slower)

def test_condensation_lil_edge_large_complete_subgraph():
    """
    Test condensation digraph with a large complete subgraph connected to isolated nodes.
    """
    # Create a 10-node complete digraph (all nodes form one SCC)
    n = 10
    adj_matrix = np.ones((n, n)) - np.eye(n)
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 125μs -> 113μs (10.5% faster)

def test_condensation_lil_large_scale_many_nodes():
    """
    Test condensation digraph with a large number of nodes (linear chain).
    Each node is its own SCC, creating a large condensation graph.
    """
    # Create a linear chain of 500 nodes
    n = 500
    adj_matrix = np.zeros((n, n))
    for i in range(n - 1):
        adj_matrix[i, i + 1] = 1
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 2.02ms -> 578μs (249% faster)

def test_condensation_lil_large_scale_many_sccs_few_edges():
    """
    Test condensation digraph with many SCCs and sparse connections between them.
    """
    # Create a graph with 200 SCCs (each of size 2) with sparse connections
    num_sccs = 200
    n = num_sccs * 2
    adj_matrix = np.zeros((n, n))
    
    # Create SCCs: each SCC i consists of nodes (2*i, 2*i+1) with edges 2*i -> 2*i+1 -> 2*i
    for i in range(num_sccs):
        node1 = 2 * i
        node2 = 2 * i + 1
        adj_matrix[node1, node2] = 1
        adj_matrix[node2, node1] = 1
    
    # Add sparse edges between SCCs (from SCC i to SCC i+1)
    for i in range(num_sccs - 1):
        adj_matrix[2 * i + 1, 2 * (i + 1)] = 1
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 1.10ms -> 399μs (174% faster)

def test_condensation_lil_large_scale_dense_connections():
    """
    Test condensation digraph with many SCCs and dense connections between them.
    """
    # Create a graph with 50 SCCs (each of size 3) with many edges between them
    num_sccs = 50
    scc_size = 3
    n = num_sccs * scc_size
    adj_matrix = np.zeros((n, n))
    
    # Create SCCs: each SCC forms a complete digraph
    for scc_idx in range(num_sccs):
        for i in range(scc_size):
            for j in range(scc_size):
                if i != j:
                    node_i = scc_idx * scc_size + i
                    node_j = scc_idx * scc_size + j
                    adj_matrix[node_i, node_j] = 1
    
    # Add edges between SCCs: from each node in SCC i to nodes in SCC i+1
    for scc_idx in range(num_sccs - 1):
        for i in range(scc_size):
            for j in range(scc_size):
                node_from = scc_idx * scc_size + i
                node_to = (scc_idx + 1) * scc_size + j
                adj_matrix[node_from, node_to] = 1
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 1.73ms -> 454μs (281% faster)

def test_condensation_lil_large_scale_sparse_matrix_efficiency():
    """
    Test that _condensation_lil correctly handles large sparse matrices efficiently.
    """
    # Create a large sparse graph (1000 nodes) with very few edges
    n = 1000
    adj_matrix = sparse.lil_matrix((n, n), dtype=bool)
    
    # Add only 100 edges in a specific pattern
    for i in range(100):
        adj_matrix[i, i + 1] = True
        if i + 500 < n:
            adj_matrix[i + 500, i + 501] = True
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 1.22ms -> 516μs (137% faster)

def test_condensation_lil_large_scale_cyclic_structure():
    """
    Test condensation digraph with a large cyclic structure where many nodes form cycles.
    """
    # Create a graph with cycles of different lengths
    n = 400
    adj_matrix = np.zeros((n, n))
    
    # Create cycles of length 10, 20, 30, etc.
    cycle_start = 0
    cycle_length = 10
    while cycle_start + cycle_length <= n:
        for i in range(cycle_length - 1):
            adj_matrix[cycle_start + i, cycle_start + i + 1] = 1
        # Close the cycle
        adj_matrix[cycle_start + cycle_length - 1, cycle_start] = 1
        cycle_start += cycle_length
        cycle_length += 10
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 339μs -> 142μs (137% faster)

def test_condensation_lil_large_scale_result_type_and_dtype():
    """
    Test that _condensation_lil returns the correct matrix type and data type for large graphs.
    """
    # Create a moderately large graph
    n = 250
    adj_matrix = np.zeros((n, n))
    
    # Create a structure with multiple SCCs
    for i in range(0, n - 1, 2):
        adj_matrix[i, i + 1] = 1
        adj_matrix[i + 1, i] = 1
    
    # Add edges between SCC pairs
    for i in range(0, n - 2, 2):
        adj_matrix[i + 1, i + 2] = 1
    
    graph = DiGraph(adj_matrix)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 745μs -> 346μs (115% faster)

def test_condensation_lil_large_scale_with_node_labels():
    """
    Test that _condensation_lil works correctly with node labels in large graphs.
    """
    # Create a graph with node labels
    n = 200
    adj_matrix = np.zeros((n, n))
    
    # Create a simple structure
    for i in range(n - 1):
        adj_matrix[i, i + 1] = 1
    
    # Create node labels
    node_labels = np.array([f'node_{i}' for i in range(n)])
    
    graph = DiGraph(adj_matrix, node_labels=node_labels)
    
    # Get the condensation digraph
    codeflash_output = graph._condensation_lil(); condensation = codeflash_output # 903μs -> 379μs (138% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-DiGraph._condensation_lil-mkp362si` and push.

codeflash-ai bot requested a review from aseembits93 on Jan 22, 2026 at 06:44
codeflash-ai bot added labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) on Jan 22, 2026