
Skip range_extension_c for pure nodes in Mondrian classifier #1841

Merged
MaxHalford merged 1 commit into main from mondrian-skip-range-ext-pure
Apr 28, 2026

Conversation

@MaxHalford
Member

Summary

  • When split_pure=False (default), moves the purity check before the range_extension_c call in the classifier's downward pass. Pure nodes never split, so the range extension computation can be skipped entirely.
  • Credit to @shi-zq for the observation.
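The reordering described above can be sketched with a toy model. This is not river's code: `ToyLeaf`, `is_pure_for`, `may_split`, and the pure-Python `range_extension` are hypothetical stand-ins, and the split decision is reduced to "is there any extension at all". The only point is that the purity check runs first, so the costlier call is never reached for a pure node.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyLeaf:
    """Hypothetical stand-in for a Mondrian tree node (not river's class)."""

    mins: list[float]
    maxs: list[float]
    counts: dict[str, int] = field(default_factory=dict)

    def is_pure_for(self, y: str) -> bool:
        # Pure with respect to y: no sample of any other class was seen here.
        return all(n == 0 for label, n in self.counts.items() if label != y)


# Call counter, used only to show how often the expensive step runs.
calls = {"range_extension": 0}


def range_extension(node: ToyLeaf, x: list[float]) -> float:
    """Assumed stand-in for range_extension_c: total distance of x outside the node's box."""
    calls["range_extension"] += 1
    return sum(
        max(xi - hi, 0.0) + max(lo - xi, 0.0)
        for xi, lo, hi in zip(x, node.mins, node.maxs)
    )


def may_split(node: ToyLeaf, x: list[float], y: str, split_pure: bool = False) -> bool:
    # Purity check first: with split_pure=False a pure node never splits,
    # so the range extension does not need to be computed.
    if not split_pure and node.is_pure_for(y):
        return False
    # Simplification: a real Mondrian tree samples a split time from the extension.
    return range_extension(node, x) > 0.0
```

With `split_pure=True` the early return is bypassed and the range extension is always computed, which matches the PR's statement that the shortcut applies only when `split_pure=False`.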

Benchmark results

range_extension_c cost scales linearly with feature count (~1μs at 20 features, ~10μs at 200).
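To illustrate why the cost grows with the number of features, here is a minimal pure-Python sketch of a range extension as it is usually defined for Mondrian trees: for each feature, how far the sample falls outside the node's current [min, max] range. That this matches what `range_extension_c` computes is an assumption; the function name and signature below are hypothetical.

```python
def range_extension(x, mins, maxs):
    """Return (total extension, per-feature extensions) of sample x.

    One pass over the features, so the work is O(number of features).
    """
    extensions = [
        max(xi - hi, 0.0) + max(lo - xi, 0.0)  # distance above max or below min
        for xi, lo, hi in zip(x, mins, maxs)
    ]
    return sum(extensions), extensions
```

For a node covering the unit cube in three dimensions, `range_extension([2.0, 0.5, -1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])` gives a total of 2.0: the first and third features each lie 1.0 outside the box, and the second lies inside it.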

Pure nodes make up ~50% of all nodes, and all of them are leaves, since branches are always impure. The optimization saves one range_extension_c call per learn_one, at the pure leaf where the downward pass ends.

| Config | Baseline | Optimized | Delta |
|---|---|---|---|
| 20 features, 3 classes | 0.1781s | 0.1774s | -0.4% |
| 50 features, 3 classes | 0.4067s | 0.3876s | -4.7% |
| 100 features, 3 classes | 0.6655s | 0.6425s | -3.5% |
| 200 features, 3 classes | 1.0568s | 1.0203s | -3.5% |
| 50 features, 2 classes | 0.4338s | 0.4128s | -4.8% |
| 50 features, 5 classes | 0.4071s | 0.3891s | -4.4% |
| 50 features, 10 classes | 0.3868s | 0.3692s | -4.6% |

(5000 samples, best of 5 runs, split_pure=False, seed=42)
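The benchmark script itself is not included in this PR. As a rough sketch of how a "best of 5 runs" timing and the Delta column could be computed, assuming Delta is the relative change from baseline to optimized (the helper names `best_of` and `delta_pct` are hypothetical):

```python
import time


def best_of(fn, repeats=5):
    """Run fn `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def delta_pct(baseline, optimized):
    """Relative change of optimized versus baseline, in percent (negative means faster)."""
    return 100.0 * (optimized - baseline) / baseline
```

Applied to the "50 features, 3 classes" row, `delta_pct(0.4067, 0.3876)` rounds to -4.7, which agrees with the reported figure.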

Test plan

  • pytest river/tree/mondrian/ passes (including doctests)
  • CI green

🤖 Generated with Claude Code

When split_pure=False (default), check node purity before computing
range extensions. Pure nodes will never split, so the expensive
range_extension_c call can be skipped entirely. Benchmarks show ~3-5%
speedup on datasets with 50+ features.

Credit: @shi-zq for the observation in #1835.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@MaxHalford MaxHalford requested a review from smastelini as a code owner April 28, 2026 13:55
@MaxHalford MaxHalford mentioned this pull request Apr 28, 2026
@MaxHalford MaxHalford merged commit 87aa067 into main Apr 28, 2026
1 check passed
@MaxHalford MaxHalford deleted the mondrian-skip-range-ext-pure branch April 28, 2026 14:58