Add RandomTreesEmbedding feature #1839

Merged
MaxHalford merged 2 commits into online-ml:main from kulbachcedric:1386-random-trees-embeddings
Apr 28, 2026

@kulbachcedric
Contributor

Hi @MaxHalford,
If I understood RandomTreesEmbedding correctly, this PR should implement it.
The crucial lines are the following, where the transform_one method returns a dictionary indicating which leaf x ends up in for each tree.

features = {}
for tree_id, tree in enumerate(self.trees):
    leaf = tree if isinstance(tree, HSTLeaf) else tree.traverse(x)
    features[(tree_id, self._leaf_indices[tree_id][id(leaf)])] = 1.0
return features 

Could you please briefly check it?
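
For anyone reading along without the diff open, here is a minimal, self-contained sketch of the idea behind those lines. It is not the code from this PR: `Leaf`, `Branch`, `build_tree`, `collect_leaves` and `ToyRandomTreesEmbedding` are hypothetical stand-ins, and drawing split thresholds uniformly from [0, 1) is a simplifying assumption. Only the loop inside `transform_one` mirrors the snippet above.

```python
import random


class Leaf:
    """Terminal node of a toy tree (hypothetical stand-in for HSTLeaf)."""


class Branch:
    """Internal node that splits on one feature at a fixed threshold."""

    def __init__(self, feature, threshold, left, right):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right

    def traverse(self, x):
        # Walk down from this node until a leaf is reached.
        node = self
        while isinstance(node, Branch):
            node = node.left if x[node.feature] < node.threshold else node.right
        return node


def build_tree(features, depth, rng):
    # Assumption: thresholds are uniform in [0, 1), so inputs must be scaled.
    if depth == 0:
        return Leaf()
    return Branch(
        rng.choice(features),
        rng.random(),
        build_tree(features, depth - 1, rng),
        build_tree(features, depth - 1, rng),
    )


def collect_leaves(node):
    # Left-to-right list of the leaves below `node`.
    if isinstance(node, Leaf):
        return [node]
    return collect_leaves(node.left) + collect_leaves(node.right)


class ToyRandomTreesEmbedding:
    def __init__(self, features, n_trees=3, depth=2, seed=42):
        rng = random.Random(seed)
        self.trees = [build_tree(features, depth, rng) for _ in range(n_trees)]
        # One lookup table per tree: id(leaf object) -> stable leaf index.
        self._leaf_indices = [
            {id(leaf): i for i, leaf in enumerate(collect_leaves(tree))}
            for tree in self.trees
        ]

    def transform_one(self, x):
        # Same shape as the PR snippet: one active (tree, leaf) key per tree.
        features = {}
        for tree_id, tree in enumerate(self.trees):
            leaf = tree if isinstance(tree, Leaf) else tree.traverse(x)
            features[(tree_id, self._leaf_indices[tree_id][id(leaf)])] = 1.0
        return features
```

Calling `ToyRandomTreesEmbedding(["a", "b"]).transform_one({"a": 0.2, "b": 0.9})` returns a sparse one-hot dictionary with exactly one `(tree_id, leaf_index)` key per tree, each mapped to `1.0`.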

@kulbachcedric kulbachcedric linked an issue Apr 27, 2026 that may be closed by this pull request
Member

@MaxHalford MaxHalford left a comment

Nice! The one thing that's missing is that the implementation assumes features are scaled to the [0, 1] range. We need to document, in every place where this assumption is made, that the user should apply a MinMaxScaler beforehand. See what I mean?
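
To illustrate the scaling step being asked for, here is a rough pure-Python sketch of online min-max scaling. `ToyMinMaxScaler` is a hypothetical stand-in written for this example, not river's `preprocessing.MinMaxScaler`, though it follows the same `learn_one` / `transform_one` pattern. In practice the library's own scaler would be placed in front of the embedding.

```python
class ToyMinMaxScaler:
    """Tracks a running min and max per feature and rescales towards [0, 1]."""

    def __init__(self):
        self.min = {}
        self.max = {}

    def learn_one(self, x):
        # Update the running bounds with the new observation.
        for key, value in x.items():
            self.min[key] = min(value, self.min.get(key, value))
            self.max[key] = max(value, self.max.get(key, value))

    def transform_one(self, x):
        scaled = {}
        for key, value in x.items():
            lo = self.min.get(key, value)
            hi = self.max.get(key, value)
            # With no observed range yet, fall back to 0.0 to avoid dividing by zero.
            scaled[key] = (value - lo) / (hi - lo) if hi > lo else 0.0
        return scaled


scaler = ToyMinMaxScaler()
for x in ({"a": 10.0}, {"a": 30.0}):
    scaler.learn_one(x)

scaled = scaler.transform_one({"a": 20.0})  # halfway between the bounds seen so far
```

A value outside the range seen so far still maps outside [0, 1] in this sketch, which is one more reason to spell the assumption out in the docs.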

@MaxHalford MaxHalford mentioned this pull request Apr 28, 2026
@MaxHalford MaxHalford merged commit 49a7448 into online-ml:main Apr 28, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Random trees embeddings

2 participants