[EfficientLoFTR] LRU Cached embedding computation at inference #40329
qubvel merged 6 commits into huggingface:main
Conversation
qubvel left a comment
Thanks, just a question, otherwise looks good 👍
```python
i_indices = torch.ones(embed_height, embed_width).cumsum(0).float().unsqueeze(-1)
j_indices = torch.ones(embed_height, embed_width).cumsum(1).float().unsqueeze(-1)
```

`device` / `dtype` is also needed, no?
`compute_embeddings` is only used inside the `torch.autocast` context manager, so I assume we don't need it?
```python
emb = torch.zeros(1, embed_height, embed_width, hidden_size // 2)
```
@qubvel gentle bump! I answered your comment
Thanks for the ping! Missed it. I still think we should pass `device` / `dtype`.
Addressed in 3fa6b94
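For reference, a minimal sketch of the kind of fix being discussed (the function name and exact signature are assumptions, not the contents of 3fa6b94). The point is that tensor factories like `torch.ones` ignore the surrounding `torch.autocast` context for device placement, so `device` / `dtype` are forwarded explicitly, and since both are hashable they can be part of the cache key:

```python
from functools import lru_cache

import torch


@lru_cache(maxsize=32)
def compute_embeddings(embed_height, embed_width, hidden_size, device, dtype):
    # Sketch only, not the actual commit: torch.ones / torch.zeros default to
    # CPU float32 even inside torch.autocast, so device / dtype must be passed
    # explicitly. torch.device and torch.dtype are hashable, so they key the cache.
    i_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(0).unsqueeze(-1)
    j_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(1).unsqueeze(-1)
    emb = torch.zeros(1, embed_height, embed_width, hidden_size // 2, device=device, dtype=dtype)
    return i_indices, j_indices, emb
```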
run-slow: efficientloftr
This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']
You can rerun the slow tests, it should be fixed. I forgot the features were aggregated before the attention, so the embeddings need to be of the correct size.
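To illustrate the size issue (shapes and the pooling op here are toy assumptions, not the model's actual aggregation): when features are spatially aggregated before attention, the rotary embeddings have to be computed for the aggregated grid.

```python
import torch
import torch.nn.functional as F

# Toy example: a coarse feature map aggregated with a stride-4 kernel
# before attention (kernel / stride values assumed for illustration).
coarse_features = torch.randn(1, 128, 60, 80)
aggregated = F.max_pool2d(coarse_features, kernel_size=4, stride=4)
print(aggregated.shape)  # torch.Size([1, 128, 15, 20])
# The embeddings must match the 15x20 aggregated grid, not the 60x80 one.
```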
run-slow: efficientloftr
This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']
[For maintainers] Suggested jobs to run (before merge): run-slow: efficientloftr
```diff
 if self.hidden_size != self.out_features[-1]:
     raise ValueError(
-        f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.stage_out_channels}"
+        f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.out_features[-1]}"
     )
```
```diff
 out_features: list[int] = [32, 32, 128],
 stage_stride: list[int] = [2, 1, 2],
 q_aggregation_kernel_size: int = 1,
 kv_aggregation_kernel_size: int = 1,
 q_aggregation_stride: int = 1,
 kv_aggregation_stride: int = 1,
 num_attention_layers: int = 2,
 num_attention_heads: int = 8,
-hidden_size: int = 64,
+hidden_size: int = 128,
```
That's for the FA2 tests to pass.

How didn't I catch this one before? 🤔 But thanks for taking care of it!
That's skipped locally in case you don't have FA2 installed 😄
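One plausible reading of the FA2 remark (an assumption on my part; the thread doesn't elaborate) is that the new default changes the per-head dimension:

```python
# With the num_attention_heads = 8 default from the diff above:
num_attention_heads = 8
for hidden_size in (64, 128):
    print(f"hidden_size={hidden_size} -> head_dim={hidden_size // num_attention_heads}")
# hidden_size=64 -> head_dim=8
# hidden_size=128 -> head_dim=16
```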
run-slow: efficientloftr
This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']
What does this PR do?
Reverted EfficientLoFTR RoPE computation back to inference time to allow usage with different image sizes.
Decorated it with an LRU cache, similarly to the DinoV3 implementation.
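As a rough illustration of the intended behavior, continuing the hypothetical `compute_embeddings` sketch from earlier in the thread: repeated inference at the same resolution reuses the cached embeddings, while a new resolution triggers one fresh computation.

```python
a = compute_embeddings(60, 80, 128, torch.device("cpu"), torch.float32)
b = compute_embeddings(60, 80, 128, torch.device("cpu"), torch.float32)  # cache hit
c = compute_embeddings(48, 64, 128, torch.device("cpu"), torch.float32)  # new size, recomputed
assert a[0] is b[0]  # the exact same cached tensors are returned
print(compute_embeddings.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)
```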
Who can review?
@qubvel