diff --git a/docs/source/en/model_doc/superpoint.md b/docs/source/en/model_doc/superpoint.md
index aa22d30961ad..31f40e5a374e 100644
--- a/docs/source/en/model_doc/superpoint.md
+++ b/docs/source/en/model_doc/superpoint.md
@@ -10,48 +10,35 @@ specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-
-->
-# SuperPoint
-
-
-

+
+
+

+
-## Overview
-
-The SuperPoint model was proposed
-in [SuperPoint: Self-Supervised Interest Point Detection and Description](https://huggingface.co/papers/1712.07629) by Daniel
-DeTone, Tomasz Malisiewicz and Andrew Rabinovich.
-
-This model is the result of a self-supervised training of a fully-convolutional network for interest point detection and
-description. The model is able to detect interest points that are repeatable under homographic transformations and
-provide a descriptor for each point. The use of the model in its own is limited, but it can be used as a feature
-extractor for other tasks such as homography estimation, image matching, etc.
-
-The abstract from the paper is the following:
+# SuperPoint
-*This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a
-large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our
-fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and
-associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multi-homography
-approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g.,
-synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able
-to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other
-traditional corner detector. The final system gives rise to state-of-the-art homography estimation results on HPatches
-when compared to LIFT, SIFT and ORB.*
+[SuperPoint](https://huggingface.co/papers/1712.07629) is the result of self-supervised training of a fully-convolutional network for interest point detection and description. The model is able to detect interest points that are repeatable under homographic transformations and provide a descriptor for each point. Usage on its own is limited, but it can be used as a feature extractor for other tasks such as homography estimation and image matching.

-
SuperPoint overview. Taken from the original paper.
+You can find all the original SuperPoint checkpoints under the [Magic Leap Community](https://huggingface.co/magic-leap-community) organization.
-## Usage tips
+> [!TIP]
+> This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
+>
+> Click on the SuperPoint models in the right sidebar for more examples of how to apply SuperPoint to different computer vision tasks.
-Here is a quick example of using the model to detect interest points in an image:
-```python
+
+The example below demonstrates how to detect interest points in an image with the [`SuperPointForKeypointDetection`] class.
+
+
+
+```py
from transformers import AutoImageProcessor, SuperPointForKeypointDetection
import torch
from PIL import Image
@@ -64,67 +51,76 @@ processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint"
model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")
inputs = processor(image, return_tensors="pt")
-outputs = model(**inputs)
-```
-
-The outputs contain the list of keypoint coordinates with their respective score and description (a 256-long vector).
-
-You can also feed multiple images to the model. Due to the nature of SuperPoint, to output a dynamic number of keypoints,
-you will need to use the mask attribute to retrieve the respective information :
-
-```python
-from transformers import AutoImageProcessor, SuperPointForKeypointDetection
-import torch
-from PIL import Image
-import requests
-
-url_image_1 = "http://images.cocodataset.org/val2017/000000039769.jpg"
-image_1 = Image.open(requests.get(url_image_1, stream=True).raw)
-url_image_2 = "http://images.cocodataset.org/test-stuff2017/000000000568.jpg"
-image_2 = Image.open(requests.get(url_image_2, stream=True).raw)
-
-images = [image_1, image_2]
-
-processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint")
-model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")
-
-inputs = processor(images, return_tensors="pt")
-outputs = model(**inputs)
-image_sizes = [(image.height, image.width) for image in images]
-outputs = processor.post_process_keypoint_detection(outputs, image_sizes)
-
-for output in outputs:
- for keypoints, scores, descriptors in zip(output["keypoints"], output["scores"], output["descriptors"]):
- print(f"Keypoints: {keypoints}")
- print(f"Scores: {scores}")
- print(f"Descriptors: {descriptors}")
-```
+with torch.no_grad():
+ outputs = model(**inputs)
-You can then print the keypoints on the image of your choice to visualize the result:
-```python
-import matplotlib.pyplot as plt
-
-plt.axis("off")
-plt.imshow(image_1)
-plt.scatter(
- outputs[0]["keypoints"][:, 0],
- outputs[0]["keypoints"][:, 1],
- c=outputs[0]["scores"] * 100,
- s=outputs[0]["scores"] * 50,
- alpha=0.8
-)
-plt.savefig(f"output_image.png")
+# Post-process to get keypoints, scores, and descriptors
+image_size = (image.height, image.width)
+processed_outputs = processor.post_process_keypoint_detection(outputs, [image_size])
```
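+
+`post_process_keypoint_detection` returns a list with one dictionary per image, containing the keypoint coordinates, their confidence scores, and their 256-dimensional descriptors. A quick way to inspect them:
+
+```py
+result = processed_outputs[0]
+print(f"Detected {result['keypoints'].shape[0]} keypoints")
+print(f"Keypoints shape: {result['keypoints'].shape}")      # (num_keypoints, 2), (x, y) pixel coordinates
+print(f"Descriptors shape: {result['descriptors'].shape}")  # (num_keypoints, 256)
+```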
-
-This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
-The original code can be found [here](https://github.com/magicleap/SuperPointPretrainedNetwork).
+
+
+
+## Notes
+
+- SuperPoint outputs a dynamic number of keypoints per image, which makes it suitable for tasks requiring variable-length feature representations.
+
+ ```py
+ from transformers import AutoImageProcessor, SuperPointForKeypointDetection
+ import torch
+ from PIL import Image
+ import requests
+ processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint")
+ model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")
+ url_image_1 = "http://images.cocodataset.org/val2017/000000039769.jpg"
+ image_1 = Image.open(requests.get(url_image_1, stream=True).raw)
+ url_image_2 = "http://images.cocodataset.org/test-stuff2017/000000000568.jpg"
+ image_2 = Image.open(requests.get(url_image_2, stream=True).raw)
+ images = [image_1, image_2]
+ inputs = processor(images, return_tensors="pt")
+ # Example of handling dynamic keypoint output
+ outputs = model(**inputs)
+ keypoints = outputs.keypoints # Padded (x, y) coordinates, same length for every image in the batch
+ scores = outputs.scores # Confidence scores for each keypoint
+ descriptors = outputs.descriptors # 256-dimensional descriptors
+ mask = outputs.mask # Value of 1 corresponds to a keypoint detection
+ ```
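+
+ Because the number of detected keypoints differs between images, the batched `keypoints`, `scores`, and `descriptors` tensors are padded and `mask` marks the valid entries. A minimal sketch of slicing out the valid detections for the first image:
+
+ ```py
+ # Keep only the entries the mask marks as valid for the first image
+ valid = outputs.mask[0].bool()
+ keypoints_0 = outputs.keypoints[0][valid]      # (num_valid_keypoints, 2)
+ scores_0 = outputs.scores[0][valid]            # (num_valid_keypoints,)
+ descriptors_0 = outputs.descriptors[0][valid]  # (num_valid_keypoints, 256)
+ ```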
+
+- The model provides both keypoint coordinates and their corresponding descriptors (256-dimensional vectors) in a single forward pass.
+- For batch processing with multiple images, the outputs are padded to the same number of keypoints per image, so use the `mask` attribute to retrieve the valid information for each image. You can also use the `post_process_keypoint_detection` method from `SuperPointImageProcessor` to retrieve the keypoints, scores, and descriptors for each image.
+
+ ```py
+ # Batch processing example
+ images = [image_1, image_2]  # reuse the images loaded in the example above
+ inputs = processor(images, return_tensors="pt")
+ outputs = model(**inputs)
+ image_sizes = [(img.height, img.width) for img in images]
+ processed_outputs = processor.post_process_keypoint_detection(outputs, image_sizes)
+ ```
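+
+ The per-image dictionaries returned by `post_process_keypoint_detection` can also feed downstream tasks such as image matching. Below is a minimal sketch (assuming the descriptors are L2-normalized, as in the original SuperPoint) that pairs keypoints between the two images with a mutual nearest-neighbor check:
+
+ ```py
+ import torch
+
+ desc_0 = processed_outputs[0]["descriptors"]  # (num_keypoints_0, 256)
+ desc_1 = processed_outputs[1]["descriptors"]  # (num_keypoints_1, 256)
+
+ # Similarity between every pair of descriptors
+ similarity = desc_0 @ desc_1.T
+ nn_0_to_1 = similarity.argmax(dim=1)  # best match in image 2 for each keypoint of image 1
+ nn_1_to_0 = similarity.argmax(dim=0)  # best match in image 1 for each keypoint of image 2
+
+ # Keep only mutual nearest neighbors
+ indices_0 = torch.arange(desc_0.shape[0])
+ mutual = nn_1_to_0[nn_0_to_1] == indices_0
+ matches = torch.stack([indices_0[mutual], nn_0_to_1[mutual]], dim=1)
+ print(f"Found {matches.shape[0]} mutual matches between the two images")
+ ```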
+
+- You can then plot the keypoints on the image of your choice to visualize the result:
+ ```py
+ import matplotlib.pyplot as plt
+ plt.axis("off")
+ plt.imshow(image_1)
+ plt.scatter(
+     processed_outputs[0]["keypoints"][:, 0],
+     processed_outputs[0]["keypoints"][:, 1],
+     c=processed_outputs[0]["scores"] * 100,
+     s=processed_outputs[0]["scores"] * 50,
+     alpha=0.8
+ )
+ plt.savefig("output_image.png")
+ ```
+
+
+

+
## Resources
-A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with SuperPoint. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
-
-- A notebook showcasing inference and visualization with SuperPoint can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/SuperPoint/Inference_with_SuperPoint_to_detect_interest_points_in_an_image.ipynb). 🌎
+- Refer to this [notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/SuperPoint/Inference_with_SuperPoint_to_detect_interest_points_in_an_image.ipynb) for an inference and visualization example.
## SuperPointConfig
@@ -137,8 +133,12 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
- preprocess
- post_process_keypoint_detection
+
+
## SuperPointForKeypointDetection
[[autodoc]] SuperPointForKeypointDetection
- forward
+
+