diff --git a/docs/source/en/model_doc/mt5.md b/docs/source/en/model_doc/mt5.md
index d6b9ef99cb66..2796c96eb8a0 100644
--- a/docs/source/en/model_doc/mt5.md
+++ b/docs/source/en/model_doc/mt5.md
@@ -14,54 +14,115 @@ rendered properly in your Markdown viewer.
-->
-# mT5
-
-
-

-

-

+
-## Overview
-
-The mT5 model was presented in [mT5: A massively multilingual pre-trained text-to-text transformer](https://huggingface.co/papers/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya
-Siddhant, Aditya Barua, Colin Raffel.
-
-The abstract from the paper is the following:
-
-*The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain
-state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a
-multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail
-the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual
-benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a
-generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model
-checkpoints used in this work are publicly available.*
-
-Note: mT5 was only pre-trained on [mC4](https://huggingface.co/datasets/mc4) excluding any supervised training.
-Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model.
-Since mT5 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task
-fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
-
-Google has released the following variants:
-
-- [google/mt5-small](https://huggingface.co/google/mt5-small)
-
-- [google/mt5-base](https://huggingface.co/google/mt5-base)
-
-- [google/mt5-large](https://huggingface.co/google/mt5-large)
-
-- [google/mt5-xl](https://huggingface.co/google/mt5-xl)
-
-- [google/mt5-xxl](https://huggingface.co/google/mt5-xxl).
-
-This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten). The original code can be
-found [here](https://github.com/google-research/multilingual-t5).
-
-## Resources
+# mT5
-- [Translation task guide](../tasks/translation)
-- [Summarization task guide](../tasks/summarization)
+[mT5](https://huggingface.co/papers/2010.11934) is a multilingual variant of [T5](./t5), pretrained on a Common Crawl-based dataset covering 101 languages. The paper also describes a simple technique to prevent "accidental translation" in the zero-shot setting, where the model (partially) translates its prediction into the wrong language.
+
+You can find all the original mT5 checkpoints under the [mT5](https://huggingface.co/collections/google/mt5-release-65005f1a520f8d7b4d039509) collection.
+
+> [!TIP]
+> This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten).
+>
+> Click on the mT5 models in the right sidebar for more examples of how to apply mT5 to different language tasks.
+
+The example below demonstrates how to summarize text with [`Pipeline`], [`AutoModel`], and from the command line.
+
+<hfoptions id="usage">
+<hfoption id="Pipeline">
+
+```python
+import torch
+from transformers import pipeline
+
+pipeline = pipeline(
+    task="text2text-generation",
+    model="csebuetnlp/mT5_multilingual_XLSum",
+    torch_dtype=torch.float16,
+    device=0
+)
+pipeline("""Plants are remarkable organisms that produce their own food using a method called photosynthesis.
+This process involves converting sunlight, carbon dioxide, and water into glucose, which provides energy for growth.
+Plants play a crucial role in sustaining life on Earth by generating oxygen and serving as the foundation of most ecosystems.""")
+```
+
+</hfoption>
+<hfoption id="AutoModel">
+
+```python
+import torch
+from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained(
+    "csebuetnlp/mT5_multilingual_XLSum"
+)
+model = AutoModelForSeq2SeqLM.from_pretrained(
+    "csebuetnlp/mT5_multilingual_XLSum",
+    torch_dtype=torch.float16,
+    device_map="auto",
+)
+
+input_text = """Plants are remarkable organisms that produce their own food using a method called photosynthesis.
+This process involves converting sunlight, carbon dioxide, and water into glucose, which provides energy for growth.
+Plants play a crucial role in sustaining life on Earth by generating oxygen and serving as the foundation of most ecosystems."""
+input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+output = model.generate(**input_ids, cache_implementation="static")
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```
+
+</hfoption>
+<hfoption id="transformers CLI">
+
+```bash
+echo -e "Plants are remarkable organisms that produce their own food using a method called photosynthesis." | transformers run --task text2text-generation --model csebuetnlp/mT5_multilingual_XLSum --device 0
+```
+
+</hfoption>
+</hfoptions>
+
+Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.
+
+The example below uses [bitsandbytes](../quantization/bitsandbytes) to quantize only the weights to int4.
+
+```python
+import torch
+from transformers import BitsAndBytesConfig, AutoModelForSeq2SeqLM, AutoTokenizer
+
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_compute_dtype=torch.bfloat16,
+    bnb_4bit_quant_type="nf4"
+)
+model = AutoModelForSeq2SeqLM.from_pretrained(
+    "csebuetnlp/mT5_multilingual_XLSum",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    quantization_config=quantization_config
+)
+
+tokenizer = AutoTokenizer.from_pretrained(
+    "csebuetnlp/mT5_multilingual_XLSum"
+)
+input_text = """Plants are remarkable organisms that produce their own food using a method called photosynthesis.
+This process involves converting sunlight, carbon dioxide, and water into glucose, which provides energy for growth.
+Plants play a crucial role in sustaining life on Earth by generating oxygen and serving as the foundation of most ecosystems."""
+input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+output = model.generate(**input_ids, cache_implementation="static")
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```
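+
+As a rough check on the memory savings, you can print the quantized model's footprint with the `get_memory_footprint` method of `PreTrainedModel`; the snippet below is only an illustrative sanity check.
+
+```python
+# reported size of the 4-bit model in GB (parameters plus buffers)
+print(f"{model.get_memory_footprint() / 1024**3:.2f} GB")
+```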
+
+## Notes
+
+- mT5 must be fine-tuned for downstream tasks because it was only pretrained on the [mC4](https://huggingface.co/datasets/mc4) dataset (see the sketch after these notes).
+- Since mT5 was pretrained without supervision, there is no real advantage to using a task prefix during single-task fine-tuning. Use a prefix if you are doing multi-task fine-tuning.
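+
+The snippet below is a minimal sketch of such fine-tuning with [`Seq2SeqTrainer`], assuming a toy in-memory English-to-German dataset; the data, model size, and hyperparameters are placeholders rather than a recommended recipe.
+
+```python
+from datasets import Dataset
+from transformers import (
+    AutoModelForSeq2SeqLM,
+    AutoTokenizer,
+    DataCollatorForSeq2Seq,
+    Seq2SeqTrainer,
+    Seq2SeqTrainingArguments,
+)
+
+tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
+model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
+
+# toy parallel corpus; swap in a real dataset for meaningful results
+raw = Dataset.from_dict({
+    "source": ["The weather is nice today.", "I like reading books."],
+    "target": ["Das Wetter ist heute schön.", "Ich lese gerne Bücher."],
+})
+
+def preprocess(batch):
+    # tokenize inputs and targets; labels are the target token ids
+    model_inputs = tokenizer(batch["source"], max_length=128, truncation=True)
+    labels = tokenizer(text_target=batch["target"], max_length=128, truncation=True)
+    model_inputs["labels"] = labels["input_ids"]
+    return model_inputs
+
+tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)
+
+trainer = Seq2SeqTrainer(
+    model=model,
+    args=Seq2SeqTrainingArguments(output_dir="mt5-finetuned", num_train_epochs=1),
+    train_dataset=tokenized,
+    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
+)
+trainer.train()
+```
+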
## MT5Config