From 0f6a1b24321eb1c04157c0d779e86c281cd21117 Mon Sep 17 00:00:00 2001 From: Drew Ross Date: Sat, 14 Jun 2025 22:37:13 -0500 Subject: [PATCH 1/5] Update PEGASUS-X model card --- docs/source/en/model_doc/pegasus_x.md | 134 +++++++++++++++++++++----- 1 file changed, 108 insertions(+), 26 deletions(-) diff --git a/docs/source/en/model_doc/pegasus_x.md b/docs/source/en/model_doc/pegasus_x.md index 379e0362bb70..717151847f67 100644 --- a/docs/source/en/model_doc/pegasus_x.md +++ b/docs/source/en/model_doc/pegasus_x.md @@ -14,35 +14,117 @@ rendered properly in your Markdown viewer. --> -# PEGASUS-X - -
-PyTorch -FlashAttention +
+
+ PyTorch + TensorFlow + Flax + FlashAttention +
-## Overview - -The PEGASUS-X model was proposed in [Investigating Efficiently Extending Transformers for Long Input Summarization](https://huggingface.co/papers/2208.04347) by Jason Phang, Yao Zhao and Peter J. Liu. - -PEGASUS-X (PEGASUS eXtended) extends the PEGASUS models for long input summarization through additional long input pretraining and using staggered block-local attention with global tokens in the encoder. - -The abstract from the paper is the following: - -*While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs continues to be a significant challenge. One such task is long input summarization, where inputs are longer than the maximum input context of most pretrained models. Through an extensive set of experiments, we investigate what model architectural changes and pretraining paradigms can most efficiently adapt a pretrained Transformer for long input summarization. We find that a staggered, block-local Transformer with global encoder tokens strikes a good balance of performance and efficiency, and that an additional pretraining phase on long sequences meaningfully improves downstream summarization performance. Based on our findings, we introduce PEGASUS-X, an extension of the PEGASUS model with additional long input pretraining to handle inputs of up to 16K tokens. PEGASUS-X achieves strong performance on long input summarization tasks comparable with much larger models while adding few additional parameters and not requiring model parallelism to train.* - -This model was contributed by [zphang](https://huggingface.co/zphang). The original code can be found [here](https://github.com/google-research/pegasus). - -## Documentation resources - -- [Translation task guide](../tasks/translation) -- [Summarization task guide](../tasks/summarization) - - - -PEGASUS-X uses the same tokenizer as [PEGASUS](pegasus). 
+# PEGASUS-X - +[PEGASUS-X](https://huggingface.co/papers/2208.04347) is an encoder-decoder (sequence-to-sequence) transformer model for long-input summarization. It extends the [Pegasus](./pegasus) model with staggered block-local attention, global encoder tokens, and additional pretraining on long text sequences, enabling it to handle inputs of up to 16,000 tokens. PEGASUS-X matches the performance of much larger models while using fewer parameters. + +You can find all the original PEGASUS-X checkpoints under the [Google](https://huggingface.co/google?search_models=pegasus-x-) organization. + +> [!TIP] +> Click on the PEGASUS-X models in the right sidebar for more examples of how to apply PEGASUS-X to different language tasks. + +The example below demonstrates how to summarize text with [`Pipeline`], [`AutoModel`], and from the command line. + + + + +```py +import torch +from transformers import pipeline + +pipeline = pipeline( + task="summarization", + model="google/pegasus-x-large", + torch_dtype=torch.float32, + device=0 +) +pipeline("""Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. +Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems. +These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. 
The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure. +This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle.""") +``` + + + +```py +import torch +from transformers import AutoTokenizer, AutoModelForSeq2SeqLM + +tokenizer = AutoTokenizer.from_pretrained( + "google/pegasus-x-large" +) +model = AutoModelForSeq2SeqLM.from_pretrained( + "google/pegasus-x-large", + torch_dtype=torch.float32, + device_map="auto", +) + +input_text = """Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. +Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems. +These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure. 
+This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle.""" +input_ids = tokenizer(input_text, return_tensors="pt").to("cuda") + +output = model.generate(**input_ids, cache_implementation="static") +print(tokenizer.decode(output[0], skip_special_tokens=True)) +``` + + + +```bash +echo -e "Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems. These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure. This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle." | transformers-cli run --task summarization --model google/pegasus-x-large --device 0 +``` + + + +Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends. 
+
+The example below uses [bitsandbytes](../quantization/bitsandbytes) to quantize only the weights to 4-bit (NF4) precision.
+
+```py
+import torch
+from transformers import BitsAndBytesConfig, AutoModelForSeq2SeqLM, AutoTokenizer
+
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_compute_dtype=torch.bfloat16,
+    bnb_4bit_quant_type="nf4"
+)
+model = AutoModelForSeq2SeqLM.from_pretrained(
+    "google/pegasus-x-large",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    quantization_config=quantization_config
+)
+
+tokenizer = AutoTokenizer.from_pretrained(
+    "google/pegasus-x-large"
+)
+
+input_text = """Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet.
+Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems.
+These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.
+This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle.""" +input_ids = tokenizer(input_text, return_tensors="pt").to("cuda") + +output = model.generate(**input_ids) +print(tokenizer.decode(output[0], skip_special_tokens=True)) +``` + +## Notes + +- PEGASUS-X does not support FP16. +- PEGASUS-X uses the same [tokenizer](pegasus#transformers.PegasusTokenizer) as Pegasus. ## PegasusXConfig From e3a53dfae09d8dce5a852fa52c8e5d049487620e Mon Sep 17 00:00:00 2001 From: Drew Ross Date: Sat, 14 Jun 2025 23:00:55 -0500 Subject: [PATCH 2/5] Add cache_implementation argument in quantization code example --- docs/source/en/model_doc/pegasus_x.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/en/model_doc/pegasus_x.md b/docs/source/en/model_doc/pegasus_x.md index 717151847f67..fce92121183a 100644 --- a/docs/source/en/model_doc/pegasus_x.md +++ b/docs/source/en/model_doc/pegasus_x.md @@ -117,7 +117,7 @@ These ingredients are then transformed into glucose, a type of sugar that serves This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle.""" input_ids = tokenizer(input_text, return_tensors="pt").to("cuda") -output = model.generate(**input_ids) +output = model.generate(**input_ids, cache_implementation="static") print(tokenizer.decode(output[0], skip_special_tokens=True)) ``` From 8eed7590c3ca9ff9ea813b62ada3f6dbb4f4f77a Mon Sep 17 00:00:00 2001 From: Drew Ross Date: Sun, 22 Jun 2025 18:15:24 -0500 Subject: [PATCH 3/5] Update CLI example --- docs/source/en/model_doc/pegasus_x.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/en/model_doc/pegasus_x.md b/docs/source/en/model_doc/pegasus_x.md index fce92121183a..bf82b9499195 100644 --- a/docs/source/en/model_doc/pegasus_x.md +++ b/docs/source/en/model_doc/pegasus_x.md 
@@ -82,7 +82,7 @@ print(tokenizer.decode(output[0], skip_special_tokens=True)) ```bash -echo -e "Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems. These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure. This energy reserve allows them to grow, develop leaves, produce flowers, bear fruit, and carry out various physiological processes throughout their lifecycle." | transformers-cli run --task summarization --model google/pegasus-x-large --device 0 +echo -e "Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts." 
| transformers-cli run --task summarization --model google/pegasus-x-large --device 0 ``` From 4d51aa4ad3e3a0021456ec4d217cb4968d0a8ffd Mon Sep 17 00:00:00 2001 From: Drew Ross Date: Thu, 26 Jun 2025 14:33:10 -0500 Subject: [PATCH 4/5] Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --- docs/source/en/model_doc/pegasus_x.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/source/en/model_doc/pegasus_x.md b/docs/source/en/model_doc/pegasus_x.md index bf82b9499195..021763b0ba5d 100644 --- a/docs/source/en/model_doc/pegasus_x.md +++ b/docs/source/en/model_doc/pegasus_x.md @@ -28,9 +28,11 @@ rendered properly in your Markdown viewer. [PEGASUS-X](https://huggingface.co/papers/2208.04347) is an encoder-decoder (sequence-to-sequence) transformer model for long-input summarization. It extends the [Pegasus](./pegasus) model with staggered block-local attention, global encoder tokens, and additional pretraining on long text sequences, enabling it to handle inputs of up to 16,000 tokens. PEGASUS-X matches the performance of much larger models while using fewer parameters. -You can find all the original PEGASUS-X checkpoints under the [Google](https://huggingface.co/google?search_models=pegasus-x-) organization. +You can find all the original PEGASUS-X checkpoints under the [Google](https://huggingface.co/google/models?search=pegasus-x) organization. > [!TIP] +> This model was contributed by [zphang](https://huggingface.co/zphang). +> > Click on the PEGASUS-X models in the right sidebar for more examples of how to apply PEGASUS-X to different language tasks. The example below demonstrates how to summarize text with [`Pipeline`], [`AutoModel`], and from the command line. 
@@ -45,7 +47,7 @@ from transformers import pipeline pipeline = pipeline( task="summarization", model="google/pegasus-x-large", - torch_dtype=torch.float32, + torch_dtype=torch.bfloat16, device=0 ) pipeline("""Plants are among the most remarkable and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. @@ -65,7 +67,7 @@ tokenizer = AutoTokenizer.from_pretrained( ) model = AutoModelForSeq2SeqLM.from_pretrained( "google/pegasus-x-large", - torch_dtype=torch.float32, + torch_dtype=torch.bfloat16, device_map="auto", ) @@ -123,8 +125,7 @@ print(tokenizer.decode(output[0], skip_special_tokens=True)) ## Notes -- PEGASUS-X does not support FP16. -- PEGASUS-X uses the same [tokenizer](pegasus#transformers.PegasusTokenizer) as Pegasus. +- PEGASUS-X also uses the [`PegasusTokenizer`]. ## PegasusXConfig From 3561c65d4bdf9f0458b9bc37edd531dc724675cf Mon Sep 17 00:00:00 2001 From: Drew Ross Date: Thu, 26 Jun 2025 14:48:32 -0500 Subject: [PATCH 5/5] Remove TensorFlow and Flax badges --- docs/source/en/model_doc/pegasus_x.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/docs/source/en/model_doc/pegasus_x.md b/docs/source/en/model_doc/pegasus_x.md index 021763b0ba5d..d581b2e9a38d 100644 --- a/docs/source/en/model_doc/pegasus_x.md +++ b/docs/source/en/model_doc/pegasus_x.md @@ -17,9 +17,6 @@ rendered properly in your Markdown viewer.
PyTorch - TensorFlow - Flax FlashAttention