Refactor phi doc #37583
Conversation
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the "Ready for review" button (at the bottom of the PR page).
> - This model is quite similar to `Llama` with the main difference in `PhiDecoderLayer`, where they used `PhiAttention` and `PhiMLP` layers in parallel configuration.
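For context, "parallel configuration" means the attention and MLP branches read the same layer-normed input and share a single residual connection, instead of running one after the other as in `Llama`. A simplified sketch, illustrative only and not the actual `PhiDecoderLayer` code:

```python
import torch
from torch import nn

class ParallelBlockSketch(nn.Module):
    """Simplified parallel attention/MLP block; not the real PhiDecoderLayer."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.ln = nn.LayerNorm(hidden_size)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size),
            nn.GELU(),
            nn.Linear(4 * hidden_size, hidden_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.ln(x)                    # one shared layer norm
        attn_out, _ = self.attn(h, h, h)  # attention branch
        mlp_out = self.mlp(h)             # MLP branch, computed in parallel
        return x + attn_out + mlp_out     # both branches share the residual
```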
> ## PhiConfig
>
> - The tokenizer used for this model is identical to the [CodeGenTokenizer](https://huggingface.co/docs/transformers/v4.51.3/en/model_doc/codegen#transformers.CodeGenTokenizer).
I think we can remove these notes since they aren't directly related to usage and not that helpful. But we should add a note about using the model on an older version of Transformers:

> If you're using Transformers < 4.37.0.dev, set `trust_remote_code=True` in [`~AutoModel.from_pretrained`]. Otherwise, make sure you update Transformers to the latest stable version.

It would also be good to have a code snippet demonstrating it.
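Something along these lines, for example (a minimal sketch; the checkpoint name is only an assumed illustration):

```python
from transformers import AutoModelForCausalLM

# On Transformers < 4.37.0.dev, Phi's modeling code is loaded from the Hub,
# so remote code has to be explicitly trusted.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-1_5",  # assumed example checkpoint
    trust_remote_code=True,
)
```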
> ## PhiConfig
Don't need to make any changes to the methods documented here
Thanks for your feedback, I will work on it.
stevhliu
left a comment
Thanks again, just a few more changes and then we can merge!
> [[autodoc]] PhiModel
> - forward
> [[autodoc]] PhiModel - forward
You can revert the changes to the [[autodoc]] here; they don't need to be reformatted.
I pushed a new update. I'm not sure if I understood your last comment about the [[autodoc]] correctly, but I reverted the changes anyway. Thanks for your guidance! This will be my first contribution to an open source project :)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
stevhliu
left a comment
Nice, and thanks for choosing Transformers as your first project to contribute to! 🤗
* Added documentation for phi model
* Update phi.md
* Update phi.md
* Update phi.md
* Update docs/source/en/model_doc/phi.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/phi.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/phi.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/phi.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated model card
* Update phi.md
* Update phi.md
* Update phi.md
* Update docs/source/en/model_doc/phi.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Jihad <jihadhammoud_@hotmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Refactored the phi.md model card to follow the Hugging Face documentation conventions for model docs.

* Reorganized the structure for consistency with other model docs (title, model overview, usage, and notes).
* Included usage examples for:
  * `pipeline` (text generation); see the sketch after this list
  * `AutoModel` (text generation with `AutoModelForCausalLM`)
  * `transformers-cli` (text classification, to diversify usage)
* Added general notes about the architecture and tokenizer.
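For reference, a minimal sketch of the kind of `pipeline` example that was added (the checkpoint and prompt are assumed illustrations, not necessarily what the card uses):

```python
from transformers import pipeline

# Text generation with a Phi checkpoint; "microsoft/phi-1_5" is an assumed example
generator = pipeline("text-generation", model="microsoft/phi-1_5")
result = generator("def fibonacci(n):", max_new_tokens=64)
print(result[0]["generated_text"])
```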
What was left out:

The original documentation included:

* Usage instructions for Phi-2
* Flash Attention 2 integration tips (a sketch of what that could look like is at the end of this comment)
* A performance graph
I didn’t include those in this version because there wasn’t a clear conventional place for them in model docs. I focused instead on making the file compliant with existing documentation structure.
If you'd like me to include the extra content from the original docs (e.g., FlashAttention setup, performance benchmarks, or Phi-2 usage), I’d be happy to revisit and integrate it in a way that fits well with the current doc style.
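For instance, the Flash Attention 2 tip could be a minimal sketch like this (the checkpoint name is an assumed example; it needs the `flash-attn` package and a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM

# Flash Attention 2 requires fp16/bf16 weights and a CUDA device
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",  # assumed example checkpoint
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")
```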