Add option to convert PyTriton response to OpenAI format#9726

Merged
oyilmaz-nvidia merged 3 commits intomainfrom
athitten/openai_format_response
Jul 18, 2024

@athitten
Collaborator

@athitten athitten commented Jul 14, 2024

What does this PR do ?

  1. Adds an option to convert responses from the PyTriton server to an OpenAI-compatible format.
  2. Also adds triton_request_timeout and openai_format_response as user-defined args, which are written to a JSON file along with the Triton IP and Triton port so that rest_model_api.py can read them. (Deletes the existing nemo/deploy/service/config.json in favor of these changes.)

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

athitten added 2 commits July 16, 2024 16:29
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
@athitten athitten force-pushed the athitten/openai_format_response branch from fcff757 to 43db3ae Compare July 16, 2024 23:29
Signed-off-by: athitten <athitten@users.noreply.github.com>

@app.get("/triton_health")
async def check_triton_health():
"""
Collaborator Author

Added a test method, check_triton_health, so that users can verify the PyTriton server is up and running. When using the REST service we often hit an error because the PyTriton server is not accessible; this method adds a check for that.
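
A minimal sketch of the kind of readiness check described above, not the code added in this PR. It assumes the Triton HTTP endpoint exposes the standard `/v2/health/ready` route; the helper name `triton_is_ready` and the default host and port are hypothetical.

```python
import urllib.error
import urllib.request


def triton_is_ready(host: str = "localhost", port: int = 8000, timeout: float = 5.0) -> bool:
    """Return True if the Triton HTTP endpoint answers the readiness probe with 200."""
    # Assumption: the server speaks the standard Triton HTTP/REST health protocol.
    url = f"http://{host}:{port}/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

A route such as `/triton_health` could call a helper like this and report the result, so that a caller can tell an unreachable PyTriton server apart from a failing request.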

)
parser.add_argument(
    "-trt", "--triton_request_timeout", default=60, type=int, help="Timeout in seconds for Triton server"
)
Collaborator Author

Moved the triton_request_timeout arg from nemo/deploy/service/config.json to the argparser of deploy_triton.py so that users can pass it. Also added openai_format_response as a user-defined option when running deploy_triton.py with the REST service enabled.
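
As an illustration of the two arguments discussed here, a self-contained sketch follows. The `--triton_request_timeout` definition mirrors the diff in this thread; the shape of `--openai_format_response` (a boolean switch with no short flag) is an assumption, since its definition is not shown here, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative subset of deployment arguments")
    # Mirrors the argument shown in the diff for this PR.
    parser.add_argument(
        "-trt", "--triton_request_timeout", default=60, type=int, help="Timeout in seconds for Triton server"
    )
    # Assumption: modeled as a boolean switch; the real flag's type and default are not shown here.
    parser.add_argument(
        "--openai_format_response",
        action="store_true",
        help="Return REST responses in an OpenAI-compatible format",
    )
    return parser


args = build_parser().parse_args(["--triton_request_timeout", "120", "--openai_format_response"])
print(args.triton_request_timeout, args.openai_format_response)  # prints: 120 True
```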

Collaborator Author

Also deleted the existing config.json. The Triton IP, Triton port, and Triton timeout args are now passed by the user when running deploy_triton.py, which stores them in a JSON file; rest_model_api.py then reads them from that stored config.json file.
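
The write-then-read flow described above can be sketched roughly as follows. The commit list for this PR mentions a `store_args_to_json` method, but its real signature is not shown here, so the parameters, the JSON key names, and the `load_config` helper below are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def store_args_to_json(path, triton_http_address, triton_port, triton_request_timeout, openai_format_response):
    """Persist the user-supplied settings so the REST service can read them later."""
    # Assumption: key names are illustrative, not the ones used by NeMo.
    config = {
        "triton_service_ip": triton_http_address,
        "triton_service_port": triton_port,
        "triton_request_timeout": triton_request_timeout,
        "openai_format_response": openai_format_response,
    }
    Path(path).write_text(json.dumps(config, indent=2), encoding="utf-8")


def load_config(path):
    """Read the stored settings back, as the REST service would at startup."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    store_args_to_json(config_path, "0.0.0.0", 8000, 60, True)
    settings = load_config(config_path)
    print(settings["triton_request_timeout"], settings["openai_format_response"])  # prints: 60 True
```

Generating the file at deploy time keeps the REST service in sync with whatever the user passed on the command line, instead of relying on a file that has to be edited by hand.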

@athitten athitten marked this pull request as ready for review July 16, 2024 23:49
@athitten athitten changed the title Option to convert response to OpenAI format Add option to convert PyTriton response to OpenAI format Jul 16, 2024
@athitten athitten requested a review from oyilmaz-nvidia July 17, 2024 00:00

sentences = np.char.decode(output.astype("bytes"), "utf-8")
return sentences
if openai_format_response:
Collaborator

Do you think we can do this OpenAI response conversion here: https://github.com/NVIDIA/NeMo/blob/athitten/openai_format_response/nemo/deploy/service/rest_model_api.py#L101 ?

Also, is there any advantage to supporting the OpenAI response type at the class level? Maybe our default should be the OpenAI response format (still not sure, though).

Collaborator Author

@oyilmaz-nvidia yes, we can do the OpenAI response conversion in rest_model_api.py after the output is received here. Do you recommend that?

Also, regarding having the OpenAI format as the default: do any of the methods/functions in the deploy module, or any downstream tasks, rely on the output being returned directly (non-OpenAI format), i.e. the way it is now? I wasn't sure of this, so I added it as an option to avoid breaking anything.

Collaborator

This is an end-user API, and we're not sure who is using it or how. So adding the OpenAI format as an option makes sense. Let's leave it as it is.
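
For readers following this thread, here is a rough sketch of what an opt-in conversion could look like. It is not the PR's implementation: the helper names, the `model_name` parameter, and the placeholder `finish_reason` value are assumptions, and the payload only imitates the general shape of an OpenAI text completion response.

```python
import time
import uuid


def to_openai_completion(sentences, model_name):
    """Wrap a list of generated strings in an OpenAI-style text completion payload."""
    return {
        "id": f"cmpl-{uuid.uuid4().hex}",
        "object": "text_completion",
        "created": int(time.time()),
        "model": model_name,
        "choices": [
            # Assumption: "stop" is a placeholder; the real finish reason is not known here.
            {"text": text, "index": i, "logprobs": None, "finish_reason": "stop"}
            for i, text in enumerate(sentences)
        ],
    }


def format_output(sentences, model_name, openai_format_response=False):
    """Return the raw sentences by default; wrap them only when the option is set."""
    if openai_format_response:
        return to_openai_completion(sentences, model_name)
    return sentences
```

Keeping the raw output as the default means existing callers see no change, which matches the decision above to offer the OpenAI format as an option only.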

@athitten athitten changed the base branch from main to r2.0.0rc1 July 18, 2024 05:46
@athitten athitten changed the base branch from r2.0.0rc1 to main July 18, 2024 05:46
Collaborator

@oyilmaz-nvidia oyilmaz-nvidia left a comment

LGTM.

@oyilmaz-nvidia oyilmaz-nvidia merged commit 234ac8b into main Jul 18, 2024
@oyilmaz-nvidia oyilmaz-nvidia deleted the athitten/openai_format_response branch July 18, 2024 15:57
ertkonuk pushed a commit that referenced this pull request Jul 19, 2024
* Option to convert response to OpenAI format

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add OpenAI response arg and store_args_to_json method

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Apply isort and black reformatting

Signed-off-by: athitten <athitten@users.noreply.github.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: athitten <athitten@users.noreply.github.com>
Co-authored-by: athitten <athitten@users.noreply.github.com>
Signed-off-by: Tugrul Konuk <ertkonuk@gmail.com>
tonyjie pushed a commit to tonyjie/NeMo that referenced this pull request Jul 24, 2024
akoumpa pushed a commit that referenced this pull request Jul 25, 2024
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
monica-sekoyan pushed a commit that referenced this pull request Oct 14, 2024
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 5, 2024
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025