Add option to convert PyTriton response to OpenAI format#9726
oyilmaz-nvidia merged 3 commits into main
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
fcff757 to 43db3ae
Signed-off-by: athitten <athitten@users.noreply.github.com>
@app.get("/triton_health")
async def check_triton_health():
    """
Added a check_triton_health endpoint so that users can verify whether the PyTriton server is up and running. When using the REST service, we often hit errors because the PyTriton server is not accessible; this endpoint adds a check for that.
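The kind of check described above can be sketched with the standard library alone. This is a minimal sketch, not the code from this PR: the helper name triton_is_ready is hypothetical, and it assumes the Triton server exposes the standard HTTP readiness route /v2/health/ready on the given host and port.

```python
import urllib.error
import urllib.request


def triton_is_ready(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the Triton HTTP readiness route answers with 200.

    Hypothetical helper for illustration; the PR's endpoint may check
    the server differently.
    """
    url = f"http://{host}:{port}/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

A REST handler such as check_triton_health could call a helper like this and report an error status when it returns False, so that an unreachable server is reported clearly instead of surfacing as a failed inference request.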
)
parser.add_argument(
    "-trt", "--triton_request_timeout", default=60, type=int, help="Timeout in seconds for Triton server"
)
Moved the triton_request_timeout arg from nemo/deploy/service/config.json to the argparser of deploy_triton.py so that users can pass it. Also added openai_format_response as a user-defined option when running deploy_triton.py with the REST service enabled.
Also deleted config.json: the Triton IP, Triton port, and Triton timeout args are now passed by the user when running deploy_triton.py. They are written to a JSON file when deploy_triton.py runs, and rest_model_api.py then reads them from that stored config.json file.
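The flow described in these comments (parse the options, write them to a JSON file, read them back in the REST layer) could look roughly like the sketch below. Only the -trt/--triton_request_timeout argument is taken from the diff, and the store_args_to_json name comes from the commit message; its signature, the load_config helper, the --triton_http_address and --triton_port flags, and the JSON keys are assumptions made for illustration.

```python
import argparse
import json
from pathlib import Path
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of deploy-time Triton options")
    # Hypothetical flag names for the Triton address and port.
    parser.add_argument("--triton_http_address", default="0.0.0.0", type=str)
    parser.add_argument("--triton_port", default=8000, type=int)
    # This argument appears in the diff.
    parser.add_argument(
        "-trt", "--triton_request_timeout", default=60, type=int, help="Timeout in seconds for Triton server"
    )
    # Assumed to be a boolean switch; the real definition is not shown in the diff.
    parser.add_argument("--openai_format_response", action="store_true")
    return parser


def store_args_to_json(args: argparse.Namespace, path: Path) -> Dict[str, Any]:
    """Write the options the REST layer needs to a JSON file (assumed signature)."""
    config = {
        "triton_service_ip": args.triton_http_address,
        "triton_service_port": args.triton_port,
        "triton_request_timeout": args.triton_request_timeout,
        "openai_format_response": args.openai_format_response,
    }
    path.write_text(json.dumps(config, indent=2))
    return config


def load_config(path: Path) -> Dict[str, Any]:
    """Read the stored options back, as the REST service would (hypothetical helper)."""
    return json.loads(path.read_text())
```

Writing the file at deploy time keeps the REST service in sync with whatever the user passed on the command line, rather than relying on a checked-in config file that can go stale.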
sentences = np.char.decode(output.astype("bytes"), "utf-8")
return sentences
if openai_format_response:
Do you think we can do this OpenAI response conversion here: https://github.com/NVIDIA/NeMo/blob/athitten/openai_format_response/nemo/deploy/service/rest_model_api.py#L101 ?
Also, is there any advantage to supporting the OpenAI response type at the class level? Maybe our default should be the OpenAI response format (still not sure, though).
@oyilmaz-nvidia Yes, we can do the OpenAI response conversion in rest_model_api.py after the output is received. Do you recommend that?
Also, regarding making the OpenAI format the default: do any of the methods/functions in the deploy module, or any downstream tasks, rely on the output being returned directly (non-OpenAI format), i.e. the way it is now? I wasn't sure, so I added it as an option to avoid breaking anything.
This is an end-user API, and we are not sure who is using it or how, so adding the OpenAI format as an option makes sense. Let's leave it as it is.
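The opt-in behavior agreed on in this thread can be illustrated with a small sketch. It is not the implementation merged in this PR: the function names are hypothetical, the field layout follows the general shape of an OpenAI text-completion response, and values such as finish_reason are placeholders.

```python
import time
import uuid
from typing import Any, Dict, List, Union


def to_openai_completion(sentences: List[str], model_name: str) -> Dict[str, Any]:
    """Wrap decoded sentences in an OpenAI-style completion payload (illustrative)."""
    return {
        "id": f"cmpl-{uuid.uuid4().hex}",
        "object": "text_completion",
        "created": int(time.time()),
        "model": model_name,
        "choices": [
            # finish_reason is a placeholder; the real reason is not known in this sketch.
            {"text": text, "index": index, "logprobs": None, "finish_reason": "stop"}
            for index, text in enumerate(sentences)
        ],
    }


def format_output(
    sentences: List[str], model_name: str, openai_format_response: bool = False
) -> Union[List[str], Dict[str, Any]]:
    """Return the raw sentences by default, or the OpenAI-style dict when requested."""
    if openai_format_response:
        return to_openai_completion(sentences, model_name)
    return sentences
```

Keeping the flag off by default means existing callers that expect the raw decoded sentences see no change, which is the concern raised in the review.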
* Option to convert response to OpenAI format
Signed-off-by: Abhishree <abhishreetm@gmail.com>
* Add OpenAI response arg and store_args_to_json method
Signed-off-by: Abhishree <abhishreetm@gmail.com>
* Apply isort and black reformatting
Signed-off-by: athitten <athitten@users.noreply.github.com>
---------
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: athitten <athitten@users.noreply.github.com>
Co-authored-by: athitten <athitten@users.noreply.github.com>
Signed-off-by: Tugrul Konuk <ertkonuk@gmail.com>
What does this PR do?
Adds triton_request_timeout and openai_format_response as user-defined args, which are then written to a JSON file along with the Triton IP and Triton port to be accessed by rest_model_api.py. (Deletes the existing nemo/deploy/service/config.json in favor of these changes.)
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.