Skip to content

Conversation

@Jan-Kazlouski-elastic
Copy link
Contributor

This PR adds changes to specification caused by elastic/elasticsearch#132388

Additional actions

  • Signed the CLA
  • Executed make contrib

@github-actions
Copy link
Contributor

github-actions bot commented Dec 5, 2025

Following you can find the validation changes against the target branch for the API.

API Status Request Response
inference.put_nvidia ➕ ⚪ Missing test Missing test

You can validate this API yourself by using the make validate target.

# Conflicts:
#	output/openapi/elasticsearch-openapi.json
#	output/openapi/elasticsearch-serverless-openapi.json
#	output/schema/schema.json
package.json Outdated
},
"dependencies": {
"@redocly/cli": "^1.34.5"
"@redocly/cli": "^1.34.6"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be getting changed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done by make setup & make contrib execution. It is mandatory to execute that before merging. Changing it to ^1.34.5 results in it going back to ^1.34.6 as soon as make setup is called.

Comment on lines 23 to 26
"rerank",
"text_embedding",
"completion",
"chat_completion"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but could these be in alphabetical order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

*/
model_id: string
/**
* For a `text_embedding` task, the maximum number of tokens per input before chunking occurs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be "For a `text_embedding` task, the maximum number of tokens per input. Inputs exceeding this value are truncated prior to sending to the Nvidia API."

This is wrong almost everywhere in the docs; there's an issue describing some of the problems with max_input_tokens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 1849 to 1852
text_embedding,
completion,
chat_completion,
rerank
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, could these be in alphabetical order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

*/
input_type?: NvidiaInputType
/**
* For a `text_embedding` task, the method to handle inputs longer than the maximum token length.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To help differentiate this from max_input_tokens it might be better to word it like "the method used by the Nvidia model to handle inputs longer than..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thinking. Fixed.

Comment on lines 1818 to 1820
/**
* The URL of the Nvidia model endpoint.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to include the default URLs for each task type if url isn't specified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Added.

Comment on lines 144 to 147
text_embedding,
chat_completion,
completion,
rerank
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, could these be in alphabetical order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Jan-Kazlouski-elastic
Copy link
Contributor Author

Hi @DonalEvans
Thank you for your review. Comments are addressed. Could you please give this PR another look?

mistral
}

export class NvidiaServiceSettings {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should also be a dimensions field documented for the text_embedding task, sorry for missing this in the first review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should declare the parameters the user can't specify.
According to your suggestion from this comment - dimensions cannot be set during the creation of an endpoint and are not sent to the Nvidia

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgh, my mistake, sorry for forgetting about that, I've been up to my eyeballs in the Jina text embedding code lately (which does support specifying dimensions) and the context switch tripped me up.

* * `ingest`: Mapped to Nvidia's `passage` value in request. Used when generating embeddings during indexing.
* * `search`: Mapped to Nvidia's `query` value in request. Used when generating embeddings during querying.
*
* IMPORTANT: If not specified `input_type` field in request to Nvidia endpoint is set as `query` by default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better as "For Nvidia endpoints, if the `input_type` field is not specified, it defaults to `query`."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

*/
max_input_tokens?: integer
/**
* For a `text_embedding` task, the similarity measure. One of cosine, dot_product, l2_norm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add that if no similarity measure is specified, the default value is dot_product.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

# Conflicts:
#	output/openapi/elasticsearch-openapi.json
#	output/openapi/elasticsearch-serverless-openapi.json
#	output/schema/schema.json
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic enabled auto-merge (squash) December 16, 2025 20:16
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic merged commit e5d156f into main Dec 16, 2025
9 checks passed
@Jan-Kazlouski-elastic Jan-Kazlouski-elastic deleted the nvidia-integration branch December 16, 2025 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ml skip-backport This pull request should not be backported specification Team:ML

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants