Description
- Package Name: azure.ai.inference, azure.core.credentials
- Package Version: 1.0.0b7, 1.32.0
- Operating System: Windows 10
- Python Version: 3.12
Describe the bug
When I specify an endpoint and model, as per the docs here, and send a request as below, I am getting a `ClientAuthenticationError`.
The following code:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

MODEL_NAME = "Llama-3.3-70B-Instruct"  # also tried "Meta-Llama-3.1-405B-Instruct"

client = ChatCompletionsClient(
    endpoint=ENDPOINT,
    credential=AzureKeyCredential(API_KEY),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Explain Riemann's conjecture in 1 paragraph"),
    ],
    model=MODEL_NAME,
)
```

returns the following error:

```
ClientAuthenticationError: (None) Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.
Code: None
Message: Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.
```
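To help isolate whether the failure lies in the SDK or in the key/endpoint themselves, a raw request can be constructed with only the standard library. This is a sketch under assumptions: the `/models/chat/completions` route and the `api-version` value are taken from the Azure AI model inference REST API docs and may differ for a given deployment, and the endpoint/key below are placeholders.

```python
import json
import urllib.request

def build_chat_request(endpoint: str, api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build a raw chat-completions request against an Azure AI model
    inference endpoint. Route and api-version are assumptions taken from
    the REST API docs and may need adjusting per deployment."""
    # Normalise the endpoint so it ends with /models exactly once.
    base = endpoint.rstrip("/")
    if not base.endswith("/models"):
        base += "/models"
    url = f"{base}/chat/completions?api-version=2024-05-01-preview"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Serverless endpoints have accepted the key either as a bearer
        # token or as an api-key header; sending both costs nothing.
        "Authorization": f"Bearer {api_key}",
        "api-key": api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Example with placeholder values; no request is actually sent here:
req = build_chat_request(
    "https://my-resource.services.ai.azure.com",
    "<API_KEY>",
    "Llama-3.3-70B-Instruct",
    [{"role": "user", "content": "ping"}],
)
# urllib.request.urlopen(req) would send it; a 401 on this path too would
# point at the key/endpoint rather than the SDK.
```

If this raw call succeeds while the SDK call fails, that narrows the problem to how the client attaches the `AzureKeyCredential`.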
To Reproduce
Steps to reproduce the behavior:
- Deploy a Llama 3.1 or 3.3 model to an Azure AI Services serverless endpoint
- Instantiate a `ChatCompletionsClient` object as above
- Send a request using `client.complete()` as above
Expected behavior
I would like to just send and receive completions for simple requests/prompts.
I have previously had no issues sending requests to serverless Llama 3.1 models under the previous system, using both the `azure.ai.inference` package and raw HTTPS requests.
Screenshots
My endpoints page looks like this:
Additional context
- I specify the model name using the `model` arg when calling `client.complete()`. For example, I have tried both `Llama-3.3-70B-Instruct` and `Meta-Llama-3.1-405B-Instruct`. This was based on the docs.
- I've double-checked for the silly mistakes I can think of, e.g. incorrect endpoints (I have tried both with and without `/models` appended to the end of the Azure AI model inference endpoint).
- The docs also indicate that my API key should be 32 characters, but I can confirm that the API key provided by the Azure AI model inference endpoint is longer than this.
- I believe that I meet the version requirement for the `azure-ai-inference` package (1.0.0b5 per the docs). I can also confirm the following versions of these azure packages are installed in my venv:
| Package | Version |
|---|---|
| azure-ai-inference | 1.0.0b7 |
| azure-ai-ml | 1.24.0 |
| azure-common | 1.1.28 |
| azure-core | 1.32.0 |
| azure-core-tracing-opentelemetry | 1.0.0b11 |
| azure-identity | 1.19.0 |
| azure-mgmt-core | 1.5.0 |
| azure-monitor-opentelemetry | 1.6.4 |
| azure-monitor-opentelemetry-exporter | 1.0.0b33 |
| azure-storage-blob | 12.24.0 |
| azure-storage-file-datalake | 12.18.0 |
| azure-storage-file-share | 12.20.0 |
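For anyone reproducing this, the table above can be regenerated from inside the venv with a short stdlib snippet (the package names are the distribution names from the table; anything not installed simply reports `None`):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_versions(packages):
    """Return {distribution-name: version-or-None} for the given names."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result

print(installed_versions(["azure-ai-inference", "azure-core", "azure-identity"]))
```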
Apologies for inappropriately hijacking another issue earlier; I incorrectly thought it might have been relevant, until I realised the culprit of my 404 was the lack of `/models` in my URI.
Thanks for considering this issue, @dargilco.
