[bot] Add Azure AI Inference Python SDK integration for ChatCompletionsClient and EmbeddingsClient instrumentation #481

## Summary

The Azure AI Inference Python SDK (`azure-ai-inference`) is Microsoft's official client for accessing AI models through Azure AI Foundry serverless endpoints, GitHub Models, managed compute endpoints, and the Azure OpenAI Service. It exposes `ChatCompletionsClient`, `EmbeddingsClient`, and `ImageEmbeddingsClient` for executing inference against a broad catalog of models (Meta Llama 3.3, Mistral Large, DeepSeek-R1, Microsoft Phi-4, Cohere Command R, and others) that are hosted on Azure but are **not** accessible through the standard OpenAI Python client.

This repository has zero instrumentation for any `azure-ai-inference` execution surface — no integration directory, no wrapper, no patcher, no `auto_instrument()` support. Users who call `ChatCompletionsClient.complete()` or `EmbeddingsClient.embed()` directly get no Braintrust spans.

The SDK cannot be wrapped with `wrap_openai()` because `ChatCompletionsClient` is a distinct class with its own authentication (Azure API keys or Entra ID) and its own request/response types from the `azure.ai.inference.models` namespace. `wrap_openai()` requires an `openai.OpenAI` instance.

The Braintrust docs list "Azure AI Foundry" as a supported cloud provider, but that coverage comes from the AI Proxy gateway (an OpenAI client pointed at the Braintrust gateway URL), not from native `azure-ai-inference` SDK tracing. Users who follow Microsoft's official Azure AI Foundry documentation (`pip install azure-ai-inference`) and call the SDK clients directly are therefore not covered by it.

## What needs to be instrumented

The `azure-ai-inference` package (v1.0.0b9) exposes these execution surfaces, none of which are instrumented:

### Chat completions (highest priority)

| SDK method | Description | Return type |
|---|---|---|
| `ChatCompletionsClient.complete(messages, ...)` | Chat completions via Azure AI Foundry / GitHub Models | `ChatCompletions` (not streamed) |
| `ChatCompletionsClient.complete(messages, stream=True, ...)` | Streaming chat completions | `StreamingChatCompletions` iterator |
| `azure.ai.inference.aio.ChatCompletionsClient.complete(...)` | Async chat completions | `ChatCompletions` (not streamed) |
| `azure.ai.inference.aio.ChatCompletionsClient.complete(..., stream=True)` | Async streaming chat completions | `AsyncStreamingChatCompletions` async iterator |

**Response shape:** `ChatCompletions` with `choices[0].message.content`, `choices[0].finish_reason`, `usage.prompt_tokens`, `usage.completion_tokens`, `usage.total_tokens`, `model`, `id`. Mirrors the OpenAI response shape in structure but is a distinct Azure type.
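A minimal sketch of how a wrapper might map this response shape onto span fields. The function name `extract_chat_span_fields` and the layout of the returned dictionary are hypothetical, not an existing Braintrust API. The code relies only on the attributes listed above and uses duck typing, so it does not import `azure.ai.inference`.

```python
from types import SimpleNamespace
from typing import Any


def extract_chat_span_fields(response: Any) -> dict[str, Any]:
    """Map a ChatCompletions-like object to output, metrics, and metadata."""
    choices = getattr(response, "choices", None) or []
    choice = choices[0] if choices else None
    usage = getattr(response, "usage", None)

    # Copy only the token counts that are actually present.
    metrics: dict[str, Any] = {}
    for name in ("prompt_tokens", "completion_tokens", "total_tokens"):
        value = getattr(usage, name, None)
        if value is not None:
            metrics[name] = value

    return {
        "output": choice.message.content if choice is not None else None,
        "metrics": metrics,
        "metadata": {
            "model": getattr(response, "model", None),
            "id": getattr(response, "id", None),
            "finish_reason": getattr(choice, "finish_reason", None),
        },
    }


# Stand-in for a real ChatCompletions object, for illustration only.
fake_response = SimpleNamespace(
    id="chatcmpl-example",
    model="example-model",
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="Hello"),
            finish_reason="stop",
        )
    ],
    usage=SimpleNamespace(prompt_tokens=5, completion_tokens=2, total_tokens=7),
)
fields = extract_chat_span_fields(fake_response)
```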

**Streaming:** `StreamingChatCompletions` is an iterable of `StreamingChatCompletionsUpdate` objects with `choices[0].delta.content`. The integration must accumulate deltas and finalize the span when iteration completes.
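The accumulate-then-finalize requirement could be sketched as a generator that passes updates through unchanged. `traced_stream` and the `on_finish` callback are hypothetical names; in a real integration the callback would log the accumulated output and end the span. The sketch assumes only the `choices[0].delta.content` shape described above, plus optional `finish_reason` and `usage` attributes on updates.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, Iterator


def traced_stream(
    stream: Iterable[Any],
    on_finish: Callable[[dict[str, Any]], None],
) -> Iterator[Any]:
    """Yield updates unchanged, then report the accumulated result once."""
    parts: list[str] = []
    result: dict[str, Any] = {"content": "", "finish_reason": None, "usage": None}
    try:
        for update in stream:
            choices = getattr(update, "choices", None) or []
            if choices:  # assumed: some updates may carry no choices
                first = choices[0]
                text = getattr(getattr(first, "delta", None), "content", None)
                if text:
                    parts.append(text)
                if getattr(first, "finish_reason", None) is not None:
                    result["finish_reason"] = first.finish_reason
            if getattr(update, "usage", None) is not None:
                result["usage"] = update.usage
            yield update
    finally:
        # Runs on normal exhaustion, on an exception, and when the
        # generator is closed early by the caller.
        result["content"] = "".join(parts)
        on_finish(result)


def _update(text: Any, finish_reason: Any = None) -> SimpleNamespace:
    """Build a stand-in for StreamingChatCompletionsUpdate (illustration only)."""
    choice = SimpleNamespace(
        delta=SimpleNamespace(content=text), finish_reason=finish_reason
    )
    return SimpleNamespace(choices=[choice], usage=None)


finished: list[dict[str, Any]] = []
updates = [_update("Hel"), _update("lo"), _update(None, "stop")]
seen = list(traced_stream(updates, finished.append))
```

The async variant would follow the same pattern with an async generator and `async for`. Putting finalization in `finally` means the span is also closed when the caller abandons the stream partway through, although it never fires if the caller does not start iterating.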

### Embeddings

| SDK method | Description |
|---|---|
| `EmbeddingsClient.embed(input, ...)` | Generate embeddings for a list of texts |
| `azure.ai.inference.aio.EmbeddingsClient.embed(input, ...)` | Async embeddings |

**Return type:** `EmbeddingsResult` with `data[0].embedding` (list of floats) and `usage.prompt_tokens`.
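For embeddings, a span probably should not log the full vectors. A minimal sketch, assuming a hypothetical helper `summarize_embeddings` that records only counts, vector length, and token usage from an `EmbeddingsResult`-like object:

```python
from types import SimpleNamespace
from typing import Any


def summarize_embeddings(result: Any) -> dict[str, Any]:
    """Summarize an EmbeddingsResult-like object without copying vectors."""
    data = list(getattr(result, "data", None) or [])
    first = data[0].embedding if data else None
    usage = getattr(result, "usage", None)
    return {
        "embedding_count": len(data),
        # Assumption: an embedding may not be a list under some encoding
        # formats, so the length is reported only for list values.
        "embedding_length": len(first) if isinstance(first, list) else None,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
    }


# Stand-in for a real EmbeddingsResult, for illustration only.
fake_result = SimpleNamespace(
    data=[
        SimpleNamespace(embedding=[0.1, 0.2, 0.3]),
        SimpleNamespace(embedding=[0.4, 0.5, 0.6]),
    ],
    usage=SimpleNamespace(prompt_tokens=8),
)
summary = summarize_embeddings(fake_result)
```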

## Implementation notes

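**Authentication:** The SDK authenticates with an Azure API key (`AzureKeyCredential`) or Entra ID (`DefaultAzureCredential`). VCR cassettes need the `api-key` header sanitized, along with the `Authorization` header, which carries Entra ID bearer tokens.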
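One way to sanitize cassettes is a request hook that redacts credential headers before recording. The sketch below is written in the shape of a vcrpy `before_record_request` callback (it takes a request and returns it); the function name and the `SENSITIVE_HEADERS` set are assumptions for illustration. vcrpy's `filter_headers` option is an alternative that achieves a similar result.

```python
from types import SimpleNamespace
from typing import Any

# Header names to redact, compared case-insensitively (assumed list).
SENSITIVE_HEADERS = {"api-key", "authorization"}


def scrub_auth_headers(request: Any) -> Any:
    """Replace credential header values in place and return the request."""
    for name in list(request.headers):
        if name.lower() in SENSITIVE_HEADERS:
            request.headers[name] = "REDACTED"
    return request


# Stand-in for a recorded request object, for illustration only.
recorded = SimpleNamespace(
    headers={
        "Api-Key": "secret-key",
        "Authorization": "Bearer secret-token",
        "Content-Type": "application/json",
    }
)
scrubbed = scrub_auth_headers(recorded)
```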

**Endpoint-per-model pattern:** Unlike OpenAI where a single client accesses all models, each Azure AI Foundry deployment has its own endpoint URL. The model name is embedded in the endpoint or returned in the response. Span metadata should capture `model` from `ChatCompletions.model`.

**GitHub Models support:** The same `ChatCompletionsClient` is used for GitHub Models (with `endpoint="https://models.inference.ai.azure.com"` and a GitHub token). GitHub Models provides free-tier access to GPT-4o, Llama, Mistral, and others for prototyping.

**Parameters relevant for span metadata:** `model` (or inferred from endpoint), `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `seed`, `tools`, `response_format`, `stop`.
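A minimal sketch of collecting these parameters from the keyword arguments passed to `complete()`. `collect_request_metadata` is a hypothetical helper. Following the endpoint-per-model note above, it lets the model name reported in the response take precedence over the requested one; that is a design assumption, not confirmed behavior.

```python
from typing import Any, Optional

METADATA_PARAMS = (
    "model",
    "temperature",
    "max_tokens",
    "top_p",
    "frequency_penalty",
    "presence_penalty",
    "seed",
    "tools",
    "response_format",
    "stop",
)


def collect_request_metadata(
    kwargs: dict[str, Any], response_model: Optional[str] = None
) -> dict[str, Any]:
    """Pick span metadata out of the call's keyword arguments."""
    # Skip parameters the caller left unset.
    metadata = {
        name: kwargs[name] for name in METADATA_PARAMS if kwargs.get(name) is not None
    }
    # The deployment behind an endpoint decides the actual model, so the
    # value returned in the response overrides the requested one.
    if response_model:
        metadata["model"] = response_model
    return metadata


call_kwargs = {"messages": [], "temperature": 0.2, "max_tokens": 64, "seed": None}
metadata = collect_request_metadata(call_kwargs, response_model="example-model")
```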

## No coverage in any instrumentation layer

- No integration directory (`py/src/braintrust/integrations/azure_ai_inference/`)
- No wrapper function (e.g. `wrap_azure_ai_inference()`)
- No patcher in any existing integration
- No nox test session (`test_azure_ai_inference`)
- No version entry in `py/src/braintrust/integrations/versioning.py`
- No mention in `py/src/braintrust/integrations/__init__.py`

A grep for `azure.ai`, `azure-ai-inference`, or `azure_ai_inference` across `py/src/braintrust/` returns zero matches.

## Braintrust docs status

`unclear` — The [Braintrust AI providers page](https://www.braintrust.dev/docs/integrations/ai-providers) lists "Azure AI Foundry" as a supported cloud provider, but the integration is through the AI Proxy gateway (routing `openai.AzureOpenAI` or `openai.OpenAI` through the Braintrust gateway URL), not through a native `azure-ai-inference` SDK wrapper. Users following Microsoft's official `azure-ai-inference` quickstart docs get zero native Braintrust tracing.

## Upstream references

- azure-ai-inference on PyPI: https://pypi.org/project/azure-ai-inference/ (v1.0.0b9)
- azure-ai-inference on GitHub: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference
- Azure AI Inference Python SDK docs: https://learn.microsoft.com/en-us/azure/ai-studio/reference/reference-model-inference-api
- ChatCompletionsClient reference: https://learn.microsoft.com/en-us/python/api/azure-ai-inference/azure.ai.inference.chatcompletionsclient
- GitHub Models quickstart: https://docs.github.com/en/github-models/use-github-models/getting-started-with-github-models
- Azure AI Foundry model catalog: https://ai.azure.com/explore/models

## Local repo files inspected

- `py/src/braintrust/integrations/` — no `azure_ai_inference/` directory on `main`
- `py/src/braintrust/wrappers/` — no Azure AI Inference wrapper
- `py/noxfile.py` — no `test_azure_ai_inference` session
- `py/pyproject.toml` `[tool.braintrust.matrix]` — no azure-ai-inference entry
- `py/src/braintrust/integrations/__init__.py` — Azure AI Inference not listed
- `py/src/braintrust/integrations/versioning.py` — no Azure AI Inference version matrix
- Full repo grep for `azure.ai`, `azure-ai-inference`, `azure_ai_inference` — zero matches in SDK source
