Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions .devcontainer/ollama/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,9 @@
"context": "../.."
},
"features": {
"ghcr.io/prulloac/devcontainer-features/ollama:1": {}
"ghcr.io/prulloac/devcontainer-features/ollama:1": {
"pull": "gemma4:e4b"
}
},
// Configure tool-specific properties.
"customizations": {
Expand All @@ -26,7 +28,7 @@
},

// Use 'postCreateCommand' to run commands after the container is created.
"postCreateCommand": "cp .env.sample.ollama .env && ollama pull llama3.1",
"postCreateCommand": "cp .env.sample.ollama .env",

// Comment out to connect as root instead. More info: https://aka.ms/vscode-remote/containers/non-root.
"remoteUser": "vscode",
Expand Down
2 changes: 1 addition & 1 deletion .env.sample
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com
AZURE_OPENAI_CHAT_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
# Needed for Ollama:
OLLAMA_ENDPOINT=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
OLLAMA_MODEL=gemma4:e4b
# Needed for OpenAI.com:
OPENAI_KEY=YOUR-OPENAI-KEY
OPENAI_MODEL=gpt-3.5-turbo
2 changes: 1 addition & 1 deletion .env.sample.ollama
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# See .env.sample for all options
API_HOST=ollama
OLLAMA_ENDPOINT=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
OLLAMA_MODEL=gemma4:e4b
28 changes: 22 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -142,27 +142,43 @@ This project includes infrastructure as code (IaC) to provision Azure OpenAI dep

## Using Ollama models

Most chat, streaming, function calling, structured outputs, CSV RAG, and document RAG flow samples work with local Ollama chat models. These samples have been tested with `gemma4:e4b` and `qwen3.5:4b`. The document ingestion and hybrid vector search samples currently use `text-embedding-3-small` for embeddings, so those scripts need Azure OpenAI/OpenAI embeddings or a code update before they can run in a local-only Ollama setup. The `reasoning.py` sample is intended for reasoning models, such as `gpt-oss`.

If you use GitHub Codespaces or Dev Containers, you can use the Ollama devcontainer, which installs Ollama and pulls the default model for you:

```text
https://codespaces.new/Azure-Samples/python-openai-demos?devcontainer_path=.devcontainer/ollama/devcontainer.json
```

1. Install [Ollama](https://ollama.com/) and follow the instructions to set it up on your local machine.
2. Pull a model, for example:
2. Pull the recommended model:

```shell
ollama pull gemma4:e4b
```

Another tested option is:

```shell
ollama pull llama3.1
ollama pull qwen3.5:4b
```

3. Create a `.env` file by copying the `.env.sample` file and updating it with your Ollama endpoint and model name.
3. Create a `.env` file by copying the Ollama-specific environment sample:

```bash
cp .env.sample .env
cp .env.sample.ollama .env
```

4. Update the `.env` file with your Ollama endpoint and model name (any model you've pulled):
4. Update the `.env` file with your Ollama endpoint and model name, if needed:

```bash
API_HOST=ollama
OLLAMA_ENDPOINT=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
OLLAMA_MODEL=gemma4:e4b
```

Use `http://localhost:11434/v1` when Ollama and Python run in the same environment, including the Ollama devcontainer. If Python runs in a different container and Ollama runs on the host machine, use `http://host.docker.internal:11434/v1` instead.

## Resources

* [Video series: Learn Python + AI (October 2025)](https://techcommunity.microsoft.com/blog/educatordeveloperblog/level-up-your-python--ai-skills-with-our-complete-series/4464546)
1 change: 1 addition & 0 deletions requirements-rag.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
pymupdf4llm
lunr
sentence-transformers
tiktoken
14 changes: 11 additions & 3 deletions spanish/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -137,22 +137,30 @@ OPENAI_MODEL=gpt-4o-mini

### Usando modelos de Ollama

Instala [Ollama](https://ollama.com/) y descarga un modelo:
Instala [Ollama](https://ollama.com/) y descarga el modelo recomendado:

```bash
ollama pull llama3.1
ollama pull gemma4:e4b
```

Otra opcion probada es:

```bash
ollama pull qwen3.5:4b
```

Configura tu `.env`:

```bash
API_HOST=ollama
OLLAMA_ENDPOINT=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
OLLAMA_MODEL=gemma4:e4b
```

Si ejecutas dentro de un Dev Container, reemplaza `localhost` por `host.docker.internal`.

La mayoria de los ejemplos de chat, streaming, function calling, salidas estructuradas, RAG con CSV y flujo RAG con documentos funcionan con modelos de chat locales de Ollama. Los ejemplos de ingesta de documentos y busqueda vectorial hibrida actualmente usan `text-embedding-3-small` para embeddings, asi que esos scripts necesitan embeddings de Azure OpenAI/OpenAI o una actualizacion de codigo antes de poder ejecutarse en una configuracion local solo con Ollama.

## Recursos

* [Próxima serie octubre 2025: Python + IA](https://aka.ms/PythonIA/serie)
Expand Down