Quick Start

Get Ollama running with recommended models in under 5 minutes.

Prerequisites

  • Windows 10/11 (64-bit)
  • NVIDIA GPU with 12GB+ VRAM
  • NVIDIA Drivers (version 525+ for CUDA 12)
  • PowerShell 5.1+ (included with Windows)
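
To check these quickly from PowerShell (nvidia-smi is installed with the NVIDIA drivers):

# Should report 5.1 or later
$PSVersionTable.PSVersion

# GPU name, driver version (needs 525+), and total VRAM
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv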

Step 1: Clone the Repository

git clone https://github.com/christophacham/ollama-rtx-setup.git
cd ollama-rtx-setup

Step 2: Run Setup

.\setup-ollama.ps1

This script will:

  1. Check for NVIDIA GPU and drivers
  2. Install Ollama if not present
  3. Download recommended models based on your VRAM
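
As a rough illustration of step 3, the VRAM-based selection amounts to something like the sketch below. This is not the script's actual code; the thresholds and single-GPU assumption are illustrative, with the model lists taken from the table that follows.

# Illustrative sketch only, not the setup script itself (assumes a single GPU)
$vramMiB = [int](nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits)
if     ($vramMiB -ge 32000) { $models = 'qwen3:32b','deepseek-r1:32b','llama3.3:70b-instruct-q4_K_M' }
elseif ($vramMiB -ge 24000) { $models = 'qwen3:32b','deepseek-r1:32b','qwen2.5:14b' }
else                        { $models = 'qwen3:14b','qwen2.5:3b','qwen2.5-coder:14b' }
foreach ($m in $models) { ollama pull $m }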

What Gets Installed

VRAM    Models Installed
32GB    qwen3:32b, deepseek-r1:32b, llama3.3:70b-instruct-q4_K_M
24GB    qwen3:32b, deepseek-r1:32b, qwen2.5:14b
12GB    qwen3:14b, qwen2.5:3b, qwen2.5-coder:14b
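
Any of these models can also be pulled manually at any time:

ollama pull qwen3:14b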

Web Search Optimized Stack (RTX 5090)

For web search use cases, the setup installs a specialized stack:

.\setup-ollama-websearch.ps1 -Setup OpenWebUI -SingleUser

Model               VRAM    Purpose
qwen2.5:3b          ~4GB    Fast web queries
qwen2.5-coder:14b   ~17GB   Synthesis and code

Total: ~21GB - both models fit in VRAM simultaneously, leaving ~11GB for context.
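
With both models started, you can confirm they are resident in VRAM at the same time:

# Lists loaded models with their size and whether they run on GPU or CPU
ollama ps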

Step 3: Start Chatting

# Interactive chat
ollama run qwen3:32b

# Or use the API
curl http://localhost:11434/api/chat -d '{
"model": "qwen3:32b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
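
Note that in Windows PowerShell 5.1, curl is an alias for Invoke-WebRequest, so the Unix-style command above fails there. A PowerShell-native equivalent (setting "stream": false so the API returns a single JSON object instead of the default streamed chunks):

$body = @{
    model    = 'qwen3:32b'
    stream   = $false   # one JSON response instead of a stream
    messages = @(@{ role = 'user'; content = 'Hello!' })
} | ConvertTo-Json -Depth 5
(Invoke-RestMethod -Uri 'http://localhost:11434/api/chat' -Method Post `
    -Body $body -ContentType 'application/json').message.content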

Troubleshooting

"CUDA not available"

Ensure NVIDIA drivers are installed and up to date:

nvidia-smi
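
If nvidia-smi runs but reports an old driver, compare it against the 525 requirement. A quick check (a sketch, assuming a single GPU):

# Warn if the driver predates the CUDA 12 requirement (525+)
$driver = (nvidia-smi --query-gpu=driver_version --format=csv,noheader).Trim()
if ([version]$driver -lt [version]'525.0') { Write-Warning "Driver $driver is older than 525; update it." }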

Models download slowly

Ollama downloads models from its CDN. Large models (70B+) can take 30+ minutes on slower connections, so consider running those downloads overnight.
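
Interrupted pulls are generally safe to re-run; Ollama skips layers that already finished downloading, so a restarted pull resumes rather than starting over:

# Re-running the same pull picks up where it left off
ollama pull llama3.3:70b-instruct-q4_K_M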

Out of memory errors

Your GPU VRAM is full. Either:

  • Use a smaller quantization (q4 instead of q8)
  • Use a smaller model
  • Unload other models: ollama stop <model-name>
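
For example, to see what is currently loaded and unload a model to free its VRAM:

# Show loaded models and their memory footprint
ollama ps

# Unload one by name
ollama stop qwen3:32b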