Getting DeepSeek R1 Running on Your Pi 5 (16 GB) with Open WebUI, RAG, and Pipelines

🚀 Introduction

Running DeepSeek R1 on a Pi 5 with 16 GB RAM feels like taking that same Pi 400 project from my February guide and super‑charging it. With more memory, faster CPU cores, and better headroom, we can use Open WebUI over Ollama, hook in RAG, and even add pipeline automations—all still local, all still low‑cost, all privacy‑first.

[Image: PiAI]


💡 Why Pi 5 (16 GB)?

Jeremy Morgan and others have largely confirmed what we know: a Raspberry Pi 5 with 8 GB or 16 GB of RAM runs the deepseek‑r1:1.5b model smoothly, hitting around 6 tokens/sec while consuming roughly 3 GB of RAM (kevsrobots.com, dev.to).

The extra memory gives breathing room for RAG, pipelines, and more.


🛠️ Prerequisites & Setup

  • OS: Raspberry Pi OS (64‑bit, Bookworm)

  • Hardware: Pi 5, 16 GB RAM, 32 GB+ microSD or SSD, wired or stable Wi‑Fi

  • Tools: Docker, Docker Compose, access to terminal

🧰 System prep

```bash
sudo apt update && sudo apt upgrade -y
sudo apt install curl git
```

Install Docker & Compose:

```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker
```

Install Ollama (ARM64):

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
```

Note: the install script also registers an ollama systemd service listening on port 11434. If you plan to run Ollama from the Compose stack below instead, stop the native service first (`sudo systemctl disable --now ollama`) so the two don't fight over the port.

⚙️ Docker Compose: Ollama + Open WebUI

Create the stack folder:

```bash
sudo mkdir -p /opt/stacks/openwebui
cd /opt/stacks/openwebui
```

Then create docker-compose.yaml:

```yaml
services:
  ollama:
    # Official image lives on Docker Hub (ollama/ollama), not GHCR.
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
  open-webui:
    image: ghcr.io/open-webui/open-webui:ollama
    container_name: open-webui
    environment:
      # Point the UI at the dedicated ollama service so models pulled
      # from the CLI and from the UI land in the same volume.
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"
    volumes:
      - openwebui_data:/app/backend/data
    restart: unless-stopped

volumes:
  ollama:
  openwebui_data:
```

Bring it online:

```bash
docker compose up -d
```

✅ Ollama listens on port 11434; Open WebUI on port 3000.
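As a quick sanity check, here's a minimal Python sketch that confirms both services answer (assuming you run it on the Pi itself; swap localhost for the Pi's IP otherwise):

```python
# health_check.py - confirm Ollama and Open WebUI are reachable.
import urllib.request

def check(name: str, url: str) -> None:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status} at {url}")
    except Exception as exc:
        print(f"{name}: FAILED at {url} ({exc})")

# Ollama exposes a small JSON API; /api/version returns its version string.
check("Ollama", "http://localhost:11434/api/version")
# Open WebUI serves its web app on the mapped port 3000.
check("Open WebUI", "http://localhost:3000/")
```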


📥 Installing DeepSeek R1 Model

In terminal:

```bash
ollama pull deepseek-r1:1.5b
```
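To confirm the pull landed, you can ask Ollama's API which models it has locally; a short sketch (again assuming localhost):

```python
# list_models.py - show which models the local Ollama instance has pulled.
import json
import urllib.request

# /api/tags lists locally available models and their sizes (in bytes).
with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=10) as resp:
    data = json.load(resp)

for model in data.get("models", []):
    size_gb = model["size"] / 1e9
    print(f"{model['name']}  ({size_gb:.1f} GB)")
```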

In Open WebUI (visit http://<pi-ip>:3000):

  1. 🧑‍💻 Create your admin user

  2. ⚙️ Go to Settings → Models

  3. ➕ Pull deepseek-r1:1.5b via UI

Once added, it’s selectable from the top model dropdown.


💬 Basic Usage & Performance

Select deepseek-r1:1.5b, type your prompt:

→ Expect ~6 tokens/sec
→ ~3 GB RAM usage
→ CPU fully engaged

Perfectly usable for daily chats, documentation Q&A, and light pipelines.
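If you want to verify the tokens/sec figure yourself rather than eyeballing it, Ollama's /api/generate response includes eval_count and eval_duration (in nanoseconds), so a rough benchmark is only a few lines. A sketch, assuming the model is pulled and the API is on localhost:

```python
# benchmark.py - rough tokens/sec measurement for deepseek-r1:1.5b.
import json
import urllib.request

payload = json.dumps({
    "model": "deepseek-r1:1.5b",
    "prompt": "Explain what a Raspberry Pi is in two sentences.",
    "stream": False,  # return one JSON object instead of a stream
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req, timeout=600) as resp:
    result = json.load(resp)

# eval_count = tokens generated; eval_duration = generation time in ns.
tokens_per_sec = result["eval_count"] / (result["eval_duration"] / 1e9)
print(f"{result['eval_count']} tokens at {tokens_per_sec:.1f} tokens/sec")
```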


📚 Adding RAG with Open WebUI

Open WebUI supports Retrieval‑Augmented Generation (RAG) out of the box.

Steps:

  1. 📄 Collect .md or .txt files (policies, notes, docs).

  2. ➕ In UI: Workspace → Knowledge → + Create Knowledge Base, upload your docs.

  3. 🧠 Then: Workspace → Models → + Add New Model

    • Model name: DeepSeek‑KB

    • Base model: deepseek-r1:1.5b

    • Knowledge: select the knowledge base

The result? 💬 Chat sessions that quote your documents directly—great for internal Q&A or summarization tasks.
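You can also hit the knowledge-backed model from scripts: Open WebUI exposes an OpenAI-compatible chat endpoint. A sketch, assuming you've generated an API key in Open WebUI and that your custom model's id is deepseek-kb (both the key and the model id below are placeholders; check the model list for the real id):

```python
# ask_kb.py - query the knowledge-backed model via Open WebUI's
# OpenAI-compatible API. API_KEY and the model id are placeholders.
import json
import urllib.request

API_KEY = "sk-your-open-webui-key"   # placeholder: generate in Open WebUI
payload = json.dumps({
    "model": "deepseek-kb",          # placeholder: your KB model's id
    "messages": [
        {"role": "user", "content": "Summarize our vacation policy."}
    ],
}).encode()

req = urllib.request.Request(
    "http://localhost:3000/api/chat/completions",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

with urllib.request.urlopen(req, timeout=600) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])
```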


🧪 Pipeline Automations

This is where things get really fun. With Pipelines, Open WebUI becomes programmable.

🧱 Start the pipelines container:

```bash
docker run -d -p 9099:9099 \
  --add-host=host.docker.internal:host-gateway \
  -v pipelines:/app/pipelines \
  --name pipelines \
  ghcr.io/open-webui/pipelines:main
```

Link it in Open WebUI under Admin Settings → Connections: add an OpenAI API connection with the URL http://host.docker.internal:9099 and the API key 0p3n-w3bu! (the pipelines container's default).

Now build workflows:

  • 🔗 Chain prompts (e.g. translate → summarize → translate back)

  • 🧹 Clean/filter input/output

  • ⚙️ Trigger external actions (webhooks, APIs, home automation)

Write custom Python logic and integrate it as a processing step.
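For a flavor of what that looks like, here's a minimal pipeline skeleton in the shape the Pipelines framework expects, a sketch modeled on the examples in the open-webui/pipelines repo. Drop it into the mounted pipelines volume (or upload it from the Pipelines section of the admin settings) and restart the container to load it:

```python
# example_pipeline.py - minimal Open WebUI pipeline skeleton, modeled
# on the examples in the open-webui/pipelines repository.
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        # The name shown in Open WebUI's model dropdown.
        self.name = "Shout Pipeline"

    async def on_startup(self):
        # Called when the pipelines server starts.
        print(f"{self.name} loaded")

    async def on_shutdown(self):
        # Called when the pipelines server stops.
        print(f"{self.name} unloaded")

    def pipe(
        self,
        user_message: str,
        model_id: str,
        messages: List[dict],
        body: dict,
    ) -> Union[str, Generator, Iterator]:
        # Trivial transformation: echo the prompt in uppercase.
        # Real pipelines might call Ollama, hit a webhook, or chain models.
        return user_message.upper()
```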


🧭 Example Use Cases

| 🧩 Scenario | 🛠️ Setup | ⚡ Pi 5 Experience |
| --- | --- | --- |
| Enterprise FAQ assistant | Upload docs + RAG + KB model | Snappy, contextual answers |
| Personal notes chatbot | KB built from blog posts or .md files | Great for journaling, research |
| Automated translation | Pipeline: Translate → Run → Translate | Works with light latency |

📝 Tips & Gotchas

  • 🧠 Stick with 1.5B models for usability.

  • 📉 Monitor RAM and CPU; disable swap where possible.

  • 🔒 Be cautious with pipeline code—no sandboxing.

  • 🗂️ Use volume backups to persist state between upgrades.


🎯 Conclusion

Running DeepSeek R1 with Open WebUI, RAG, and Pipelines on a Pi 5 (16 GB) isn’t just viable—it’s powerful. You can create focused, contextual AI tools completely offline. You control the data. You own the results.

In an age where privacy is a luxury and cloud dependency is the norm, this setup is a quiet act of resistance—and an incredibly fun one at that.

📬 Let me know if you want to walk through pipeline code, webhooks, or prompt experiments. The Pi is small—but what it teaches us is huge.


* AI tools were used as a research assistant for this content, but human moderation and writing are also included. The included images are AI-generated.
