🚀 Introduction
Running DeepSeek R1 on a Pi 5 with 16 GB RAM feels like taking that same Pi 400 project from my February guide and super‑charging it. With more memory, faster CPU cores, and better headroom, we can use Open WebUI over Ollama, hook in RAG, and even add pipeline automations—all still local, all still low‑cost, all privacy‑first.

💡 Why Pi 5 (16 GB)?
Jeremy Morgan and others have confirmed it: a Raspberry Pi 5 with 8 GB or 16 GB of RAM runs the deepseek‑r1:1.5b model smoothly, hitting around 6 tokens/sec while consuming ~3 GB of RAM (kevsrobots.com, dev.to).
The extra memory gives breathing room for RAG, pipelines, and more.
🛠️ Prerequisites & Setup
- OS: Raspberry Pi OS (64‑bit, Bookworm)
- Hardware: Pi 5, 16 GB RAM, 32 GB+ microSD or SSD, wired or stable Wi‑Fi
- Tools: Docker, Docker Compose, terminal access
🧰 System prep
Install Docker & Compose:
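The original snippet isn't reproduced here; a common approach on Raspberry Pi OS is Docker's official convenience script, which also ships the Compose plugin:

```shell
# Install Docker via the official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Let the current user run docker without sudo (log out and back in afterwards)
sudo usermod -aG docker $USER

# Verify both pieces
docker --version
docker compose version
```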
Install Ollama (ARM64):
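Ollama's official install script detects ARM64 automatically:

```shell
# Official Ollama install script (works on ARM64)
curl -fsSL https://ollama.com/install.sh | sh

# Confirm the install
ollama --version
```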
⚙️ Docker Compose: Ollama + Open WebUI
Create the stack folder:
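Any folder name works; `ai-stack` here is just an example:

```shell
# A folder to hold the compose file (name is arbitrary)
mkdir -p ~/ai-stack
cd ~/ai-stack
```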
Then create docker-compose.yaml:
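The exact file isn't shown; a minimal version matching the ports used in this guide might look like this (volume and container names are illustrative):

```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"   # Open WebUI listens on 8080 inside the container
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama:
  open-webui:
```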
Bring it online:
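From the stack folder:

```shell
# Start both containers in the background
docker compose up -d

# Check that both services are running
docker compose ps
```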
✅ Ollama runs on port 11434, Open WebUI on 3000.
📥 Installing DeepSeek R1 Model
In terminal:
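Pulling from the terminal works whether Ollama runs natively or in the compose stack:

```shell
# Pull the 1.5B model (roughly a 1 GB download)
ollama pull deepseek-r1:1.5b

# Or, if Ollama runs only inside the compose stack:
docker exec -it ollama ollama pull deepseek-r1:1.5b
```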
In Open WebUI (visit http://<pi-ip>:3000):
- 🧑💻 Create your admin user
- ⚙️ Go to Settings → Models
- ➕ Pull `deepseek-r1:1.5b` via the UI
Once added, it’s selectable from the top model dropdown.
💬 Basic Usage & Performance
Select deepseek-r1:1.5b and type your prompt:
→ Expect ~6 tokens/sec
→ ~3 GB RAM usage
→ CPU fully engaged
Perfectly usable for daily chats, documentation Q&A, and light pipelines.
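For scripting outside the UI, you can also hit the Ollama REST API directly:

```shell
# Ask the model a question via the Ollama HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Explain RAG in one sentence.",
  "stream": false
}'
```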
📚 Adding RAG with Open WebUI
Open WebUI supports Retrieval‑Augmented Generation (RAG) out of the box.
Steps:
- 📄 Collect `.md` or `.txt` files (policies, notes, docs).
- ➕ In the UI: Workspace → Knowledge → + Create Knowledge Base, then upload your docs.
- 🧠 Then: Workspace → Models → + Add New Model
  - Model name: `DeepSeek-KB`
  - Base model: `deepseek-r1:1.5b`
  - Knowledge: select the knowledge base

The result? 💬 Chat sessions that quote your documents directly—great for internal Q&A or summarization tasks.
🧪 Pipeline Automations
This is where things get really fun. With Pipelines, Open WebUI becomes programmable.
🧱 Start the pipelines container:
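Following the setup described in the Open WebUI Pipelines project, a single `docker run` brings it up (volume and container names here are the conventional ones; adjust as needed):

```shell
# Run the Open WebUI pipelines server on port 9099
docker run -d \
  -p 9099:9099 \
  --add-host=host.docker.internal:host-gateway \
  -v pipelines:/app/pipelines \
  --name pipelines \
  --restart always \
  ghcr.io/open-webui/pipelines:main
```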
Link it in Open WebUI via the admin Settings → Connections page (URL: http://host.docker.internal:9099).
Now build workflows:
- 🔗 Chain prompts (e.g. translate → summarize → translate back)
- 🧹 Clean/filter input and output
- ⚙️ Trigger external actions (webhooks, APIs, home automation)
Write custom Python logic and integrate it as a processing step.
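As a sketch, a minimal pipeline follows the scaffold used by the Open WebUI Pipelines project: a `Pipeline` class exposing a `pipe` method. Names and signatures below follow that convention, but check the project's example pipelines for the current interface:

```python
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        # Name shown in the Open WebUI model dropdown
        self.name = "Shout Pipeline"

    async def on_startup(self):
        # Called when the pipelines server starts
        pass

    async def on_shutdown(self):
        # Called when the pipelines server stops
        pass

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        # Toy transform: upper-case the user's message.
        # A real pipeline would call a model, an API, or chain steps here.
        return user_message.upper()
```

Drop the file into the `pipelines` volume (or upload it through the admin Pipelines settings) and it shows up as a selectable model in the UI.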
🧭 Example Use Cases
| 🧩 Scenario | 🛠️ Setup | ⚡ Pi 5 Experience |
|---|---|---|
| Enterprise FAQ assistant | Upload docs + RAG + KB model | Snappy, contextual answers |
| Personal notes chatbot | KB built from blog posts or .md files | Great for journaling, research |
| Automated translation | Pipeline: Translate → Run → Translate | Works with light latency |
📝 Tips & Gotchas
- 🧠 Stick with 1.5B models for usability.
- 📉 Monitor RAM and CPU; disable swap where possible.
- 🔒 Be cautious with pipeline code—there is no sandboxing.
- 🗂️ Use volume backups to persist state between upgrades.
🎯 Conclusion
Running DeepSeek R1 with Open WebUI, RAG, and Pipelines on a Pi 5 (16 GB) isn’t just viable—it’s powerful. You can create focused, contextual AI tools completely offline. You control the data. You own the results.
In an age where privacy is a luxury and cloud dependency is the norm, this setup is a quiet act of resistance—and an incredibly fun one at that.
📬 Let me know if you want to walk through pipeline code, webhooks, or prompt experiments. The Pi is small—but what it teaches us is huge.
* AI tools were used as a research assistant for this content, but the writing and moderation are human. The included images are AI-generated.