Mac Mini AI — Mac Mini Local LLM, Image Gen, STT on Apple Silicon OpenClaw Skill
Mac Mini AI — run LLMs, image generation, speech-to-text, and embeddings on your Mac Mini. M4 (16-32GB) and M4 Pro (24-64GB) configurations make the Mac Mini a cost-effective local AI node.
Installation
```
clawhub install mac-mini-ai
```
Requires `npm i -g clawhub`.
Mac Mini AI — The $599 AI Node
The Mac Mini is the most cost-effective hardware for local AI. Starting at $599 with 16GB of unified memory, it runs 7B-14B models comfortably. Stack three Mac Minis for the cost of one month of cloud GPU rental — and they run forever with zero ongoing costs.
This skill turns one Mac Mini into an AI server and multiple Mac Minis into a fleet.
Mac Mini configurations for AI
| Config | Chip | Unified Memory | Price | LLM Sweet Spot |
|---|---|---|---|---|
| Mac Mini M4 (16GB) | M4 | 16GB | $599 | 3B-7B models (phi4-mini, llama3.2:3b) |
| Mac Mini M4 (24GB) | M4 | 24GB | $799 | 7B-14B models (phi4, gemma3:12b) |
| Mac Mini M4 (32GB) | M4 | 32GB | $999 | 14B-22B models (qwen3:14b, codestral) |
| Mac Mini M4 Pro (48GB) | M4 Pro | 48GB | $1,399 | 22B-32B models (qwen3:32b) |
| Mac Mini M4 Pro (64GB) | M4 Pro | 64GB | $1,799 | 32B-70B models (llama3.3:70b quantized) |
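Whether a model fits a given configuration is mostly arithmetic on parameter count and quantization: weights take roughly one byte per parameter at 8-bit and half that at 4-bit, plus headroom for the KV cache and macOS. A rough sketch (the 4 GB overhead figure is an assumption, not a measured value):

```python
def fits_in_memory(params_b: float, quant_bits: int, ram_gb: int,
                   overhead_gb: float = 4.0) -> bool:
    """Rough check: model weights at the given quantization, plus
    headroom for the KV cache and macOS, must fit in unified memory."""
    weights_gb = params_b * quant_bits / 8  # 1B params at 8-bit ~ 1 GB
    return weights_gb + overhead_gb <= ram_gb

# A 70B model at 4-bit quantization is ~35 GB of weights: too big for
# a 32GB Mini, fine on 64GB — matching the table above.
print(fits_in_memory(70, 4, 32))  # → False
print(fits_in_memory(70, 4, 64))  # → True
```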
The Mac Mini fleet strategy
Three Mac Minis (32GB each) for $3,000 give you:
- 96GB total unified memory across the fleet
- Each runs a different model simultaneously
- The router picks the best device for every request
- $0/month after purchase — no cloud API costs
```
Mac Mini #1 (32GB) — llama3.3:70b (quantized) ─┐
Mac Mini #2 (32GB) — codestral + phi4          ├──→ Router ←── Your apps
Mac Mini #3 (32GB) — qwen3:14b + embeddings   ─┘
```
Setup
```
pip install ollama-herd   # PyPI: https://pypi.org/project/ollama-herd/
```
On one Mac Mini (the router):
```
herd
```
On every other Mac Mini:
```
herd-node
```
Devices discover each other automatically. No IP configuration, no Docker, no Kubernetes.
Use your Mac Mini
Chat with an LLM
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="phi4",
    messages=[{"role": "user", "content": "Write a Python web scraper"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
Ollama API
```
curl http://localhost:11435/api/chat -d '{
  "model": "gemma3:12b",
  "messages": [{"role": "user", "content": "Explain recursion simply"}],
  "stream": false
}'
```
Image generation (optional)
```
uv tool install mflux   # Install on any Mac Mini
```
```
curl -o art.png http://localhost:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model": "z-image-turbo", "prompt": "a stack of Mac Minis glowing", "width": 512, "height": 512}'
```
Speech-to-text
```
curl http://localhost:11435/api/transcribe -F "file=@meeting.wav" -F "model=qwen3-asr"
```
Embeddings for RAG
```
curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Mac Mini home server local AI"}'
```
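Once documents and queries have embeddings from the endpoint above, retrieval is a cosine-similarity ranking. A minimal sketch (toy 2-D vectors stand in for real nomic-embed-text output, which is much higher dimensional):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(top_k([1.0, 0.1], docs, k=2))  # → [0, 1]
```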
Best models for Mac Mini
| RAM | Best models | Why |
|---|---|---|
| 16GB | phi4-mini (3.8B), gemma3:4b, nomic-embed-text | Small but capable, leaves room for the OS |
| 24GB | phi4 (14B), gemma3:12b, codestral | Sweet spot for single-model use |
| 32GB | qwen3:14b, deepseek-r1:14b, codestral + phi4-mini | Two models simultaneously |
| 48GB | qwen3:32b, deepseek-r1:32b | Larger models, great quality |
| 64GB | llama3.3:70b (quantized) | Near-frontier quality on a Mac Mini |
Monitor your Mac Mini fleet
Dashboard at http://localhost:11435/dashboard — see every Mac Mini's status, loaded models, and queue depths.
```
# Fleet overview
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Model recommendations for your hardware
curl -s http://localhost:11435/dashboard/api/recommendations | python3 -m json.tool
```
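The fleet status JSON can drive simple automation, such as picking the least-loaded Mini before pulling a new model. The field names below (`nodes`, `host`, `queue`) are an assumption about the response shape; adjust them to what your herd version actually returns:

```python
import json

# Sample payload standing in for a real /fleet/status response.
sample = json.loads("""{
  "nodes": [
    {"host": "mini-1", "models": ["llama3.3:70b"], "queue": 0},
    {"host": "mini-2", "models": ["codestral", "phi4"], "queue": 3},
    {"host": "mini-3", "models": ["qwen3:14b", "nomic-embed-text"], "queue": 1}
  ]
}""")

def least_busy(status):
    """Return the hostname of the node with the shortest request queue."""
    return min(status["nodes"], key=lambda n: n["queue"])["host"]

print(least_busy(sample))  # → mini-1
```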
Works with any OpenAI-compatible tool
| Tool | Connection |
|---|---|
| Open WebUI | Ollama URL: http://mac-mini-ip:11435 |
| Aider | aider --openai-api-base http://mac-mini-ip:11435/v1 |
| Continue.dev | Base URL: http://mac-mini-ip:11435/v1 |
| LangChain | ChatOpenAI(base_url="http://mac-mini-ip:11435/v1") |
Contribute
Ollama Herd is open source (MIT). Built for the Mac Mini fleet community:
- Star on GitHub — help other Mac Mini owners find us
- Open an issue — share your Mac Mini fleet setup
- PRs welcome from humans and AI agents. CLAUDE.md gives full context.
- Running a Mac Mini cluster? We'd love to hear about it.
Guardrails
- No automatic downloads — model pulls require explicit user confirmation.
- Model deletion requires explicit user confirmation.
- All requests stay local — no data leaves your network.
- Never delete or modify files in ~/.fleet-manager/.
Author
Twin Geeks
@twinsgeeks
Latest Changes
v1.0.1 · Mar 31, 2026
No changes detected in this release: the version number was bumped to 1.0.1, but all files remain unchanged.