Ollama Cheat Sheet

Basic Commands

Command	Description
`ollama serve`	Start Ollama (if not running as a system service)
`ollama run <model>`	Run a model (and download it if it's not already installed)
`ollama pull <model>`	Pull/download a model without running it
`ollama rm <model>`	Remove/delete a model
`ollama list`	List all installed models
`ollama ps`	List models currently loaded in memory
`ollama show <model>`	Show details/information about a model
`ollama cp <src> <dest>`	Copy a model under a new name (the original is kept)
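The commands above can be combined into a small helper that pulls a model only when it is missing. This is a minimal sketch, not an official recipe: `ensure_model` is a hypothetical name, and it assumes `ollama list` prints a header row followed by one model per line with the name in the first column.

```shell
#!/bin/sh
# ensure_model (hypothetical helper): pull a model only if `ollama list` does not show it.
# Assumes the first column of `ollama list` is the model name and row 1 is a header.
ensure_model() {
  model=$1
  # Names without a tag are listed under the default tag "latest".
  case "$model" in
    *:*) ;;
    *) model="$model:latest" ;;
  esac
  if ollama list | awk 'NR > 1 { print $1 }' | grep -Fxq "$model"; then
    echo "$model already installed"
  else
    ollama pull "$model"
  fi
}

# Example (requires a running Ollama install):
# ensure_model gemma2:2b && ollama run gemma2:2b
```

Matching the whole name with `grep -Fx` avoids treating `gemma2:2b` as installed when only `gemma2:27b` is present.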

Model	Run Command	Notes
Gemma 2 2B	`ollama run gemma2:2b`	Perfect for background agents. Fast, lightweight (~1.5GB VRAM).
Gemma 2 9B	`ollama run gemma2:9b`	Excellent for RAG reasoning. High accuracy, requires ~6GB VRAM.
Gemma 2 27B	`ollama run gemma2:27b`	Heavy-duty reasoning. Requires ~18GB VRAM.
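The table can be turned into a simple lookup that picks the largest Gemma 2 tag fitting a given amount of VRAM. This is only a sketch: `pick_gemma2_tag` is a hypothetical helper, and the thresholds are the approximate figures from the table (in MB), not measured values.

```shell
#!/bin/sh
# pick_gemma2_tag (hypothetical helper): print the largest Gemma 2 tag that fits.
# Argument: available VRAM in MB, as a whole number.
# Thresholds mirror the approximate figures in the table above.
pick_gemma2_tag() {
  vram_mb=$1
  if [ "$vram_mb" -ge 18000 ]; then
    echo "gemma2:27b"
  elif [ "$vram_mb" -ge 6000 ]; then
    echo "gemma2:9b"
  elif [ "$vram_mb" -ge 1500 ]; then
    echo "gemma2:2b"
  else
    # Nothing in the table fits; signal failure to the caller.
    return 1
  fi
}

# Example: ollama run "$(pick_gemma2_tag 8192)"
```

Real memory use also depends on context length and quantization, so treat the result as a starting point.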

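While inside `ollama run <model>`:

Command	Description
`/?`	Show help for the in-session commands
`/show info`	Show details about the loaded model
`/set parameter <name> <value>`	Set a model parameter for the current session
`/clear`	Clear the session context
`/bye`	Exit the session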

Create a file named Modelfile with:

FROM gemma4:e2b
PARAMETER temperature 0.7
SYSTEM You are a helpful programming assistant.

Then build it with `ollama create <custom-name> -f ./Modelfile`.
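The two steps can also be scripted. The sketch below writes the same three-line Modelfile and builds it only when the `ollama` CLI is on the PATH. The function names and the model name `my-assistant` are hypothetical placeholders.

```shell
#!/bin/sh
# write_modelfile (hypothetical helper): write the Modelfile shown above into a directory.
# Arguments: base model tag, output directory.
write_modelfile() {
  base_model=$1
  out_dir=$2
  cat > "$out_dir/Modelfile" <<EOF
FROM $base_model
PARAMETER temperature 0.7
SYSTEM You are a helpful programming assistant.
EOF
}

# build_model (hypothetical helper): run `ollama create` if the CLI is available.
# Arguments: custom model name, directory containing the Modelfile.
build_model() {
  name=$1
  dir=$2
  if command -v ollama >/dev/null 2>&1; then
    ollama create "$name" -f "$dir/Modelfile"
  else
    echo "ollama not found on PATH; skipping build" >&2
    return 1
  fi
}

# Example (my-assistant is a placeholder name):
# write_modelfile gemma4:e2b . && build_model my-assistant . && ollama run my-assistant
```

Keeping the write and build steps separate makes it easy to inspect the generated Modelfile before creating the model.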