Ollama Cheat Sheet

Basic Commands

| Command | Description |
| --- | --- |
| ollama serve | Start Ollama (if not running as a system service) |
| ollama run <model> | Run a model (downloads it first if it is not already installed) |
| ollama pull <model> | Download a model without running it |
| ollama rm <model> | Remove a model |
| ollama list | List all installed models |
| ollama ps | List models currently loaded in memory |
| ollama show <model> | Show details about a model |
| ollama cp <src> <dest> | Copy a model under a new name (the original is kept) |
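
These commands combine naturally in scripts. As a minimal sketch, the helper below pulls a model only when `ollama list` does not already show it. The function name `ensure_model` is hypothetical, and the parsing assumes that `ollama list` prints one header row followed by one model per line, with the full name (including the tag) in the first column.

```shell
#!/bin/sh
# ensure_model is a hypothetical helper, not part of the Ollama CLI.
# Assumption: `ollama list` prints one header row, then one model per
# line with the full name (e.g. gemma2:2b) in the first column.
ensure_model() {
  model="$1"
  # Skip the header row, keep the first column, look for an exact match.
  if ollama list | awk 'NR > 1 { print $1 }' | grep -Fqx "$model"; then
    echo "already installed: $model"
  else
    ollama pull "$model"
  fi
}

# Example (requires Ollama to be installed and running):
#   ensure_model gemma2:2b
```

The match is exact, so the tag matters: a model pulled without a tag is typically listed as `<name>:latest`.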

Gemma 2 Models (Best for Local Agents)

| Model Size | Ollama Run Command | Notes |
| --- | --- | --- |
| Gemma 2 2B | ollama run gemma2:2b | Perfect for background agents. Fast, lightweight (~1.5 GB VRAM). |
| Gemma 2 9B | ollama run gemma2:9b | Excellent for RAG reasoning. High accuracy, requires ~6 GB VRAM. |
| Gemma 2 27B | ollama run gemma2:27b | Heavy-duty reasoning. Requires ~18 GB VRAM. |
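
The VRAM figures above can be turned into a simple selection rule. The sketch below picks the largest Gemma 2 tag that fits a given amount of free VRAM. `pick_gemma2` is a hypothetical helper, and the thresholds are the approximate figures from the table, rounded up to whole gigabytes so that plain integer comparison works; treat them as rough guidance rather than measured limits.

```shell
#!/bin/sh
# pick_gemma2 is a hypothetical helper, not part of the Ollama CLI.
# Thresholds are the table's approximate VRAM figures rounded up to
# whole gigabytes (an assumption made for integer comparison).
pick_gemma2() {
  vram_gb="$1"   # free VRAM in whole GB
  if [ "$vram_gb" -ge 18 ]; then
    echo "gemma2:27b"
  elif [ "$vram_gb" -ge 6 ]; then
    echo "gemma2:9b"
  elif [ "$vram_gb" -ge 2 ]; then
    echo "gemma2:2b"
  else
    echo "not enough VRAM for any Gemma 2 model listed here" >&2
    return 1
  fi
}

# Example (requires Ollama to be installed):
#   ollama run "$(pick_gemma2 8)"
```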

Interactive Prompt Commands

While inside ollama run <model>:

  • /bye - Exit the prompt
  • /? - Show help
  • /set system <prompt> - Set a system prompt
  • /set parameter temperature <value> - Adjust creativity (e.g., 0.7)

Creating Custom Models (Modelfile)

Create a file named Modelfile with:

```dockerfile
FROM gemma4:e2b
PARAMETER temperature 0.7
SYSTEM You are a helpful programming assistant.
```

Then build it using: ollama create <custom-name> -f ./Modelfile
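
For repeatable setups, the Modelfile can be generated from a script instead of written by hand. The following is a minimal sketch: `write_modelfile` and the model name `my-assistant` are hypothetical, and the system prompt is wrapped in triple quotes (which the Modelfile format accepts for multi-line text), so the prompt itself must not contain `"""`.

```shell
#!/bin/sh
# write_modelfile is a hypothetical helper that generates a Modelfile.
# Arguments: base model, temperature, system prompt, output path.
# Assumption: the system prompt does not itself contain """.
write_modelfile() {
  cat > "$4" <<EOF
FROM $1
PARAMETER temperature $2
SYSTEM """$3"""
EOF
}

# Example (my-assistant is a placeholder name; requires Ollama):
#   write_modelfile gemma2:2b 0.7 "You are a helpful programming assistant." ./Modelfile
#   ollama create my-assistant -f ./Modelfile
#   ollama run my-assistant
```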