Running Ollama on your Mac
Why Run AI Models Locally?
Running AI models locally offers several advantages:
- Privacy: Your data never leaves your computer
- No subscription fees or API costs
- Control over model selection and parameters
- Lower latency for many operations
System Requirements
- macOS 12 or later
- At least 8GB RAM (16GB recommended for larger models)
- Sufficient disk space (individual models can take up several gigabytes)
Quick Installation
Installing Ollama on macOS is straightforward using Homebrew:
brew install ollama
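To confirm the installation, you can print the CLI version (the exact output varies by release):
ollama --version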
Starting the Ollama Server
Launch the Ollama server by running:
ollama serve
Keep this terminal window open while using Ollama; the server must stay running to handle model downloads and inference requests.
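Before pulling a model, you can check that the server is reachable on its default port (11434); the root endpoint replies with a short status message:
# Should print a short "Ollama is running" style message
curl http://localhost:11434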
Running Your First Model
To run a model, open a new terminal window and use:
ollama run llama2
This downloads the Llama 2 model on its first run, which can take a few minutes depending on your internet connection, and then drops you into an interactive chat session.
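Inside the session, type a prompt and press Enter; type /bye to exit. You can also pass a one-off prompt on the command line instead of starting an interactive chat:
ollama run llama2 "Explain what a context window is in two sentences."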
Available Models
Ollama supports a wide range of models out of the box. You can list the models you have downloaded locally:
ollama list
You can run specific models:
ollama run mistral # Smaller, faster model
ollama run llama2 # Balanced performance
ollama run codellama # Specialized for code
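You can also download a model ahead of time without starting a chat, and remove models you no longer need to reclaim disk space:
# Download a model without running it
ollama pull mistral
# Delete a local model
ollama rm mistral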
Basic Usage Examples
Here are some common operations you can perform:
# Start a chat session
ollama run llama2
# Adjust parameters such as temperature from inside a chat session
# (type this at the >>> prompt after ollama run llama2)
/set parameter temperature 0.7
# Generate code
ollama run codellama "Write a Python function to calculate fibonacci numbers"
# Process a file
cat your_file.txt | ollama run llama2 "Summarize this text"
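The server also exposes a local HTTP API on port 11434, which is handy for scripting. As a minimal sketch, assuming the llama2 model is already downloaded, you can send a prompt with curl and get the full answer back as a single JSON object:
# Non-streaming request to the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'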
Adding a web UI
Using Docker, you can easily run Open WebUI, a web interface for interacting with local LLMs.
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
After a few tens of seconds, Open WebUI is available at http://localhost:3000. On first launch, create an admin account. You can then pick a model, ask it questions, and get replies directly from your browser.
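Since Open WebUI runs as a regular Docker container, the usual Docker commands apply for checking on it or shutting it down:
# Follow the container logs while it starts up
docker logs -f open-webui
# Stop and remove the container (the open-webui volume keeps your data)
docker stop open-webui && docker rm open-webui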