Ollama Home Server Setup: Run AI on Your Own Hardware (2026)

Every time you run ollama run llama3.1, Ollama is working from a Modelfile behind the scenes — a configuration file that defines exactly how the model behaves. Most users never see it, but understanding Modelfiles lets you create custom AI models tailored to specific tasks, personas, and workflows.

This guide covers everything about Ollama Modelfiles: what they are, all available instructions, and eight practical custom model examples you can use immediately.


What is an Ollama Modelfile?

A Modelfile is a plain text configuration file (similar to a Dockerfile) that defines a custom Ollama model. It specifies:

  • Which base model to build upon
  • The system prompt (personality and instructions)
  • Generation parameters (temperature, context length, etc.)
  • Custom message templates
  • License and metadata

Once you create a Modelfile and build it, your custom model appears alongside all other Ollama models — you can use it from the CLI, API, Open WebUI, and any Ollama-compatible application.

The official Ollama Modelfile documentation on GitHub — a comprehensive reference for all supported instructions and parameters.

View an Existing Modelfile

Before writing your own, see what a model’s Modelfile looks like:

ollama show llama3.1 --modelfile

This prints the full Modelfile for the llama3.1 model — you can see the template, system prompt (if any), and parameters Ollama uses by default. This is a great starting point for customization.
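If you want to inspect that output from a script rather than by eye, a small parser is enough. The sketch below is a minimal illustration, not Ollama's own parser: parse_modelfile and show_modelfile are hypothetical helper names, and the parsing rules (one instruction per line, triple-quoted values may span lines) are a simplification of the real format.

```python
import subprocess


def parse_modelfile(text: str) -> list[tuple[str, str]]:
    """Split Modelfile text into (INSTRUCTION, argument) pairs (simplified)."""
    instructions = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        i += 1
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, arg = line.partition(" ")
        arg = arg.strip()
        single_line_triple = len(arg) >= 6 and arg.endswith('"""')
        if arg.startswith('"""') and not single_line_triple:
            # Value opened with triple quotes: collect lines until they close.
            block = [arg[3:]]
            while i < len(lines) and '"""' not in lines[i]:
                block.append(lines[i])
                i += 1
            if i < len(lines):
                block.append(lines[i].split('"""')[0])
                i += 1
            arg = "\n".join(block).strip()
        elif len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
            arg = arg.strip('"')  # plain or single-line quoted value
        instructions.append((name.upper(), arg))
    return instructions


def show_modelfile(model: str) -> str:
    """Run `ollama show <model> --modelfile` and return its output (needs Ollama installed)."""
    result = subprocess.run(
        ["ollama", "show", model, "--modelfile"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

With Ollama installed, parse_modelfile(show_modelfile("llama3.1")) would give you a list you can filter, for example to pull out only the PARAMETER entries.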


Modelfile Instruction Reference

Instruction | Required? | Purpose
FROM        | Yes       | Base model to build on
SYSTEM      | No        | System prompt: defines personality and behavior
PARAMETER   | No        | Set generation parameters (temperature, etc.)
TEMPLATE    | No        | Custom prompt template format
MESSAGE     | No        | Pre-seeded conversation messages
LICENSE     | No        | License information for the model

FROM — Base Model

The FROM instruction tells Ollama which model to use as the base. Always required — it’s the first instruction in every Modelfile.

# Use a model from Ollama's library
FROM llama3.1

# Use a specific version/quantization
FROM llama3.1:8b-instruct-q4_K_M

# Use a local GGUF file
FROM /path/to/your/model.gguf

# Use another custom Ollama model
FROM my-assistant:latest

SYSTEM — System Prompt

The system prompt is the most powerful customization — it sets the AI’s persona, knowledge context, behavior rules, and response style. It’s hidden from users and silently shapes every response.

SYSTEM """
You are a senior Python developer with 15 years of experience.
You write clean, well-documented, PEP 8 compliant code.
Always include type hints, docstrings, and error handling.
When reviewing code, point out security vulnerabilities first.
"""

PARAMETER — Generation Settings

Parameter      | Default        | Effect
temperature    | 0.8            | Creativity/randomness (0 = deterministic, 2 = chaotic)
num_ctx        | 2048           | Context window in tokens
top_p          | 0.9            | Nucleus sampling; lower = more focused
top_k          | 40             | Vocabulary selection breadth
repeat_penalty | 1.1            | Penalize repeated phrases
seed           | 0 (random)     | Set for reproducible output
stop           | none           | Stop generation at these strings
num_predict    | -1 (unlimited) | Max tokens to generate
num_gpu        | -1 (auto)      | GPU layers to use
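If you keep your settings in code, you can generate the PARAMETER lines instead of typing them. This is a minimal sketch: render_parameters is a hypothetical helper, and it assumes that a list value (such as several stop strings) should become one PARAMETER line per entry.

```python
def render_parameters(params: dict) -> str:
    """Turn a dict of settings into Modelfile PARAMETER lines."""
    lines = []
    for name, value in params.items():
        # A list (e.g. several stop strings) becomes one line per entry.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            if isinstance(v, str):
                v = f'"{v}"'  # quote strings so spaces and symbols survive
            lines.append(f"PARAMETER {name} {v}")
    return "\n".join(lines)


print(render_parameters({
    "temperature": 0.2,
    "num_ctx": 8192,
    "stop": ["User:", "Assistant:"],
}))
```

Paste the printed lines under the SYSTEM block of your Modelfile, then rebuild the model.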

MESSAGE — Pre-seeded Conversation

Pre-seed conversation history to teach the model how to behave through examples (few-shot prompting):

MESSAGE user "What's the weather like?"
MESSAGE assistant "I don't have access to real-time weather data, but I can help you find weather resources or discuss climate patterns for your area."
MESSAGE user "Can you browse the internet?"
MESSAGE assistant "No, I run completely locally on your machine without internet access. This means your conversations are completely private."
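When you have more than a couple of examples, it can be easier to keep them as data and generate the MESSAGE lines. The sketch below assumes that MESSAGE values accept triple quotes the same way SYSTEM does; render_messages is a hypothetical helper name.

```python
def render_messages(pairs: list[tuple[str, str]]) -> str:
    """Render (user, assistant) example pairs as Modelfile MESSAGE lines."""
    lines = []
    for user_text, assistant_text in pairs:
        for role, text in (("user", user_text), ("assistant", assistant_text)):
            if '"' in text or "\n" in text:
                # Triple quotes keep embedded quotes and newlines intact.
                # Text that itself ends in a double quote needs extra care.
                lines.append(f'MESSAGE {role} """{text}"""')
            else:
                lines.append(f'MESSAGE {role} "{text}"')
    return "\n".join(lines)


examples = [
    ("What's the weather like?", "I don't have access to real-time weather data."),
    ("Can you browse the internet?", "No, I run completely locally on your machine."),
]
print(render_messages(examples))
```

Two or three pairs are usually enough to show the model the pattern you want.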

How to Create and Use a Custom Model

The workflow is always the same three steps:

  1. Create a file named Modelfile (no extension)
  2. Build the model: ollama create model-name -f Modelfile
  3. Run it: ollama run model-name
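The same three steps can be scripted, which helps once you rebuild models often. This is a sketch under assumptions: write_modelfile, create_command, and build_model are hypothetical helpers, and build_model only works on a machine where the ollama CLI is on your PATH.

```python
import subprocess
from pathlib import Path


def write_modelfile(directory: Path, base: str, system: str, temperature: float) -> Path:
    """Step 1: write a file named Modelfile (no extension) into `directory`."""
    path = directory / "Modelfile"
    path.write_text(
        f'FROM {base}\n\nSYSTEM """\n{system}\n"""\n\nPARAMETER temperature {temperature}\n',
        encoding="utf-8",
    )
    return path


def create_command(name: str, modelfile: Path) -> list[str]:
    """Step 2: the `ollama create` command as an argument list."""
    return ["ollama", "create", name, "-f", str(modelfile)]


def build_model(name: str, modelfile: Path) -> None:
    """Run the build; raises CalledProcessError if `ollama create` fails."""
    subprocess.run(create_command(name, modelfile), check=True)
```

After build_model("model-name", path) succeeds, step 3 is unchanged: ollama run model-name.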

8 Ready-to-Use Custom Model Examples

Any model in Ollama’s library can be the FROM base for your custom Modelfile — Llama, Mistral, Qwen, Gemma, CodeLlama, and more.

1. Professional Copywriter

FROM llama3.1

SYSTEM """
You are Alex, an expert copywriter with 20 years of experience in digital marketing, 
brand voice, and conversion-focused content.

Your writing style:
- Engaging, clear, and free of corporate jargon
- Uses active voice and short sentences for readability
- Naturally incorporates persuasion without being pushy
- Tailors tone to the brand (casual, authoritative, playful — ask if unsure)

When given a brief, always ask about the target audience and desired action before writing.
Never use filler phrases like "In today's fast-paced world" or "At the end of the day".
"""

PARAMETER temperature 0.9
PARAMETER num_ctx 8192

Build and run it:

ollama create copywriter -f Modelfile
ollama run copywriter

2. Python Code Reviewer

FROM qwen2.5-coder:7b

SYSTEM """
You are a Python code review expert. When given code, analyze it for:

1. BUGS — Logic errors, potential exceptions, off-by-one errors
2. SECURITY — SQL injection, path traversal, hardcoded credentials, unsafe inputs
3. PERFORMANCE — Inefficient loops, unnecessary database calls, memory leaks
4. STYLE — PEP 8 compliance, naming conventions, missing type hints
5. IMPROVEMENTS — Better Python idioms, standard library alternatives

Format your response as:
### Critical Issues (must fix)
### Warnings (should fix)  
### Suggestions (nice to have)
### Improved Version (provide corrected code)

Be specific — cite line numbers and explain WHY each issue matters.
"""

PARAMETER temperature 0.1
PARAMETER num_ctx 16384

Build and run it:

ollama create code-reviewer -f Modelfile
ollama run code-reviewer

3. Concise Summarizer

FROM llama3.1

SYSTEM """
You are a professional summarizer. Your job is to distill content to its essence.

Rules:
- Default to 3-5 bullet points unless user specifies a format
- Use plain language — no jargon unless the source uses it
- Capture the most important insights, not just the first few points
- Never add your own opinions or context not in the original
- If asked for a summary of a summary, make it even shorter

After summarizing, offer: "Would you like a shorter version, key quotes, or action items?"
"""

PARAMETER temperature 0.3
PARAMETER num_ctx 32768
PARAMETER num_predict 1024

Build and run it:

ollama create summarizer -f Modelfile
ollama run summarizer

4. SQL Query Assistant

FROM qwen2.5-coder:7b

SYSTEM """
You are a SQL expert specializing in writing, optimizing, and explaining SQL queries.

You work with all major databases: PostgreSQL, MySQL, SQLite, SQL Server, BigQuery.

When asked to write a query:
1. Ask for the schema if not provided
2. Write a working query with comments explaining complex parts
3. Note any edge cases (NULLs, empty tables, etc.)
4. Suggest an index if the query would benefit from one

When asked to optimize a query:
1. Identify the bottleneck (full table scan, N+1 problem, etc.)
2. Show the improved version
3. Explain why the optimization helps

Always wrap code in SQL code blocks. Always specify which database system you're targeting.
"""

PARAMETER temperature 0.1
PARAMETER num_ctx 8192

Build and run it:

ollama create sql-expert -f Modelfile
ollama run sql-expert

5. Language Tutor (Spanish)

FROM llama3.1

SYSTEM """
You are Sofia, a patient and encouraging Spanish language tutor.

Teaching approach:
- Respond partly in Spanish and partly in English, scaling with the student's level
- Correct grammar mistakes gently — show the correct form and explain why
- Use real-world examples, not textbook sentences
- Introduce new vocabulary in context, not as word lists
- Celebrate progress!

For beginners: mostly English explanations with Spanish examples
For intermediate: mix of both — push them to try in Spanish first
For advanced: mainly Spanish, English only for complex grammar points

Always ask about their level at the start if they don't mention it.
Start every session with: "¡Hola! Soy Sofia..."
"""

PARAMETER temperature 0.8
PARAMETER num_ctx 8192

Build and run it:

ollama create spanish-tutor -f Modelfile
ollama run spanish-tutor

6. Strict JSON Generator

FROM llama3.1

SYSTEM """
You are a JSON generation assistant. You ONLY output valid, parseable JSON.
Never include explanatory text outside the JSON structure.
Never include markdown code fences (no backticks).
Never include comments in the JSON.
If the user's request is unclear, output: {"error": "Please clarify: [what's unclear]"}
"""

PARAMETER temperature 0.0
PARAMETER num_ctx 4096

Build and run it:

ollama create json-generator -f Modelfile
ollama run json-generator

Try it with: “Generate a JSON array of 3 products with name, price, and category fields.”
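A system prompt alone cannot guarantee valid JSON, so it is worth checking the reply before your application uses it. The sketch below is one possible guard, not part of Ollama: parse_model_json is a hypothetical helper that tolerates a stray Markdown fence and otherwise lets json.loads decide. Ollama's REST API also documents a format field that can be set to "json" in a request, which is a useful second line of defense.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def parse_model_json(reply: str):
    """Parse a model reply as JSON, tolerating a Markdown fence around it."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and everything from the closing fence on.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)  # raises json.JSONDecodeError on invalid output


products = parse_model_json('[{"name": "Lamp", "price": 19.99, "category": "Home"}]')
print(products[0]["name"])
```

If json.loads raises, you can retry the request or surface the error instead of passing broken output downstream.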

7. Personal Health Coach

FROM llama3.1

SYSTEM """
You are Jordan, a knowledgeable wellness coach. You provide evidence-based guidance on 
nutrition, exercise, sleep, and stress management.

Your approach:
- Draw on peer-reviewed research, not fads or trends
- Personalize advice — always ask about goals, current habits, and constraints
- Practical and specific — not "eat healthier" but specific meal ideas
- Acknowledge limitations: you don't replace doctors; recommend professional consultation for medical issues

Always include a disclaimer that your advice is general wellness information, not medical advice.
Never recommend specific medications or supplements without noting to check with a doctor.

Start by asking: What's your primary wellness goal right now?
"""

PARAMETER temperature 0.6
PARAMETER num_ctx 8192

Build and run it:

ollama create health-coach -f Modelfile
ollama run health-coach

8. High-Context Technical Support Bot

FROM llama3.1

SYSTEM """
You are a senior technical support specialist. You help users solve software and hardware problems.

Troubleshooting methodology:
1. Understand the problem fully before suggesting solutions
2. Ask for error messages, OS version, and steps to reproduce
3. Start with the most likely causes, not the most complex
4. Give one solution at a time — verify it works before moving to the next
5. Explain WHY a solution works, not just what to do

Communication style:
- Clear numbered steps, never walls of text
- Use code blocks for commands
- Acknowledge frustration ("I know this is annoying — let's fix it")
- Confirm success: "Did that solve it?" after each suggestion

If a problem is outside your knowledge, say so clearly rather than guessing.
"""

PARAMETER temperature 0.4
PARAMETER num_ctx 8192

Build and run it:

ollama create tech-support -f Modelfile
ollama run tech-support

Managing Your Custom Models

Ollama’s GitHub repository: the open-source engine that processes Modelfiles and runs your custom models locally.

# List all models (including custom ones)
ollama list

# Inspect your custom model's Modelfile
ollama show copywriter --modelfile

# View model parameters
ollama show copywriter --parameters

# Delete a custom model
ollama rm copywriter

# Rebuild after editing the Modelfile
ollama create copywriter -f Modelfile

Use Custom Models via API

Custom models work identically to standard models through the API — just use the custom name:

import ollama

# Use your custom copywriter model
response = ollama.chat(
    model='copywriter',
    messages=[{
        'role': 'user',
        'content': 'Write a product description for wireless noise-cancelling headphones.'
    }]
)
print(response['message']['content'])

Or via curl:

curl http://localhost:11434/api/chat \
  -d '{
    "model": "code-reviewer",
    "messages": [{"role": "user", "content": "Review this code: def add(a,b): return a+b"}],
    "stream": false
  }'

Import a Custom GGUF Model

If you have a GGUF model file downloaded from Hugging Face or elsewhere, import it into Ollama with a Modelfile:

FROM C:\Users\Neeta\Downloads\my-finetuned-model.gguf

SYSTEM "You are a specialized assistant fine-tuned for customer support."

PARAMETER temperature 0.5
PARAMETER num_ctx 4096

Build and run it:

ollama create my-custom-model -f Modelfile
ollama run my-custom-model

During ollama create, Ollama copies the model weights into its own local model store, so later runs load from that store instead of re-importing the file. The GGUF file itself doesn’t need to remain at the original path.
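Before importing a large download, a quick sanity check can save a failed build. GGUF files begin with the four ASCII bytes GGUF followed by a little-endian 32-bit version number, so a few lines of Python can confirm the file is at least the right format. gguf_version is a hypothetical helper, and passing this check does not prove the model will load.

```python
import struct
from typing import Optional


def gguf_version(path: str) -> Optional[int]:
    """Return the GGUF format version, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)  # 4-byte magic + 4-byte version
    if len(header) < 8 or header[:4] != b"GGUF":
        return None
    return struct.unpack("<I", header[4:8])[0]
```

Call it with the same path you put in the FROM line; a None result usually means a truncated download or a file in another format (for example safetensors).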


Modelfile Best Practices

  • Be specific in system prompts — vague instructions produce inconsistent results. Tell the model exactly what to do and what to avoid.
  • Lower temperature for factual/code tasks — use 0.1–0.3 for code review, SQL, or JSON generation. Use 0.7–1.0 for creative writing.
  • Increase num_ctx for document tasks — if your model needs to read long inputs, increase the context window (uses more VRAM but necessary for accuracy).
  • Use few-shot MESSAGE examples — for specialized output formats (tables, structured reports), showing 2–3 examples in MESSAGE blocks dramatically improves consistency.
  • Name models descriptively: python-reviewer is clearer than model1 when you have many custom models.
  • Version your Modelfiles — keep Modelfiles in a Git repository so you can track changes and roll back.
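To get a feel for why a larger num_ctx costs memory, you can estimate the size of the key/value cache that grows with the context window. The sketch below is a rough estimate under stated assumptions: the defaults are the published Llama 3.1 8B architecture (32 layers, 8 key/value heads, head dimension 128), the cache is assumed to hold 16-bit values, and it ignores model weights, compute buffers, and any cache quantization Ollama may apply.

```python
def kv_cache_bytes(num_ctx: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size: keys and values for every layer and token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value  # 2 = K and V
    return per_token * num_ctx


for ctx in (2048, 8192, 32768):
    print(ctx, kv_cache_bytes(ctx) / 2**30, "GiB")
# 2048 -> 0.25 GiB, 8192 -> 1.0 GiB, 32768 -> 4.0 GiB
```

Under these assumptions, moving from the default 2048 to the summarizer's 32768 adds roughly 4 GiB on top of the model weights, which is why long-context models can stop fitting on smaller GPUs.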

Frequently Asked Questions

Does changing the system prompt really make that big a difference?

Yes — it’s the single most impactful customization. The same base model (Llama 3.1, for example) produces dramatically different responses with different system prompts. A well-crafted system prompt can make a general model behave like a domain specialist, maintain a consistent persona, output structured formats, or avoid entire categories of responses.

What’s the difference between a Modelfile SYSTEM and sending a system message in the API?

Functionally, both set the system prompt that guides the model. The difference is persistence and convenience. A Modelfile stores the system prompt as the model’s default, so every call uses it without you having to include it each time. An API system message must be sent with every request. For applications that always use the same system prompt, a Modelfile is cleaner. For applications that need to switch system prompts dynamically, use the API.
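The two approaches differ only in what the request body contains. The sketch below builds an /api/chat request with the standard library; chat_request and send are hypothetical helpers, and the host assumes a default local install. My understanding is that a system message in the request takes the place of the Modelfile default for that call, but check this against your Ollama version.

```python
import json
import urllib.request


def chat_request(model, user_text, system=None, options=None,
                 host="http://localhost:11434"):
    """Build a POST request for /api/chat, optionally with a per-request system prompt."""
    messages = []
    if system is not None:
        # Per-request system prompt; omit it to rely on the Modelfile's SYSTEM.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_text})
    payload = {"model": model, "messages": messages, "stream": False}
    if options:
        payload["options"] = options  # e.g. {"temperature": 0.2}
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the reply text (needs a running Ollama server)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["message"]["content"]
```

send(chat_request("copywriter", "Write a tagline.")) relies on the Modelfile prompt, while passing system="..." switches persona for that one request.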

How do I make the same system prompt work for multiple base models?

Create a separate Modelfile for each base model, keeping the SYSTEM prompt identical. You can script this to maintain them together — change the system prompt in one place, rebuild all variants. This is useful for A/B testing which base model performs best for your specific use case.
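One way to script this is to keep the system prompt in a single string and generate a Modelfile per base model. This is a minimal sketch: write_variants and the .Modelfile naming scheme are hypothetical choices, not an Ollama convention.

```python
from pathlib import Path


def write_variants(system_prompt: str, bases: dict, out_dir) -> dict:
    """Write one Modelfile per base model, all sharing the same SYSTEM prompt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for model_name, base in bases.items():
        path = out_dir / f"{model_name}.Modelfile"
        path.write_text(
            f'FROM {base}\n\nSYSTEM """\n{system_prompt}\n"""\n',
            encoding="utf-8",
        )
        written[model_name] = path
    return written
```

For example, write_variants(prompt, {"reviewer-llama": "llama3.1", "reviewer-qwen": "qwen2.5-coder:7b"}, "variants") produces two files; build each with ollama create <name> -f <path> and compare the results on the same inputs.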



About this guide: All Modelfile examples tested with Ollama 0.6.x using Llama 3.1 8B and Qwen2.5-Coder 7B on Windows 11 and Ubuntu 22.04. Last updated March 2026.
