🤖 Working with AI Models in Libre WebUI

This guide explains how to use AI models in Libre WebUI. Whether you're new to local AI or an experienced user, it will help you get the most out of your local AI assistant.

Reading Time

~8 minutes - Complete guide from basics to advanced model management

🎯 What You Can Do

Libre WebUI supports the core features you'd expect from a modern AI assistant:

💬 Chat & Conversations

  • Have natural conversations with AI models
  • Get streaming responses (words appear as they're generated)
  • Use advanced settings like temperature and creativity controls
  • Create custom system prompts to change the AI's personality

🖼️ Vision & Images

  • Upload images and ask questions about them
  • Analyze charts, diagrams, and photographs
  • Get help with visual tasks like describing scenes or reading text in images

📝 Structured Responses

  • Request responses in specific formats (JSON, lists, etc.)
  • Get organized summaries and analysis
  • Use predefined templates for common tasks

🛠️ Model Management

  • Download and manage AI models locally
  • Switch between different models for different tasks
  • Monitor model performance and memory usage

🧠 AI Models Guide

Perfect for newcomers and everyday use:

Model          Size    Memory     Best For
gemma3:4b      ~4GB    8GB RAM    Most users - best balance
llama3.2:1b    ~1GB    4GB RAM    Fast responses - ultra-quick
phi4:14b      ~14GB   16GB RAM    Power users - Microsoft's best
Recommended Starting Point

gemma3:4b is currently the best single-GPU model for most users. Great balance of speed and intelligence.

Advanced Models

  • llama3.3:70b - State-of-the-art performance (40GB+)
  • deepseek-r1:32b - Advanced reasoning (16GB)
  • devstral:24b - Best for coding (12GB)
  • qwen2.5vl:32b - Image understanding (16GB)

Model Sizes

  • 1B = ~1GB memory, ultra-fast
  • 4B = ~3GB memory, great balance
  • 14B = ~8GB memory, powerful
  • 32B+ = 16GB+ memory, professional use
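
These figures follow a rough rule of thumb (our assumption, not an official spec): most local models ship 4-bit quantized, which works out to roughly half a gigabyte per billion parameters, plus some fixed runtime overhead. A quick sketch:

```python
def estimate_memory_gb(params_billion: float,
                       bytes_per_param: float = 0.5,
                       overhead_gb: float = 0.75) -> float:
    """Rough memory estimate for a quantized local model.

    bytes_per_param=0.5 assumes 4-bit quantization; use ~2.0 for fp16.
    overhead_gb is a ballpark for KV cache and runtime buffers.
    """
    return params_billion * bytes_per_param + overhead_gb

# Ballpark the sizes listed above:
for size in (1, 4, 14, 32):
    print(f"{size}B -> ~{estimate_memory_gb(size):.0f}GB")
```

Treat the output as an order-of-magnitude guide; actual usage depends on quantization, context length, and runtime.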

🚀 Getting Started with Models

Step 1: Download Your First Model

  1. Go to the Models section in the sidebar
  2. Click "Pull Model"
  3. Enter a model name like gemma3:4b
  4. Wait for the download to complete
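
The "Pull Model" button typically drives an Ollama backend under the hood. If you'd rather script the download, the same pull can be issued against Ollama's REST API directly; a minimal sketch, assuming Ollama's default address of localhost:11434 (adjust for your setup):

```python
import json
import urllib.request

def build_pull_request(model: str,
                       host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build the POST request that asks an Ollama server to download a model."""
    body = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_pull_request("gemma3:4b")
# urllib.request.urlopen(req) would start the download; the server
# streams back JSON progress lines until the pull completes.
```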

Step 2: Start Chatting

  1. Go back to the Chat section
  2. You'll see your model is now available
  3. Type a message and press Enter
  4. Watch the AI respond in real-time!

Step 3: Try Advanced Features

  • Upload an image (with vision models like qwen2.5vl:32b)
  • Adjust settings like creativity and response length
  • Create custom prompts to change the AI's behavior

🎨 Creative Use Cases

Writing Assistant

"Help me write a professional email to..."
"Proofread this document and suggest improvements"
"Create a story outline about..."

Learning & Research

"Explain quantum physics in simple terms"
"What are the pros and cons of..."
"Help me understand this concept by giving examples"

Programming Helper (with devstral:24b)

"Create a complete web application with authentication"
"Debug this complex codebase and suggest improvements"
"Build an autonomous coding agent for this project"

Image Analysis (with qwen2.5vl:32b)

"What's in this image and what does it mean?"
"Extract all text from this document accurately"
"Analyze this complex chart and provide insights"

Advanced Reasoning (with deepseek-r1:32b)

"Think through this complex problem step by step"
"What are the hidden implications of this decision?"
"Solve this multi-step logical puzzle"

⚙️ Advanced Features

Custom System Prompts

Change how the AI behaves by setting a system prompt:

"You are a helpful programming tutor. Always explain concepts step by step."
"You are a creative writing assistant. Help me brainstorm ideas."
"You are a professional editor. Focus on clarity and grammar."
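
In API terms (assuming an Ollama-style chat endpoint behind the UI), a system prompt is simply the first message in the conversation. A sketch of the payload the UI builds for you:

```python
def chat_payload(system_prompt: str, user_message: str,
                 model: str = "gemma3:4b") -> dict:
    """Chat request whose system message steers the model's behavior."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }

payload = chat_payload(
    "You are a helpful programming tutor. Always explain concepts step by step.",
    "What is recursion?",
)
```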

Structured Outputs

Ask for responses in specific formats:

"List the pros and cons in JSON format"
"Give me a summary with bullet points"
"Create a table comparing these options"
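
Beyond asking in the prompt, Ollama-backed setups can enforce valid JSON at the API level via the request's `format` field (a backend feature; the exact UI toggle may vary). A sketch of such a request:

```python
# Chat request that constrains the model to emit valid JSON.
payload = {
    "model": "gemma3:4b",
    "messages": [
        {"role": "user",
         "content": "List the pros and cons of remote work as JSON."},
    ],
    "format": "json",   # backend rejects non-JSON token sequences
    "stream": False,
}
```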

Temperature & Creativity

  • Low temperature (0.1-0.3): Focused, consistent responses
  • Medium temperature (0.5-0.7): Balanced creativity and coherence
  • High temperature (0.8-1.0): More creative and varied responses
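
Under the hood these settings map to sampling options on each request (Ollama's `options.temperature`, in the usual setup). A sketch of how a payload picks up the setting:

```python
def with_temperature(payload: dict, temperature: float) -> dict:
    """Return a copy of a chat payload with the sampling temperature set."""
    out = dict(payload)
    out["options"] = {**out.get("options", {}), "temperature": temperature}
    return out

base = {"model": "gemma3:4b",
        "messages": [{"role": "user", "content": "Name three colors."}]}
focused = with_temperature(base, 0.2)   # consistent, repeatable answers
creative = with_temperature(base, 0.9)  # more varied, exploratory answers
```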

🔍 Model Capabilities Reference

Model Type             Best For                            Memory Needed   Speed
Compact Efficiency     Quick tasks, edge devices           1-3GB           Ultra-fast
Single-GPU Champions   Balanced performance, daily use     3-8GB           Fast
Reasoning Specialists  Complex thinking, problem-solving   12-16GB         Medium
Vision-Language        Image analysis, multimodal tasks    8-32GB          Medium
Coding Agents          Advanced programming, debugging     12-24GB         Medium
State-of-the-Art       Maximum capability, research        32GB+           Slower

💡 Tips for Better Results

Writing Better Prompts

  • Be specific: "Write a 200-word summary" vs "Summarize this"
  • Give context: "I'm a beginner" or "I'm an expert in..."
  • Ask for examples: "Show me examples of..."
  • Specify format: "Give me a numbered list" or "Explain step by step"

Managing Performance

  • Use smaller models for simple tasks to save memory
  • Switch models based on your current task
  • Monitor memory usage in the Models section
  • Keep frequently used models loaded for faster responses

Privacy & Security

  • ✅ Your data never leaves your computer
  • ✅ No internet connection required (after downloading models)
  • ✅ Full control over your conversations
  • ✅ No tracking or data collection

🆘 Troubleshooting

Model won't download?

  • Check your internet connection
  • Make sure you have enough disk space
  • Try a smaller model first

Responses are slow?

  • Try a smaller model (1B or 3B parameters)
  • Close other applications to free up memory
  • Check if multiple models are loaded

AI gives strange responses?

  • Adjust the temperature setting
  • Try rephrasing your question
  • Use a different model for your task

Ready to explore? Head to the Quick Start Guide to get your first conversation going!