Getting Started: A Novice-Friendly Guide to Running Local AI With Ollama and AnythingLLM
This guide walks you through setting up a powerful, private, and user-friendly local AI environment on your computer. You'll use Ollama to run local large language models (LLMs), AnythingLLM for a modern GUI with chat threads and search, and, optionally, the Nomic Embed model for smarter local document search (Retrieval-Augmented Generation, or RAG). All steps are designed for users with minimal technical experience.
Note: Models can be large, anywhere from a couple of gigabytes to tens of gigabytes each, so check your available storage before you set up a local AI. One way to check a model's size before downloading is to visit the Ollama library (https://ollama.com/library) and select a model (e.g., deepseek-r1, gemma3n).
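On Mac/Linux you can also check your free space from the terminal, and, once Ollama is installed (Step 1), see exactly how much room each downloaded model takes:

```bash
# Check free disk space in your home directory (Mac/Linux)
df -h ~

# After Step 1: list downloaded models and their on-disk sizes
ollama list
```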
This setup:
- Is fully local and private: All data stays on your device.
- Offers a modern, easy interface: AnythingLLM gives you a ChatGPT-like GUI.
- Is flexible: Use local models for privacy, or connect to state-of-the-art (SOTA) models via Azure if needed.
- Has powerful search: The Nomic embedder makes your document search smarter and faster.
Setup
Step 1: Install Ollama
Ollama lets you run large language models (LLMs) locally.
- Download Ollama: Go to https://ollama.com/download and choose the installer for your system (Mac/Windows/Linux).
- Install: Run the installer and follow the prompts (Linux users can instead use the one-line script shown below).
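On Linux, the download page offers a one-line install script; Mac and Windows users should just run the graphical installer:

```bash
# Official Linux install script from https://ollama.com/download
curl -fsSL https://ollama.com/install.sh | sh

# Confirm the install worked
ollama --version
```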
Step 2: Run Ollama
- Open your terminal (Command Prompt or PowerShell on Windows, Terminal on Mac/Linux).
- Start the Ollama server: ollama serve.
If Ollama is already running (the Mac and Windows desktop apps start it automatically), this command reports that the address is already in use; that's fine. Either way, you can verify the server is up as shown below.
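By default the Ollama server listens on port 11434, so a quick check looks like this:

```bash
# Start the Ollama server (skip if the desktop app already runs it in the background)
ollama serve

# In a second terminal: the server answers on port 11434 by default
curl http://localhost:11434
# Expected response: Ollama is running
```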
Step 3: Choose and Download a Model
- High-end computers (e.g., Mac M3/M4 with 64GB+ RAM): ollama run gemma3:27b
- Most laptops/desktops: ollama run gemma3:4b
You can experiment with gemma3:12b or llama3:8b if your system handles the 4b model smoothly.
This command downloads the model and loads it into memory. The first run is slow because the model weights have to be downloaded; later runs load from disk and start much faster.
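For example, to download gemma3:4b and give it a quick test:

```bash
# Download the model without starting a chat session
ollama pull gemma3:4b

# Start an interactive chat (downloads automatically if not already present)
ollama run gemma3:4b

# Or send a one-off prompt instead of an interactive session
ollama run gemma3:4b "Summarize what RAG means in one sentence."

# See which models are installed and how much space they use
ollama list
```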
Step 4: Install AnythingLLM (GUI)
AnythingLLM is a desktop app to chat with your models in a user-friendly way.
- Download AnythingLLM Desktop: Go to https://anythingllm.com/desktop and download the installer for your OS.
- Install and launch: Run the installer, then open AnythingLLM.
Step 5: Connect AnythingLLM to Ollama
- In AnythingLLM, go to Settings.
- Select Ollama as your model provider.
- Choose your preferred model (e.g., gemma3:4b or llama3:8b) from the dropdown.
- Save settings.
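If your model doesn't appear in the dropdown, check what Ollama is actually serving; AnythingLLM populates its model list from the Ollama API (default base URL http://localhost:11434):

```bash
# Ask the local Ollama API which models are available to clients like AnythingLLM
curl http://localhost:11434/api/tags

# If a model is missing from the list, pull it first, e.g.:
ollama pull gemma3:4b
```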
Step 6: Supercharge Your Document Search With Nomic Embedder (Optional/Advanced)
If you want smarter document search (RAG):
- Install Nomic Embed Model: Visit https://www.nomic.ai/embed and follow the setup instructions for your OS.
- Configure AnythingLLM to use Nomic Embed:
  - In AnythingLLM's settings, look for the "Embedder" section.
  - Switch from the default embedder to the Nomic embedder (you may need to specify the path or endpoint).
  - Set Chunk Size to 2000 and Overlap to 200 as a good starting point for larger documents.
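One convenient alternative to the Nomic website instructions, sketched below, is to serve the embedder through Ollama itself, since the Ollama library hosts nomic-embed-text; recent AnythingLLM versions let you select Ollama as the embedding provider and point it at the same endpoint you configured in Step 5:

```bash
# Pull the Nomic embedding model from the Ollama library
ollama pull nomic-embed-text

# Sanity check: request an embedding vector from the local API
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "local RAG test"}'
```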
Step 7: Start Chatting & Exploring
- Chat: Use the AnythingLLM GUI to ask questions, try threads, and manage documents.
- Document Search: Upload PDFs or text files; AnythingLLM will search and summarize them using your local LLM and, if you configured it, the Nomic embedder.
Tips & Troubleshooting
- The first time you load a model, it may take a while (especially for larger models).
- If you get “out of memory” errors, try a smaller model.
- For maximum privacy, use only local models. For access to SOTA models, AnythingLLM can be configured to use your secure Azure subscription (if you have one).
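If you're not sure what's using memory, Ollama can report which models are currently loaded:

```bash
# Show loaded models, their memory footprint, and CPU/GPU placement
ollama ps

# Unload a model you're done with to free memory
ollama stop gemma3:27b
```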
Need help?
- Visit the Ollama documentation: https://github.com/ollama/ollama/tree/main/docs
- Visit the AnythingLLM documentation and support site: https://docs.anythingllm.com
- Reach out to your group's AI lead for personalized assistance.
Created by David Liebovitz, MD.