Getting Started: A Novice-Friendly Guide to Running Local AI with Ollama and AnythingLLM

This guide walks you through setting up a powerful, private, and user-friendly local AI environment on your computer. You’ll use Ollama (to run local large language models), AnythingLLM (a modern GUI with chat threads and search), and, optionally, the Nomic Embed model to supercharge local document search (RAG). All steps are designed for users with minimal technical experience.

Why This Setup?

  • Fully Local and Private: All data stays on your device.
  • Modern, Easy Interface: AnythingLLM gives you a ChatGPT-like GUI.
  • Flexible: Use local models for privacy, or connect to state-of-the-art (SOTA) models via Azure if needed.
  • Powerful Search: The Nomic embedder makes your document search smarter and faster.

Step 1: Install Ollama

Ollama lets you run large language models (LLMs) locally.

  1. Download Ollama:
    Go to https://ollama.com/download and choose the installer for your system (Mac/Windows/Linux).
  2. Install:
    Run the installer and follow the prompts.
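
To confirm the installation, open your terminal and check the version (the exact output varies by release):

    ollama --version   # prints the installed Ollama version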

Step 2: Run Ollama

  1. Open Your Terminal (Command Prompt on Windows, Terminal on Mac/Linux).
  2. Start the Ollama server:
    ollama serve

If Ollama is already running (the desktop installers usually start it automatically in the background), this command will tell you.
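
To double-check that the server is listening, send a quick request to Ollama’s default local endpoint (port 11434):

    curl http://localhost:11434
    # Replies with "Ollama is running" when the server is up.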


Step 3: Choose and Download a Model

  • High-end computers (e.g., Mac M3/M4 with 64GB+ RAM):
    • ollama run gemma3:27b
  • Most laptops/desktops:
    • ollama run gemma3:4b

(You can experiment with gemma3:12b or llama3:8b if your system handles the 4b model smoothly.)

This command downloads the model on first use, then loads it. The first run is slow (downloads can be several gigabytes); subsequent uses are fast.
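
Once the download finishes, you can sanity-check the model from the terminal before moving on to the GUI (substitute whichever model you chose):

    ollama list                          # shows every model downloaded so far
    ollama run gemma3:4b "Say hello"     # one-off prompt; prints a reply and exits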


Step 4: Install AnythingLLM (GUI)

AnythingLLM is a desktop app to chat with your models in a user-friendly way.

  1. Download AnythingLLM Desktop:
    Go to https://anythingllm.com/desktop and download the installer for your OS.
  2. Install and Launch:
    Run the installer, then open AnythingLLM.

Step 5: Connect AnythingLLM to Ollama

  1. In AnythingLLM, go to Settings.
  2. Select Ollama as your model provider.
  3. Choose your preferred model (e.g., gemma3:4b or llama3:8b) from the dropdown.
  4. Save settings.
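
If no models show up in the dropdown, make sure the Ollama server from Step 2 is still running and that the base URL in AnythingLLM’s Ollama settings points at the local server (http://127.0.0.1:11434 by default). You can also ask the server directly which models it has:

    curl http://localhost:11434/api/tags
    # Returns a JSON list of the models available locally.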

Step 6: (Optional, Advanced) Supercharge Your Document Search (RAG) with Nomic Embedder

If you want smarter document search (Retrieval-Augmented Generation):

  1. Install Nomic Embed Model:
    Visit https://www.nomic.ai/embed and follow the setup instructions for your OS.
  2. Configure AnythingLLM to use Nomic Embed:
    • In AnythingLLM's settings, look for the “Embedder” section.
    • Switch from the default embedder to the Nomic embedder (you may need to specify the path or endpoint).
    • Set Chunk Size to 2000 and Overlap to 200 for optimal performance with larger documents.
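
If you would rather stay entirely within the Ollama ecosystem, recent AnythingLLM versions also let you pick Ollama as the embedding provider; in that case, a common approach is to pull Nomic’s embedding model through Ollama and select it in the Embedder section:

    ollama pull nomic-embed-text
    # Downloads Nomic's text-embedding model for local RAG use.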

Step 7: Start Chatting and Exploring

  • Chat: Use the AnythingLLM GUI to ask questions, try threads, and manage documents.
  • Document Search: Upload PDFs or text files; AnythingLLM will help you search and summarize them using your local LLM and the Nomic embedder (if configured).

Tips & Troubleshooting

  • The first time you load a model, it may take a while (especially for larger models).
  • If you get “out of memory” errors, try a smaller model (the command below shows what is currently loaded).
  • For the most privacy, use only local models. For state-of-the-art (SOTA) access, AnythingLLM can be configured to use your secure Azure subscription (if you have one).
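
To see which models are currently loaded and how much memory they are using, recent Ollama releases include a ps command:

    ollama ps
    # Lists running models with their size and whether they run on CPU or GPU.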

Summary

  • Install Ollama (engine for local AI models)
  • Run Ollama (ollama serve)
  • Download/Run a Model (e.g., ollama run gemma3:4b)
  • Install AnythingLLM Desktop (easy GUI)
  • Connect AnythingLLM to Ollama in settings
  • (Optional) Configure Nomic Embed for better RAG (document search)
  • Start using your private, secure, and powerful local AI!

Happy experimenting! This setup gives you a secure, private, and highly functional AI assistant, right on your own device.
