Getting Started with ChromaDB on macOS: Install, Run, and Build a RAG Pipeline

Step‑by‑step guide to install ChromaDB on macOS, run it embedded or via Docker, and create a simple RAG pipeline with OpenAI or local embeddings.

What is Chroma DB?

Chroma DB (often just called Chroma) is an open‑source vector database built specifically for AI/LLM applications.
It stores dense embeddings (vectors) together with arbitrary metadata and provides ultra‑fast similarity search (k‑NN, cosine, inner‑product, etc.).

  • Collections – logical groups of vectors, metadata, and IDs, similar to tables in a relational DB.
  • Embeddings – any high‑dimensional float array (typically 256‑1536 dims).
  • Metadata – JSON‑serialisable dicts that travel with each vector (e.g., source document, tags).
  • Indexes – a default HNSW index for low‑latency approximate nearest‑neighbour queries.
  • Persistence – data lives on‑disk (SQLite plus binary index files).
  • Server mode – runs as a standalone HTTP server (REST API), so any language can talk to it.
  • CLI – the chroma command ships with the Python package (e.g., chroma run to serve a local database).
  • License – Apache 2.0, actively maintained by the Chroma team.

Typical use‑cases include Retrieval‑Augmented Generation (RAG), semantic search, recommendation, few‑shot prompting, and hybrid vector‑+metadata queries.
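To make "similarity search" concrete, here is a dependency‑free sketch of what a k‑NN query does conceptually: brute‑force cosine ranking over a handful of stored vectors. HNSW returns roughly the same answer without scanning every vector; the names below are illustrative, not Chroma APIs.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity: 0.0 for identical directions, up to 2.0 for opposite
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def knn(query, vectors, k=2):
    # rank every stored vector by distance to the query (brute force)
    ranked = sorted(vectors.items(), key=lambda kv: cosine_distance(query, kv[1]))
    return [vid for vid, _ in ranked[:k]]

store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(knn([1.0, 0.05, 0.0], store, k=2))  # → ['doc-a', 'doc-b']
```

This is exactly the ranking a vector database performs; the engineering value of Chroma is doing it over millions of vectors in sub‑linear time.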


Installing Chroma on macOS

You have three practical ways to get Chroma running on a Mac (Intel or Apple Silicon):

1️⃣ Python‑only (embedded) installation

Best for pure‑Python projects where you want Chroma to run in‑process.

# 1️⃣ Install a recent Python (≥3.9)
brew install python@3.11          # or use pyenv/conda

# 2️⃣ Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate   # bash/zsh; use activate.fish for fish

# 3️⃣ Install Chroma (and optional CLI)
pip install --upgrade pip setuptools wheel
pip install chromadb            # core library; also installs the chroma CLI

# 4️⃣ Verify the installation
python - <<'PY'
import chromadb, uuid, numpy as np

client = chromadb.PersistentClient(path="./my_chroma_db")
coll = client.get_or_create_collection(name="demo")   # idempotent across reruns
ids = [str(uuid.uuid4()) for _ in range(3)]
emb = np.random.rand(3, 128).tolist()
coll.add(ids=ids, embeddings=emb, metadatas=[{'i': i} for i in range(3)])

res = coll.query(query_embeddings=[emb[0]], n_results=2)
print("Result →", res)
PY

Data location: ./my_chroma_db/ (a chroma.sqlite3 file plus per‑collection index folders). Change the path to any folder you like, even an external drive.

CLI example (the chroma command is installed alongside the library):

chroma run --path ./my_chroma_db   # serve the local database over HTTP

2️⃣ Docker‑based HTTP server

Ideal when you want a language‑agnostic endpoint or need process isolation.

Prerequisites

  • Install Docker Desktop for macOS (supports both Intel and Apple Silicon).

Run the official image

docker pull chromadb/chroma:latest

docker run -d \
  -p 8000:8000 \
  -v $(pwd)/chroma_data:/chroma/chroma \
  --name chroma_server \
  chromadb/chroma
  • -p 8000:8000 exposes the REST API on http://localhost:8000.
  • The bind‑mount $(pwd)/chroma_data makes the data persistent across container restarts.

Quick sanity check

curl http://localhost:8000/api/v1/heartbeat
# → {"nanosecond heartbeat": ...}   (recent versions also expose /api/v2/heartbeat)

Access from Python

pip install chromadb   # or the lightweight chromadb-client package
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)

coll = client.create_collection(name="http_demo")
# use coll.add(), coll.query() just like the embedded version

Docker‑Compose (optional)

version: "3.9"
services:
  chroma:
    image: chromadb/chroma:latest
    container_name: chroma
    ports:
      - "8000:8000"
    volumes:
      - ./chroma_data:/chroma/chroma
    restart: unless-stopped

Run with docker compose up -d.

Apple Silicon tips

  • The image is multi‑arch; Docker Desktop pulls the arm64 variant automatically.
  • If you see exec format error, force a refresh: docker pull --platform linux/arm64/v8 chromadb/chroma.
  • Adjust Docker Desktop memory (Settings → Resources) – 4 GB is fine for modest workloads; 8‑12 GB for >1 M vectors.
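The memory guidance above can be sanity‑checked with a back‑of‑the‑envelope calculation: raw vector storage is roughly count × dims × 4 bytes for float32, and the HNSW graph adds overhead on top. A small calculator sketch (the 1.5× overhead factor is a rough assumption, not a Chroma‑published figure):

```python
def estimate_vector_memory_gb(count, dims, bytes_per_float=4, index_overhead=1.5):
    # raw float32 storage, inflated by an assumed HNSW graph-overhead factor
    raw_bytes = count * dims * bytes_per_float
    return raw_bytes * index_overhead / (1024 ** 3)

# 1M vectors at 768 dims: ~2.9 GB raw, ~4.3 GB with the assumed overhead
print(round(estimate_vector_memory_gb(1_000_000, 768), 1))  # → 4.3
```

Under these assumptions, 1 M 768‑dim vectors fit comfortably in an 8 GB Docker allocation, which matches the guidance above.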

3️⃣ Build from source (advanced)

Only needed if you plan to modify the engine.

git clone https://github.com/chroma-core/chroma.git
cd chroma
brew install poetry   # if you don’t have it
poetry install
poetry run chroma run --path ./dev_data   # launches the HTTP server

Note: The index uses Rust under the hood. Ensure Xcode command‑line tools (xcode-select --install) and Rust (brew install rustup && rustup default stable) are installed.


A Minimal RAG Pipeline on macOS

Below is a self‑contained script that:

  1. Generates embeddings (OpenAI or a local Sentence‑Transformer).
  2. Persists them in Chroma.
  3. Retrieves the top‑3 most similar passages for a query.
# -------------------------------------------------
# pip install chromadb openai sentence-transformers
# -------------------------------------------------
import os, uuid
import chromadb
from chromadb.utils import embedding_functions

# -------- 1️⃣ Choose embedding source ----------
USE_OPENAI = True
if USE_OPENAI:
    # the embedding function makes the OpenAI API calls itself – no separate client needed
    embed_fn = embedding_functions.OpenAIEmbeddingFunction(
        api_key=os.getenv("OPENAI_API_KEY"),
        model_name="text-embedding-3-large"
    )
else:
    embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2"   # model is downloaded on first use
    )

# -------- 2️⃣ Initialise Chroma -----------------
chroma_client = chromadb.PersistentClient(path="./rag_chroma")
collection = chroma_client.get_or_create_collection(
    name="documents",
    embedding_function=embed_fn,
)

# -------- 3️⃣ Load & ingest documents ----------
docs = [
    {"id": str(uuid.uuid4()), "text": "Python is a versatile language used for web development, data science, and automation."},
    {"id": str(uuid.uuid4()), "text": "The capital of France is Paris, known for its museums and cafés."},
    {"id": str(uuid.uuid4()), "text": "Machine learning includes supervised, unsupervised, and reinforcement learning techniques."},
    # …add as many as you need
]

ids   = [d["id"]   for d in docs]
texts = [d["text"] for d in docs]

collection.add(
    ids=ids,
    documents=texts,                       # raw texts – embedded automatically via embed_fn
    metadatas=[{"source": "my_corpus"} for _ in docs],
)

# -------- 4️⃣ Query the collection ------------
query = "Which city is the French capital?"
results = collection.query(
    query_texts=[query],
    n_results=3,
)

print("\n===== TOP 3 RESULTS =====")
for rank, (doc, dist) in enumerate(zip(results["documents"][0], results["distances"][0]), 1):
    print(f"{rank}. {doc} (distance: {dist:.4f})")

Running the script should print the Paris sentence as the top result. That is the retrieval half of a RAG loop; passing the retrieved passages to an LLM as context completes it.
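The generation half is mostly prompt assembly: the retrieved passages are stitched into a context block for the model. A minimal sketch of that step (the model call itself is omitted; build_prompt is an illustrative helper, not a Chroma or OpenAI API):

```python
def build_prompt(question, passages):
    # number the retrieved passages so the model can cite them
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

passages = [
    "The capital of France is Paris, known for its museums and cafés.",
    "Python is a versatile language used for web development.",
]
prompt = build_prompt("Which city is the French capital?", passages)
print(prompt)
```

The resulting string is what you would send as the user message in a chat‑completion call; grounding the model in retrieved text is what makes the pipeline "retrieval‑augmented".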


Production‑Ready Tips for macOS

  • Persist data outside the project folder – e.g., PersistentClient(path="/Volumes/SSD/chroma_data").
  • Take regular backups – cp -r my_chroma_db backup_$(date +%F) before major changes.
  • Inspect logs when debugging – the Docker server logs requests to stdout (docker logs chroma_server).
  • Filter on metadata – collection.query(..., where={"source": "my_corpus"}) narrows the candidate set before vector scoring.
  • Watch dimensionality – larger vectors cost more RAM and index‑build time; a few hundred dims is often enough for retrieval.
  • Outgrowing the laptop – run the Docker server on a cloud VM with durable storage mounted at /chroma/chroma.
  • Monitor memory – Docker Desktop → Settings → Resources. Allocate at least 4 GB for moderate workloads.
  • Use the HTTP server for multi‑service architectures – JavaScript front‑ends, Go back‑ends, or micro‑services can share a single vector store.
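One more practical step before ingesting real corpora: long documents are usually split into overlapping chunks before embedding so each vector stays topically focused. A dependency‑free sketch of word‑based chunking with overlap (chunk_words is an illustrative helper, not part of Chroma):

```python
def chunk_words(text, size=50, overlap=10):
    # slide a window of `size` words, stepping by size - overlap
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks

text = " ".join(f"w{i}" for i in range(120))
chunks = chunk_words(text, size=50, overlap=10)
print(len(chunks))           # → 3
print(chunks[1].split()[0])  # → w40 (second chunk overlaps the first by 10 words)
```

Each chunk then becomes one entry in collection.add(), typically with metadata recording the source document and chunk index so results can be traced back.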

Where to Learn More

  • Official docs – https://docs.trychroma.com – API reference, server guide, tutorials.
  • GitHub repo – https://github.com/chroma-core/chroma – source code, issue tracker, contribution guide.
  • LangChain integration – https://python.langchain.com/docs/modules/data_connection/vectorstores/chroma – step‑by‑step LangChain + Chroma walkthrough.
  • Discord community – active chat for troubleshooting and feature discussions.
  • Blog posts and talks – background on design decisions vs. Pinecone, Weaviate, etc.
  • YouTube introductions – short videos walking through server vs. client modes.

TL;DR Checklist

  1. Install Python (brew install python@3.11).
  2. Create & activate a virtualenv.
  3. pip install chromadb (the chroma CLI is included).
  4. Run a quick sanity test (code snippet above).
  5. Optional: Run the Docker server (docker run -p 8000:8000 -v $(pwd)/chroma_data:/chroma/chroma chromadb/chroma).
  6. Start building – generate embeddings, collection.add, then collection.query.

You now have a fully functional, locally‑hosted vector store ready for any LLM‑powered application on your Mac. Happy embedding! 🚀