Getting Started
Get Seamless-RAG running and execute your first query in under 5 minutes.
Prerequisites
- Python 3.12+
- Docker (for MariaDB)
- conda (recommended) or virtualenv
Installation
git clone https://github.com/SunflowersLwtech/seamless-rag.git
cd seamless-rag
# Create environment
conda create -n seamless-rag python=3.12 -y
conda activate seamless-rag
# Install with all extras
pip install -e ".[dev,mariadb,embeddings]"
Start MariaDB
docker compose up -d
This starts MariaDB 11.8 with VECTOR support. Default credentials are in docker-compose.yml.
Initialize and Ingest
# Create the schema (documents + chunks tables with VECTOR columns)
seamless-rag init
# Ingest text files — split at paragraph boundaries
seamless-rag ingest ./data/articles/
Your First Query
seamless-rag ask "What are the key findings?"
The output includes the LLM answer, TOON-formatted context, and a token comparison showing savings vs JSON.
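The savings figure can be read as a percentage reduction relative to the JSON token count. A minimal sketch, assuming both counts are already known; `savings_pct` is a hypothetical helper for illustration and is not taken from the Seamless-RAG source, which may count tokens differently:

```python
def savings_pct(json_tokens: int, toon_tokens: int) -> float:
    """Percentage of tokens saved by the TOON encoding relative to JSON."""
    if json_tokens <= 0:
        raise ValueError("json_tokens must be positive")
    return 100.0 * (json_tokens - toon_tokens) / json_tokens


# Illustrative numbers only, not measured results.
print(f"Saved {savings_pct(200, 120):.1f}% tokens")
```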
Embed Existing Tables
If your data is already in MariaDB, skip ingest and embed directly:
# Single column (default)
seamless-rag embed --table products --column description
# Multi-column — concatenated for richer vector search
seamless-rag embed --table products --columns "name,category,price,rating"
# Internally: "Widget — Tools — 29.99 — 4.5"
# Now semantic + SQL filter queries work
seamless-rag ask "cheap high-rated tools" --where "price < 50"
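The multi-column concatenation shown in the comment above can be sketched in a few lines of Python. This is an illustration only: `concat_columns` and `SEPARATOR` are hypothetical names, the separator is inferred from the example string, and skipping NULL values is an assumption rather than documented behavior.

```python
# Separator inferred from the example string above (assumption).
SEPARATOR = " — "


def concat_columns(row: dict, columns: list[str]) -> str:
    """Join the selected column values of one row into a single text to embed."""
    # Assumption: NULL (None) values are skipped rather than rendered as "None".
    return SEPARATOR.join(str(row[col]) for col in columns if row.get(col) is not None)


row = {"name": "Widget", "category": "Tools", "price": 29.99, "rating": 4.5}
text = concat_columns(row, ["name", "category", "price", "rating"])
print(text)
```

Including numeric columns such as price in the embedded text lets a query like "cheap high-rated tools" match on those values, while the `--where` filter still enforces exact constraints in SQL.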
Python API
from seamless_rag import SeamlessRAG

with SeamlessRAG(host="localhost", database="seamless_rag") as rag:
    rag.init()
    # Single-column embed
    rag.embed_table("articles", text_column="content")
    # Multi-column embed — richer semantics
    rag.embed_table("products", text_column=["name", "category", "price"])
    # Semantic search
    result = rag.ask("What are the main topics?")
    print(result.answer)
    print(f"Saved {result.savings_pct:.1f}% tokens")
    # Hybrid: semantic + SQL filter
    result = rag.ask("affordable tools", where="price < 50", mmr=True)
    # Export any SQL query as TOON
    toon = rag.export("SELECT name, price FROM products ORDER BY price LIMIT 10")
    print(toon)
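The `mmr=True` flag is not explained here. Assuming it refers to maximal marginal relevance (a common re-ranking step that trades relevance against redundancy), the idea can be sketched as follows. `mmr_select`, `cosine`, and the `lambda_` weight are illustrative names, not part of the Seamless-RAG API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_=0.7):
    """Greedily pick k document indices, balancing relevance and diversity."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalize similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
print(mmr_select([1.0, 0.2], docs, k=2, lambda_=0.5))
```

With `lambda_=1.0` the selection reduces to plain relevance ranking; lower values favor results that differ from those already chosen.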
SQL Export
Convert any SELECT query result to TOON format:
seamless-rag export "SELECT id, title, avg_rating FROM movies ORDER BY avg_rating DESC LIMIT 10"
Output:
[10,]{id,title,avg_rating}:
1,"Shawshank Redemption, The (1994)",4.43
2,"Godfather, The (1972)",4.29
3,Fight Club (1999),4.27
...
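The tabular layout above (a header with the row count and field names, followed by one comma-separated line per row) can be reproduced with a short sketch. This is a simplified illustration that covers only the quoting visible in the sample output; `to_tabular` and `format_value` are hypothetical helpers, not the library's encoder, and they do not implement the full TOON format.

```python
def format_value(value) -> str:
    """Render one cell, quoting it only if it contains a comma or a double quote."""
    text = str(value)
    if "," in text or '"' in text:
        return '"' + text.replace('"', '\\"') + '"'
    return text


def to_tabular(rows: list[dict]) -> str:
    """Encode uniform rows as a header line plus one delimited line per row."""
    if not rows:
        raise ValueError("at least one row is required")
    fields = list(rows[0])
    header = "[%d,]{%s}:" % (len(rows), ",".join(fields))
    lines = [",".join(format_value(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)


rows = [
    {"id": 1, "title": "Shawshank Redemption, The (1994)", "avg_rating": 4.43},
    {"id": 3, "title": "Fight Club (1999)", "avg_rating": 4.27},
]
print(to_tabular(rows))
```

Field names appear once in the header instead of once per row, which is where the token savings relative to JSON come from.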
Web UI
Launch a Gradio web interface with 6 tabs:
seamless-rag web # localhost only
seamless-rag web --share # public link (requires auth env vars)
Tabs: Ask, Benchmark, JSON → TOON, SQL Export, Data, Status.
Watch Mode
Monitor a table for new rows and auto-embed them:
seamless-rag watch --table articles --column content --interval 2
Features: high-water mark tracking, exponential backoff retry, error isolation, Rich live display.
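How these features fit together can be sketched as a polling loop. This is a rough illustration under stated assumptions, not the actual implementation: `watch`, `fetch_new_rows`, and `embed_row` are hypothetical names, rows are assumed to have a monotonically increasing `id`, and the Rich display is left out.

```python
import time


def watch(fetch_new_rows, embed_row, interval=2.0, max_polls=3,
          max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Poll for rows above the high-water mark and embed each one with retries."""
    high_water_mark = 0   # highest row id handled so far
    failed_ids = []       # rows that still failed after all retries
    for _ in range(max_polls):
        for row in fetch_new_rows(high_water_mark):
            for attempt in range(max_retries):
                try:
                    embed_row(row)
                    break
                except Exception:
                    if attempt == max_retries - 1:
                        # Error isolation: record the failure and move on.
                        failed_ids.append(row["id"])
                    else:
                        # Exponential backoff: 0.5s, 1s, 2s, ...
                        sleep(base_delay * 2 ** attempt)
            # Assumption: the mark advances even past a failed row.
            high_water_mark = max(high_water_mark, row["id"])
        sleep(interval)
    return high_water_mark, failed_ids


# In-memory stand-ins for the database table and the embedding call.
table = [{"id": 1, "content": "a"}, {"id": 2, "content": "bad"}, {"id": 3, "content": "c"}]
embedded = []


def fetch_new_rows(after_id):
    return [r for r in table if r["id"] > after_id]


def embed_row(row):
    if row["content"] == "bad":
        raise ValueError("embedding failed")
    embedded.append(row["id"])


mark, failed = watch(fetch_new_rows, embed_row, max_polls=2, sleep=lambda s: None)
print(mark, failed, embedded)
```

Tracking only the highest processed id keeps each poll cheap, since the query touches just the rows added since the last pass.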
Configuration
Configure via environment variables or a .env file:
# .env
MARIADB_HOST=127.0.0.1
MARIADB_PASSWORD=seamless
EMBEDDING_PROVIDER=sentence-transformers # or gemini, openai, ollama
LLM_PROVIDER=ollama # or gemini, openai
See Providers for provider-specific setup.
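For code that needs the same settings, the variables can be read from the environment. A minimal sketch: the `Settings` class and `load_settings` function are hypothetical, and the fallback values simply mirror the sample `.env` above rather than documented defaults. A `.env` file would first have to be loaded into the environment by whatever loader the project uses.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    mariadb_host: str
    mariadb_password: str
    embedding_provider: str
    llm_provider: str


def load_settings(env=os.environ) -> Settings:
    """Read the four variables from a mapping, falling back to the sample values."""
    return Settings(
        mariadb_host=env.get("MARIADB_HOST", "127.0.0.1"),
        mariadb_password=env.get("MARIADB_PASSWORD", ""),
        embedding_provider=env.get("EMBEDDING_PROVIDER", "sentence-transformers"),
        llm_provider=env.get("LLM_PROVIDER", "ollama"),
    )


settings = load_settings({"MARIADB_HOST": "db.internal", "LLM_PROVIDER": "openai"})
print(settings.mariadb_host, settings.embedding_provider, settings.llm_provider)
```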
Running Tests
make test-all # lint + unit + spec (no Docker needed)
make test-full # all suites including integration
make score # quality dashboard