Team kept asking me the same questions about our codebase. "How does the auth system work?", "Where's the API documentation?", "What's the config format?". I'd explain it, they'd forget, repeat next week.
Built a chatbot grounded in our actual code and docs using Embedchain. Took about 4 hours total. Now they ask the bot instead of me, and it actually gives correct answers with code examples.
What Embedchain actually does
Embedchain is an open-source framework for building AI chatbots with your own data. It handles all the RAG (Retrieval Augmented Generation) stuff - chunking documents, creating embeddings, storing them in a vector database, and retrieving relevant chunks when someone asks a question.
What I like about it: simpler than LangChain, good defaults, supports lots of data sources out of the box (PDFs, websites, code repos, Notion, etc.). You can get something working in like 20 lines of code.
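If the RAG pipeline sounds abstract, here's a toy version of what Embedchain automates. Bag-of-words counts stand in for a real embedding model, so treat it as an illustration, not an implementation:

```python
from collections import Counter
import math

def embed(text):
    """Stand-in 'embedding': a bag-of-words vector (real systems use a model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Chunk documents (here: one chunk per doc for simplicity)
chunks = [
    "Auth uses JWT tokens signed with the SECRET_KEY setting.",
    "The config format is YAML, loaded from config/app.yaml at startup.",
]

# 2. Embed and store them in the "vector database" (a list)
index = [(chunk, embed(chunk)) for chunk in chunks]

# 3. Retrieve the most similar chunk for a question
question = "How does the auth system work?"
best_chunk, _ = max(index, key=lambda item: cosine(embed(question), item[1]))
print(best_chunk)  # the auth chunk wins on word overlap

# 4. In real RAG, best_chunk gets pasted into the LLM prompt as context
```

Swap the Counter for a neural embedding model and the list for a vector database, and that's roughly the machinery Embedchain wires up for you.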
Basic setup (that actually works)
Install Embedchain and dependencies:

```shell
pip install embedchain openai
```

Set your OpenAI API key:

```shell
export OPENAI_API_KEY="sk-..."
```
Create your first bot:
```python
from embedchain import App  # older releases exposed this as OpenSourceApp

# Create app with default settings (OpenAI embeddings + OpenAI LLM)
app = App()

# Add your data (the data type is auto-detected for most sources)
app.add("./docs", data_type="directory")  # folder with documents
app.add("./code", data_type="directory")  # your codebase

# Ask questions
while True:
    question = input("You: ")
    answer = app.query(question)
    print(f"Bot: {answer}")
```
This uses OpenAI's embeddings by default. Costs money but works well. Free alternatives exist (more on that later).
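Worth a quick back-of-envelope before you index everything. Assumptions here: roughly 4 characters per token, and the ada-002-era price of $0.0001 per 1K tokens (pricing changes, so check the current rate):

```python
# Back-of-envelope embedding cost. Assumptions (verify against current pricing):
# ~4 characters per token, $0.0001 per 1,000 tokens.
def embedding_cost_usd(total_chars, usd_per_1k_tokens=0.0001, chars_per_token=4):
    tokens = total_chars / chars_per_token
    return tokens / 1000 * usd_per_1k_tokens

# 100 MB of docs is roughly 100 million characters
print(f"${embedding_cost_usd(100_000_000):.2f}")  # → $2.50
```

The one-time indexing cost is usually tiny; it's re-embedding on every docs update that adds up.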
Adding different data sources
Embedchain supports tons of data types:
```python
from embedchain import App

app = App()

# Local files
app.add("./my_document.pdf")
app.add("./website.html")

# Web scraping
app.add("https://example.com/docs")
app.add("https://github.com/username/repo")  # scraped as a web page; full repo ingestion needs the github loader

# Notion, Confluence, and databases (Postgres etc.) need an explicit
# data_type, and usually a loader config -- check the docs per source
app.add("notion_page_url", data_type="notion")
```
The web scraping is actually decent. Handles Markdown, code blocks, tables better than I expected.
Configuration that matters
Default settings are okay, but these help:
```python
from embedchain import App

# Same structure as the YAML config described in the docs
app = App.from_config(config={
    "app": {
        "config": {
            "log_level": "INFO",  # see what's happening
        },
    },
    "chunker": {
        "chunk_size": 500,    # smaller chunks = more precise retrieval
        "chunk_overlap": 50,  # overlap helps maintain context
        "length_function": "len",
    },
})
```
Chunk size depends on your use case. Code documentation works well with 300-500. Long-form docs can go 1000+.
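To see what those two knobs actually do, here's a minimal character-based sliding window (Embedchain's real chunkers split on text structure, but the size/overlap mechanics are the same idea):

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size chars, each overlapping the last."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "abcdefghij" * 5  # 50 characters
chunks = sliding_chunks(doc, chunk_size=20, chunk_overlap=5)
print(len(chunks), len(chunks[0]))  # → 4 20
```

The overlap is why a sentence that straddles a chunk boundary still shows up intact in at least one chunk.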
Real GitHub issues and how I fixed them
Issue #1: "Vector database connection timeout"
Problem: Bot kept timing out when trying to connect to ChromaDB. Error: "chromadb.api.exceptions.ConnectionError: Failed to connect to server"
What I tried: Restarted ChromaDB, checked ports, reinstalled everything. Nothing worked.
Actual fix: Embedchain was trying to use a remote ChromaDB instance when I wanted a local one. Fixed by explicitly setting the config:
```python
from embedchain import App

app = App.from_config(config={
    "vectordb": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_docs",
            "dir": "./chroma_db",  # explicit local directory
            "allow_reset": True,
        },
    },
})
```
Source: GitHub issue #847 - multiple people hit this, solution was buried in comments
Issue #2: "Large files cause memory errors"
Problem: Small PDFs processed fine, but 50MB+ files crashed the process. Error: "Killed" with no useful message.
What I tried: Increased chunk size, decreased chunk size. Nothing helped.
Actual fix: The PDF loader was loading the entire file into memory. Split large files before processing:
```python
import PyPDF2
from pathlib import Path

def split_large_pdf(pdf_path, max_size_mb=10):
    """Split a large PDF into smaller PDFs; returns the chunk paths."""
    pdf_path = Path(pdf_path)
    reader = PyPDF2.PdfReader(pdf_path)
    total_pages = len(reader.pages)

    # Roughly how many chunks we need, then pages per chunk (at least 1)
    num_chunks = pdf_path.stat().st_size // (max_size_mb * 1024 * 1024) + 1
    pages_per_chunk = max(1, total_pages // num_chunks)

    chunk_paths = []
    current_pages = []
    chunk_number = 0
    for i, page in enumerate(reader.pages):
        current_pages.append(page)
        if len(current_pages) >= pages_per_chunk or i == total_pages - 1:
            # Save this chunk as its own PDF
            writer = PyPDF2.PdfWriter()
            for p in current_pages:
                writer.add_page(p)
            chunk_path = f"{pdf_path.stem}_chunk{chunk_number}.pdf"
            with open(chunk_path, "wb") as f:
                writer.write(f)
            chunk_paths.append(chunk_path)
            current_pages = []
            chunk_number += 1
    return chunk_paths

# Then add the chunks one by one
for chunk in split_large_pdf("large_document.pdf"):
    try:
        app.add(chunk)
    except Exception as e:
        print(f"Failed to add {chunk}: {e}")
```
Source: GitHub issue #923 - no official fix, this workaround helped me
Issue #3: "Bot gives wrong answers confidently"
Problem: Bot would answer questions with completely wrong info, but sound confident about it. Hallucinating basically.
What I tried: Added more documents, re-embedded everything. Made it worse.
Actual fix: Two things helped:
```python
# 1. Add a system prompt that encourages honesty
system_prompt = """You are a helpful assistant for answering questions
about our codebase. If you don't know something based on the context,
say 'I don't have enough information to answer that' instead of making things up."""

# 2. Use hybrid search (better than pure similarity)
from embedchain.config import RetrievalConfig

retrieval_config = RetrievalConfig(
    search_type="hybrid",   # combines keyword + semantic search
    k=3,                    # only use top 3 results
    score_threshold=0.3,    # minimum similarity score
)

app = OpenSourceApp(
    system_prompt=system_prompt,
    retrieval_config=retrieval_config,
)
```
Source: GitHub issue #1012 - hybrid search was the real game changer
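The score threshold is doing a lot of the work in that fix. The logic, sketched without any Embedchain machinery: if nothing retrieved is similar enough to the question, refuse instead of handing the LLM weak context to improvise on.

```python
def answer_or_refuse(hits, threshold=0.3):
    """hits: list of (chunk, similarity_score). Refuse when retrieval is weak."""
    relevant = [(chunk, score) for chunk, score in hits if score >= threshold]
    if not relevant:
        return "I don't have enough information to answer that."
    # In the real bot, the relevant chunks become the LLM's context
    return relevant[0][0]

print(answer_or_refuse([("Auth uses JWT tokens.", 0.82)]))     # answers
print(answer_or_refuse([("Unrelated release notes.", 0.11)]))  # refuses
```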
Issue #4: "Embeddings cost too much"
Problem: OpenAI embeddings were costing $50+ per month for our docs.
What I tried: Caching embeddings locally. Helped but still needed to re-embed updates.
Actual fix: Switched to local embeddings:
```python
from embedchain import App

# Local embeddings via GPT4All (free, slower)
app = App.from_config(config={
    "embedder": {
        "provider": "gpt4all",
        "config": {"model": "all-MiniLM-L6-v2.gguf2.f16.gguf"},
    },
})

# Or sentence-transformers from Hugging Face (also free; the model
# downloads once, then runs locally rather than in the cloud)
app = App.from_config(config={
    "embedder": {
        "provider": "huggingface",
        "config": {"model": "sentence-transformers/all-MiniLM-L6-v2"},
    },
})
```
Source: GitHub issue #756 - HuggingFace option was added recently, works well
Building a proper app structure
Here's how I structure real projects:
```python
# bot.py
import os
from pathlib import Path

from embedchain import App

class DocsBot:
    def __init__(self, docs_path="./docs"):
        self.app = App.from_config(config={
            "app": {
                "config": {
                    "log_level": "INFO",
                    "collect_metrics": False,  # don't phone home
                },
            },
            "llm": {
                "config": {
                    "number_documents": 3,  # only the top 3 chunks as context
                },
            },
        })
        # Load docs if the vector DB doesn't exist yet
        if not os.path.exists("./chroma_db"):
            print("Indexing documents...")
            self.add_documents(docs_path)

    def add_documents(self, path):
        """Add markdown documents from a path"""
        for file in Path(path).rglob("*.md"):
            try:
                self.app.add(str(file))
                print(f"Added {file}")
            except Exception as e:
                print(f"Failed to add {file}: {e}")

    def query(self, question):
        """Ask the bot a question"""
        return self.app.query(question)

    def chat(self):
        """Interactive chat interface"""
        print("Docs Bot ready! Ask questions about your code.")
        print("Type 'quit' to exit")
        while True:
            question = input("\nYou: ")
            if question.lower() == "quit":
                break
            print(f"\nBot: {self.query(question)}")

if __name__ == "__main__":
    bot = DocsBot("./docs")
    bot.chat()
```
Adding a web interface
Simple Flask app:
```python
# web.py
from flask import Flask, render_template, request, jsonify
from bot import DocsBot

app = Flask(__name__)
bot = DocsBot()

@app.route("/")
def home():
    return render_template("chat.html")

@app.route("/query", methods=["POST"])
def query():
    data = request.get_json(silent=True) or {}
    question = data.get("question")
    if not question:
        return jsonify({"error": "No question provided"}), 400
    try:
        answer = bot.query(question)
        return jsonify({"answer": answer})
    except Exception as e:
        return jsonify({"error": str(e)}), 500

if __name__ == "__main__":
    app.run(debug=True, port=5000)
```
Simple HTML template, a bare-bones chat.html that posts to /query:

```html
<!-- templates/chat.html -->
<!DOCTYPE html>
<title>Docs Bot</title>
<h1>📚 Ask about our codebase</h1>
<input id="q" placeholder="Ask a question...">
<button onclick="fetch('/query', {method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({question: q.value})})
  .then(r => r.json()).then(d => a.textContent = d.answer || d.error)">Ask</button>
<pre id="a"></pre>
```
Common gotchas
- Vector DB corruption: Sometimes ChromaDB gets corrupted. Just delete ./chroma_db and re-index.
- Slow first query: First query loads everything into memory. Subsequent ones are fast.
- Rate limiting: OpenAI API has rate limits. If you hit them, switch embeddings or add delays.
- Memory usage: Large document sets need lots of RAM. 4GB docs = ~8GB RAM minimum.
- Updating docs: No easy way to update individual documents. Re-index is safest.
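For the corruption and update gotchas, I'd keep a tiny reset helper around. The ./chroma_db path matches the local Chroma directory used earlier; adjust if you configured a different one:

```python
import shutil
from pathlib import Path

def reset_index(db_dir="./chroma_db"):
    """Delete the local vector DB so the next run re-indexes from scratch."""
    path = Path(db_dir)
    if path.exists():
        shutil.rmtree(path)
        return True   # wiped; re-run your indexing step
    return False      # nothing to do
```

Pair it with the DocsBot check above ("re-index if ./chroma_db is missing") and a full rebuild is one function call away.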
What works well
- Documentation Q&A - actually works as advertised
- Codebase assistants - understands code structure pretty well
- Customer support bots - trains on your knowledge base
- Research helpers - ingests papers and answers questions
- Simple setup compared to LangChain - less boilerplate
What doesn't work well
- Real-time data - needs manual re-indexing
- Very large datasets (>10GB) - gets slow and expensive
- Multi-language support - English works best
- Complex reasoning - still limited by the underlying LLM
- Fine-grained updates - can't update individual docs easily
Bottom line
Embedchain isn't perfect. The GitHub issues show plenty of people struggling with edge cases. But for basic RAG applications - documentation bots, code assistants, knowledge base Q&A - it works surprisingly well.
Setup is way simpler than rolling your own LangChain pipeline. Default embeddings work fine for most use cases. Web scraping and document loading actually work.
Just expect to hit some weird issues. Check GitHub issues when you do - chances are someone else already fixed it.
Links: github.com/embedchain/embedchain | Docs: docs.embedchain.ai