Help us reach more developers and grow the Airweave community. Star this repo!
Airweave is a fully open-source tool that lets agents search any app. It connects to apps, productivity tools, databases, or document stores and transforms their contents into searchable knowledge bases, accessible through a standardized interface for agents.
The search interface is exposed via REST API or MCP. When using MCP, Airweave essentially builds a semantically searchable MCP server. The platform handles everything from auth and extraction to embedding and serving. You can find our documentation here.
Check out a quick demo of Airweave below:
Airweave.Demo.mp4
Managed Service: Airweave Cloud
Make sure Docker and Docker Compose are installed, then run:
# 1. Clone the repository
git clone https://github.com/airweave-ai/airweave.git
cd airweave
# 2. Build and run
chmod +x start.sh
./start.sh
That's it! Access the dashboard at http://localhost:8080
- Access the UI at http://localhost:8080
- Connect sources, configure syncs, and query data
- Swagger docs: http://localhost:8001/docs
- Create connections, trigger syncs, and search data
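After `./start.sh` returns, the containers may still need a moment before both endpoints answer. The following is a minimal sketch, using only the Python standard library, that polls the two local URLs listed above until they respond. The helper names `is_reachable` and `wait_until_ready` are hypothetical and not part of Airweave; the attempt count and delay are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request

# URLs taken from the quick start above.
DASHBOARD_URL = "http://localhost:8080"
API_DOCS_URL = "http://localhost:8001/docs"


def is_reachable(url, timeout=2.0):
    """Return True if an HTTP GET to `url` gets any non-5xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx status.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def wait_until_ready(urls, attempts=30, delay=2.0, probe=is_reachable):
    """Poll until every URL is reachable.

    Returns the list of URLs that were still unreachable after the last
    attempt (an empty list means everything is up). `probe` is injectable
    so the polling logic can be exercised without a running stack.
    """
    pending = list(urls)
    for attempt in range(attempts):
        pending = [url for url in pending if not probe(url)]
        if not pending:
            return []
        if attempt < attempts - 1:
            time.sleep(delay)
    return pending


# Example usage (hypothetical; run it after ./start.sh):
#   missing = wait_until_ready([DASHBOARD_URL, API_DOCS_URL])
#   print("ready" if not missing else f"not reachable yet: {missing}")
```

A non-empty return value only says that a URL did not answer in time; the container logs are the place to look for the actual cause.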
pip install airweave-sdk
from airweave import AirweaveSDK

# Initialize client
client = AirweaveSDK(
    api_key="YOUR_API_KEY",
    base_url="http://localhost:8001"
)

# Create a collection
collection = client.collections.create(name="My Collection")

# Add a source connection
source = client.source_connections.create(
    name="My Stripe Connection",
    short_name="stripe",
    readable_collection_id=collection.readable_id,
    authentication={
        "credentials": {"api_key": "your_stripe_api_key"}
    }
)
# Semantic search (default)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="Find recent failed payments"
)

# Hybrid search (semantic + keyword)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="customer invoices Q4 2024",
    search_type="hybrid"
)

# With query expansion and reranking
results = client.collections.search(
    readable_id=collection.readable_id,
    query="technical documentation",
    enable_query_expansion=True,
    enable_reranking=True,
    top_k=20
)

# Search with recency bias (prioritize recent results)
results = client.collections.search(
    readable_id=collection.readable_id,
    query="critical bugs",
    recency_bias=0.8,  # 0.0 to 1.0, higher = more recent
    limit=10
)

# Get AI-generated answer instead of raw results
answer = client.collections.search(
    readable_id=collection.readable_id,
    query="What are our customer refund policies?",
    response_type="completion",
    enable_reranking=True
)
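The search calls above differ only in their optional keyword arguments, so it can help to assemble and check those arguments in one place before calling the SDK. The sketch below is one possible way to do that. `build_search_kwargs` is a hypothetical helper, not part of `airweave-sdk`; the parameter names are copied from the examples above, and the only range it enforces is the documented 0.0 to 1.0 range for `recency_bias`. The `limit >= 1` check is an assumption.

```python
from typing import Any, Dict, Optional


def build_search_kwargs(
    readable_id: str,
    query: str,
    *,
    search_type: Optional[str] = None,
    recency_bias: Optional[float] = None,
    limit: Optional[int] = None,
    enable_reranking: Optional[bool] = None,
    response_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for client.collections.search (hypothetical helper).

    Options left as None are omitted so the SDK defaults apply.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if recency_bias is not None and not 0.0 <= recency_bias <= 1.0:
        raise ValueError("recency_bias must be between 0.0 and 1.0")
    if limit is not None and limit < 1:
        # Assumption: a non-positive limit is never meaningful.
        raise ValueError("limit must be at least 1")

    kwargs: Dict[str, Any] = {"readable_id": readable_id, "query": query}
    optional = {
        "search_type": search_type,
        "recency_bias": recency_bias,
        "limit": limit,
        "enable_reranking": enable_reranking,
        "response_type": response_type,
    }
    # Keep only the options the caller actually set.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


# Example usage with the client from the snippet above:
#   results = client.collections.search(
#       **build_search_kwargs(collection.readable_id, "critical bugs",
#                             recency_bias=0.8, limit=10)
#   )
```

Validating locally turns an out-of-range `recency_bias` into an immediate `ValueError` instead of a failed request.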
npm install @airweave/sdk
# or
yarn add @airweave/sdk
import { AirweaveSDKClient, AirweaveSDKEnvironment } from "@airweave/sdk";

// Initialize client
const client = new AirweaveSDKClient({
  apiKey: "YOUR_API_KEY",
  environment: AirweaveSDKEnvironment.Local
});

// Create a collection
const collection = await client.collections.create({
  name: "My Collection"
});

// Add a source connection
const source = await client.sourceConnections.create({
  name: "My Stripe Connection",
  shortName: "stripe",
  readableCollectionId: collection.readableId,
  authentication: {
    credentials: { apiKey: "your_stripe_api_key" }
  }
});
// Semantic search (default)
const results = await client.collections.search(
  collection.readableId,
  { query: "Find recent failed payments" }
);

// Hybrid search (semantic + keyword)
const hybridResults = await client.collections.search(
  collection.readableId,
  {
    query: "customer invoices Q4 2024",
    searchType: "hybrid"
  }
);

// With query expansion and reranking
const advancedResults = await client.collections.search(
  collection.readableId,
  {
    query: "technical documentation",
    enableQueryExpansion: true,
    enableReranking: true,
    topK: 20
  }
);

// Search with recency bias (prioritize recent results)
const recentResults = await client.collections.search(
  collection.readableId,
  {
    query: "critical bugs",
    recencyBias: 0.8, // 0.0 to 1.0, higher = more recent
    limit: 10
  }
);

// Get AI-generated answer instead of raw results
const answer = await client.collections.search(
  collection.readableId,
  {
    query: "What are our customer refund policies?",
    responseType: "completion",
    enableReranking: true
  }
);
- Data synchronization from 30+ sources with minimal config
- Entity extraction and transformation pipeline
- Multi-tenant architecture with OAuth2
- Incremental updates using content hashing
- Semantic search for agent queries
- Versioning for data changes
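To illustrate what "incremental updates using content hashing" means in general, here is a minimal sketch of the technique. It is not Airweave's implementation: the function names, the JSON canonicalization, and the choice of SHA-256 are all assumptions made for the example. Each entity is hashed, the hash is compared with the one stored from the previous sync, and only new or changed entities are selected for reprocessing.

```python
import hashlib
import json
from typing import Any, Dict, List


def content_hash(entity: Dict[str, Any]) -> str:
    """Hash an entity's content; key order does not affect the result."""
    canonical = json.dumps(entity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_sync(
    entities: Dict[str, Dict[str, Any]],
    stored_hashes: Dict[str, str],
) -> Dict[str, List[str]]:
    """Classify entity IDs by comparing fresh hashes with stored ones.

    `entities` maps entity ID to its current content; `stored_hashes` maps
    entity ID to the hash recorded during the previous sync.
    """
    plan: Dict[str, List[str]] = {"insert": [], "update": [], "skip": [], "delete": []}
    for entity_id, entity in entities.items():
        digest = content_hash(entity)
        if entity_id not in stored_hashes:
            plan["insert"].append(entity_id)   # never seen before
        elif stored_hashes[entity_id] != digest:
            plan["update"].append(entity_id)   # content changed
        else:
            plan["skip"].append(entity_id)     # unchanged, no re-embedding needed
    # Anything stored previously but absent from the source was removed upstream.
    plan["delete"] = [eid for eid in stored_hashes if eid not in entities]
    return plan


# Example usage with made-up data:
previous = {
    "inv_1": content_hash({"amount": 100}),
    "inv_2": content_hash({"amount": 250}),
    "inv_3": content_hash({"amount": 75}),
}
current = {
    "inv_1": {"amount": 100},   # unchanged
    "inv_2": {"amount": 300},   # changed
    "inv_4": {"amount": 10},    # new
}
print(plan_sync(current, previous))
```

The benefit of this pattern is that unchanged entities are skipped entirely, so expensive steps such as embedding run only on what actually changed.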
- Frontend: React/TypeScript with ShadCN
- Backend: FastAPI (Python)
- Databases: PostgreSQL (metadata), Qdrant (vectors)
- Workers: Temporal (workflow orchestration), Redis (pub/sub)
- Deployment: Docker Compose (dev), Kubernetes (prod)
We welcome contributions! Please check CONTRIBUTING.md for details.
Airweave is released under the MIT license.
- Discord - Get help and discuss features
- GitHub Issues - Report bugs or request features
- Twitter - Follow for updates