The First Agent-Native Data Platform

Ingest, validate, transform, store, and retrieve your data — whether you're an AI agent talking through MCP or a developer writing config. One platform for both.

$ datris ingest stock-price-data.csv --dest postgres \
    --ai-validate "all prices must be positive" \
    --ai-transform "uppercase all ticker symbols"
Pipeline created (stock_price_data)
Schema auto-detected (14 columns)
AI validation passed (all prices positive)
AI transform applied (12,847 rows)
Loaded to PostgreSQL
Done in 4.2s

Intelligence at every stage

Every step of your data pipeline is enhanced with AI. From ingestion to delivery, Datris makes data engineering accessible through natural language.

🔌
MCP Server
AI Agent Integration
The first open-source pipeline with native Model Context Protocol (MCP) support. Agents can register pipelines, upload files, trigger jobs, profile data, and run searches.
AI Data Quality
Plain English Validation
Validate with plain English rules via aiRule. AI evaluates every row using reasoning and domain knowledge — no regex required.
AI Transformations
Natural Language Transforms
Describe row transforms in natural language. Date conversion, categorization, entity extraction — no code needed.
📐
AI Schema Generation
Auto-Config from Any File
Upload any CSV, JSON, or XML and get a complete dataset configuration auto-generated. Skip the boilerplate entirely.
📊
AI Data Profiling
Instant Data Insights
Upload a file and get summary statistics, quality issues, and suggested validation rules — all powered by AI analysis.
🔍
AI Error Explanation
Root Cause Analysis
When jobs fail, AI analyzes the error chain and explains the root cause in plain English. No more digging through stack traces.
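To make the row-by-row validation idea concrete, here is a minimal sketch of how a plain-English rule might be applied to each row. It is not Datris's implementation: `validate_rows`, the `Judge` type, and `positive_price_judge` are hypothetical names, and the judge is a deterministic stand-in for what would be a model call in a real deployment.

```python
from typing import Callable, Iterable

Row = dict[str, str]
# A judge takes (rule, row) and returns (passed, reason). In a real
# deployment this would wrap a model call; that wiring is not shown here.
Judge = Callable[[str, Row], tuple[bool, str]]


def validate_rows(rows: Iterable[Row], rule: str, judge: Judge) -> list[tuple[int, str]]:
    """Apply one plain-English rule to every row and collect the failures."""
    failures = []
    for index, row in enumerate(rows):
        passed, reason = judge(rule, row)
        if not passed:
            failures.append((index, reason))
    return failures


def positive_price_judge(rule: str, row: Row) -> tuple[bool, str]:
    """Hypothetical stand-in for a model: it only checks the 'price' field."""
    try:
        price = float(row["price"])
    except (KeyError, ValueError):
        return False, "price is missing or not a number"
    if price > 0:
        return True, "ok"
    return False, f"price {price} is not positive"


rows = [
    {"ticker": "AAPL", "price": "189.30"},
    {"ticker": "MSFT", "price": "-4.00"},
    {"ticker": "GOOG", "price": "n/a"},
]
failures = validate_rows(rows, "all prices must be positive", positive_price_judge)
for index, reason in failures:
    print(f"row {index}: {reason}")
```

Keeping the judge as a plain callable means the same loop works whether the verdict comes from a cloud model, a local model, or a test double.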

Push and pull — one platform, two interfaces

AI agents and humans ingest data through the pipeline, store it across databases and vector stores, and retrieve it again via MCP or API.

Sources: Data Upload, MinIO, DB Pull, Kafka
Processing: 1 Preprocessor, 2 Data Quality (AI), 3 Transformation (AI)
Storage: MinIO (Parquet/ORC), PostgreSQL, MongoDB, Kafka, ActiveMQ, REST API, Qdrant, Weaviate, Milvus, Chroma, pgvector
Notification: Email · Slack · Webhook
Push: Create Pipelines, Upload Data, Trigger Processing, Configure Pipelines
MCP (stdio · SSE): Claude, Cursor, OpenClaw, any MCP agent
Pull: Query PostgreSQL, Query MongoDB, Semantic Search (Vector DB), Monitor Jobs, Profile Data, Retrieve Results

Full RAG pipeline built in

Extract, chunk, embed, and upsert documents into any major vector database. Build retrieval-augmented generation workflows without leaving your pipeline.

✂️
Chunking Strategies
Choose the right chunking strategy for your use case:
Fixed-size · Sentence · Paragraph · Recursive
🧠
Embedding Providers
Generate embeddings with cloud or local models:
OpenAI · Ollama (local)
📄
Document Extraction
Extract text from virtually any document format:
PDF · Word · PowerPoint · Excel · HTML · Email · EPUB · Plain Text
RAG Pipeline Flow
1 Document Extraction
2 Chunking
3 Embeddings
4 Vector Upsert
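The four chunking strategies and the chunk, embed, upsert stages can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Datris code: every function name is hypothetical, extraction is assumed to have already produced plain text, `toy_embed` is a hash-based placeholder rather than a real embedding model, and the "vector database" is an in-memory dict.

```python
import hashlib
import re


def chunk_fixed(text: str, size: int) -> list[str]:
    """Fixed-size: cut every `size` characters, ignoring text boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_sentences(text: str) -> list[str]:
    """Sentence: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_paragraphs(text: str) -> list[str]:
    """Paragraph: split on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def chunk_recursive(text: str, size: int) -> list[str]:
    """Recursive: try paragraphs, then sentences, then fixed-size cuts."""
    if len(text) <= size:
        return [text]
    for splitter in (chunk_paragraphs, chunk_sentences):
        parts = splitter(text)
        if len(parts) > 1:
            return [c for part in parts for c in chunk_recursive(part, size)]
    return chunk_fixed(text, size)


def toy_embed(chunk: str, dims: int = 8) -> list[float]:
    """Placeholder vector derived from a hash; NOT semantically meaningful."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


def upsert(store: dict[str, dict], doc_id: str, chunks: list[str]) -> None:
    """Upsert: a stable ID per chunk means a re-run overwrites, not duplicates."""
    for position, chunk in enumerate(chunks):
        store[f"{doc_id}:{position}"] = {"text": chunk, "vector": toy_embed(chunk)}


document = "Prices rose on Monday. Volume was light.\n\nTuesday reversed the move."
store: dict[str, dict] = {}
chunks = chunk_recursive(document, size=30)
upsert(store, "doc-1", chunks)
upsert(store, "doc-1", chunks)  # second run overwrites the same keys
```

Production recursive chunkers usually also merge small neighbouring pieces back up toward the size limit and add overlap between chunks; both are left out here to keep the sketch short.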

Your AI agents are first-class pipeline operators

Datris ships with a native MCP server. Claude, Cursor, OpenClaw, and any MCP-compatible AI agent can register pipelines, trigger jobs, search your data, and monitor pipelines — all through natural conversation.

Transports: stdio · SSE (Server-Sent Events)
Compatible agents:
Claude · Cursor · OpenClaw · Any MCP-compatible agent
MCP Capabilities
  • Register pipelines and configure schemas
  • Upload data for processing
  • Trigger and monitor pipeline jobs
  • Profile data and get AI insights
  • Semantic search across vector databases
  • Query PostgreSQL and MongoDB directly
Example prompt
"Ingest sales_q4.csv into the analytics database and validate that 'revenue must be positive and date must be in 2024'."

Speaks every data language

Ingest structured data, unstructured documents, and archives. Output to vector stores, structured stores, or optimized columnar formats.

Format Input Output
CSV
JSON
XML
Excel (.xlsx)
Parquet
ORC
PDF
Word (.docx)
PowerPoint (.pptx)
HTML
Email (.eml)
EPUB
Archives (.zip, .tar)
Plain Text

Your choice of AI model

Use cloud AI from Anthropic or OpenAI, or keep everything local with Ollama. No vendor lock-in — switch providers without changing your pipeline config.


How Datris compares

The only platform combining MCP-native agent access, AI-powered data quality and transformation, multi-destination pipelines, and RAG — in a single open-source package.

Capability Datris Airbyte Fivetran dbt Prefect Dagster NiFi Meltano
MCP Server (native) 30+ tools
AI Data Quality
AI Transformation
Data Ingestion
Orchestration Config-driven ~ Limited ~ Limited
Vector DB / RAG 5 DBs
Open Source AGPL-3.0 Core Core
No-Code JSON config ~ UI UI SQL Python Python Visual CLI/YAML
Self-Hosted

Self-hosted on open source

Runs anywhere Docker does. Built on proven open-source infrastructure — no proprietary services, no vendor lock-in, no surprise bills.

$ git clone https://github.com/datris/datris-platform-oss.git
$ cp .env.example .env
# Add your API key (at least one required for AI features)
$ docker compose up -d
$ curl http://localhost:8080/api/v1/version

Clone. Configure. Launch. Your full pipeline in under a minute.
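Containers can take a few seconds to start accepting requests after `docker compose up -d` returns, so a script may want to poll the version endpoint rather than call it once. Here is a small sketch using only the Python standard library. The URL is the one from the curl step above; the function names, retry counts, and the assumption that a ready server answers with HTTP 200 are illustrative.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

VERSION_URL = "http://localhost:8080/api/v1/version"  # from the curl step above


def http_status(url: str) -> int:
    """Return the HTTP status for a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it answers 200 (assumed to mean ready) or attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no pause after the final failed attempt
    return False
```

Calling `wait_until_ready(VERSION_URL)` after the compose step returns `True` once the endpoint responds and `False` if it never does. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.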


Connect your agent in 60 seconds

Start a free 14-day trial — no credit card required. Get a dedicated MCP endpoint, REST API, and full platform UI instantly.
