Why Enrich.sh
Why another data pipeline? Because the existing ones are built for a world that no longer exists.
Typical Modern Data Pipeline
┌──────────────────┐
│ Event Producers │
│──────────────────│
│ Frontend │
│ Backend APIs │
│ Mobile Apps │
│ AI Inference │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Ingestion Layer │
│──────────────────│
│ Kafka │
│ HTTP Collectors │
│ Custom Loaders │
│ Segment │
│ Logstash │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Stream Processing│
│──────────────────│
│ Kafka Streams │
│ Flink │
│ Spark Streaming │
│ Custom Python │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Validation / │
│ Schema Control │
│──────────────────│
│ JSON Schema │
│ Data Contracts │
│ Custom Checks │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Storage Layer │
│──────────────────│
│ S3 / R2 │
│ Parquet │
│ Delta Lake │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Warehouse / OLAP │
│──────────────────│
│ Snowflake │
│ BigQuery │
│ ClickHouse │
└─────────┬────────┘
│
▼
┌──────────────────┐
│ Analytics / ML │
│──────────────────│
│ dbt │
│ BI Tools │
│ Training Jobs │
└──────────────────┘
6–12 months of engineering. 3+ full-time engineers. $10K+/month in infra.
And every time a source system changes a field, your pipeline breaks at 2 AM.
With Enrich.sh
We collapse the ingestion, processing, validation, and storage layers into one:
┌──────────────────┐
│ Event Producers │
└─────────┬────────┘
│
▼
┌─────────────────────────────────────┐
│ enrich.sh │
│─────────────────────────────────────│
│ ✓ HTTP ingestion │
│ ✓ Schema definition │
│ ✓ Validation (flex/evolve/strict) │
│ ✓ Dead letter queue │
│ ✓ Stream mapping │
│ ✓ Enrichment (UA, Geo, IP) │
│ ✓ Partitioned Parquet to R2 │
└─────────┬───────────────────────────┘
│
▼
┌──────────────────┐
│ Warehouse / OLAP │
└──────────────────┘
No Kafka. No Flink. No Airflow. No connectors. No sync jobs.
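The three validation modes are set per stream at creation time. A minimal sketch, mirroring the Get Started call below; the strict-mode behavior described in the comment (rejected events routed to the Dead Letter Queue rather than dropped) is our reading of the feature list, not a documented guarantee:
# Create a strict-mode stream. Assumption: events that fail validation
# land in the Dead Letter Queue instead of being silently dropped.
curl -X POST https://enrich.sh/streams \
-H "Authorization: Bearer sk_live_your_key" \
-d '{ "stream_id": "impressions", "schema_mode": "strict" }'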
Proof Points
| Metric | Value |
|---|---|
| Ingest latency | <50ms (p99, 300+ edge locations) |
| Throughput | 5,000 events/sec per stream. Ask us for more. |
| Storage format | Parquet (Snappy compression, ~10x smaller than JSON) |
| Cold start | <50ms (no JVM warm-up) |
| Warehouse support | ClickHouse, BigQuery, DuckDB, Snowflake, Spark |
| Protocol | HTTPS POST — works from anywhere |
Who Uses This
Adtech & Data Companies
Track ad impressions, conversions, and attribution events across millions of daily requests. Schema evolve mode detects when ad networks change their callback formats.
curl -X POST https://enrich.sh/ingest \
-H "Authorization: Bearer sk_live_your_key" \
-d '{
"stream_id": "impressions",
"data": [{
"ad_id": "ad_9x7k",
"campaign": "retarget_q1",
"placement": "feed_top",
"bid_price": 0.42,
"ts": 1738776000
}]
}'
AI & ML Teams
Log inference results, model inputs/outputs, and training metrics. Replay historical data for model retraining.
curl -X POST https://enrich.sh/ingest \
-H "Authorization: Bearer sk_live_your_key" \
-d '{
"stream_id": "inferences",
"data": [{
"model_id": "gpt-4o-mini",
"prompt_tokens": 1250,
"completion_tokens": 340,
"latency_ms": 892,
"user_id": "user_abc",
"ts": 1738776000
}]
}'
IoT & Sensor Data
Ingest telemetry from thousands of devices. Evolve mode auto-adapts when new device types send different fields.
curl -X POST https://enrich.sh/ingest \
-H "Authorization: Bearer sk_live_your_key" \
-d '{
"stream_id": "sensors",
"data": [{
"device_id": "temp_sensor_042",
"reading": 23.7,
"unit": "celsius",
"battery": 0.89,
"location": {"lat": 52.52, "lng": 13.405},
"ts": 1738776000
}]
}'
Product Analytics & Logs
Track user behavior, feature usage, and application logs without Segment's pricing.
curl -X POST https://enrich.sh/ingest \
-H "Authorization: Bearer sk_live_your_key" \
-d '{
"stream_id": "product_events",
"data": [{
"event": "feature_activated",
"feature": "dark_mode",
"user_id": "user_789",
"plan": "pro",
"ts": 1738776000
}]
}'
vs. The Alternatives
| | Enrich.sh | Segment | RudderStack | DIY (Kafka + Flink) |
|---|---|---|---|---|
| Setup time | 5 minutes | 1 hour | 1 day | 3–6 months |
| Monthly cost (10M events) | $49 | $1,200+ | $500+ | $2,000+ infra |
| Infrastructure | Zero (serverless) | Managed | Self-host or cloud | Self-managed |
| Data format | Parquet (open) | Proprietary | JSON/Parquet | Your choice |
| Warehouse access | Direct S3 read | Sync connectors | Sync connectors | Custom ETL |
| Schema enforcement | Flex / Evolve / Strict | Basic | Basic | Manual |
| Dead Letter Queue | Built-in | ❌ | ❌ | Build it yourself |
| Vendor lock-in | None — files are Parquet on S3 | High | Medium | Low |
Pain Trigger → Feature Map
| "We're dealing with..." | Enrich.sh solves it with |
|---|---|
| Running Kafka just for event logging | Direct HTTP ingest → Parquet. No brokers. |
| Pipeline breaks when sources change fields | evolve mode detects schema drift automatically |
| Paying $1K+/mo for Segment | Same functionality, 10x cheaper |
| Building custom S3 writers + Flink jobs | Built-in buffering, batching, and Parquet compression |
| No visibility into failed events | Dead Letter Queue — nothing is lost |
| Can't replay historical data | Stream Replay API — re-send any time range (see the sketch after this table) |
| Connecting warehouse to data | Dashboard → Connect — copy-paste SQL for any warehouse |
| GA4 sampling ruining analytics | Raw event data, no sampling, you own the data |
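The Replay API referenced above might be called like this; the /replay path and the "from"/"to" parameter names are assumptions for illustration, not documented values:
# Hypothetical sketch: re-send one week of a stream's history
curl -X POST https://enrich.sh/replay \
-H "Authorization: Bearer sk_live_your_key" \
-d '{
"stream_id": "events",
"from": "2026-01-01T00:00:00Z",
"to": "2026-01-08T00:00:00Z"
}'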
How It Works
1. Send Events
POST JSON to /ingest. From any language, any platform, any edge.
2. We Enrich & Store
Events are buffered, enriched with geo/device/session data, compressed as Parquet, and flushed to your dedicated R2 bucket.
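For a sense of what enrichment adds, here is a sketch of an event as it might land in Parquet. The geo_* and ua_* fields stand in for the geo and device enrichment named above; their exact names are assumptions for illustration, not the documented output schema:
{
"event": "signup",
"plan": "pro",
"ts": 1738776000,
"geo_country": "DE",
"geo_city": "Berlin",
"ua_browser": "Chrome",
"ua_os": "macOS"
}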
3. Query From Your Warehouse
Connect ClickHouse, BigQuery, DuckDB, or Snowflake directly to your bucket. No sync jobs. No connectors.
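As one example, ClickHouse can read the bucket in place with its s3 table function. A sketch, assuming the bucket name from Get Started; the R2 endpoint URL and access keys are placeholders to substitute with your own:
# Placeholders: your R2 account endpoint and access keys
clickhouse-client --query "
SELECT event, count() AS events
FROM s3('https://<account-id>.r2.cloudflarestorage.com/enrich-you/events/**/*.parquet',
'<ACCESS_KEY>', '<SECRET_KEY>', 'Parquet')
GROUP BY event
ORDER BY events DESC"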
Get Started
# 1. Create a stream
curl -X POST https://enrich.sh/streams \
-H "Authorization: Bearer sk_live_your_key" \
-d '{ "stream_id": "events", "schema_mode": "evolve" }'
# 2. Send events
curl -X POST https://enrich.sh/ingest \
-H "Authorization: Bearer sk_live_your_key" \
-d '{ "stream_id": "events", "data": [{ "event": "signup", "plan": "pro" }] }'
# 3. Query with DuckDB
duckdb -c "SELECT * FROM read_parquet('s3://enrich-you/events/2026/**/*.parquet')"
# (requires DuckDB's httpfs extension and S3 credentials configured for your R2 endpoint)