Technical Documentation
Everything you need to understand Vesh AI's internals, integrate with the API, and evaluate security posture.
Overview
Vesh AI is an intelligence layer that sits on top of your existing data systems. It connects via read-only access to your billing, product, and CRM databases, resolves customer entities across sources, computes canonical SaaS metrics, detects anomalies, and delivers actionable insights to Slack with full confidence scoring and data lineage.
No data warehouse required
Vesh queries your systems in-place. No ETL, no staging tables, no Snowflake dependency.
Trust by default
Every metric has a confidence score. Every insight links back to source records through full lineage.
Proactive delivery
Insights arrive in Slack before you ask. Daily briefs, anomaly alerts, and weekly digests.
System Architecture
Vesh AI follows a pipeline architecture with five processing stages. Each stage is independently observable and produces auditable artifacts.
Stage 1 — Ingestion
Read-only connectors extract records from Stripe, PostgreSQL, MySQL, and CRM systems.
Output: raw_records table (source, external_id, payload, extracted_at)
Stage 2 — Entity Resolution
Blocking → scoring → clustering → canonical entity computation with confidence scores.
Output: canonical_entities, entity_links (confidence, method, source_pair)
Stage 3 — Metric Computation
MRR, churn, NRR, expansion, contraction, ARPU computed from canonical entity data.
Output: metric_snapshots (metric_id, date, value, confidence, record_count)
Stage 4 — Anomaly Detection
Statistical detection (z-score, rate-of-change) with LLM causal decomposition.
Output: insights (severity, confidence, explanation, lineage references)
Stage 5 — Delivery
Daily briefs, anomaly alerts, and weekly digests delivered to Slack.
Output: Slack messages with thread follow-up capability
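The five stages above can be sketched as a chain of plain functions. This is an illustrative toy, not Vesh's actual internals; every name, signature, and record shape here is an assumption.

```python
from dataclasses import dataclass

# Illustrative record shapes mirroring the stage outputs described above.
@dataclass
class RawRecord:
    source: str
    external_id: str
    payload: dict

@dataclass
class Insight:
    severity: float
    confidence: float
    explanation: str

def ingest() -> list[RawRecord]:
    # Stage 1: read-only extraction (stubbed with fixed data here).
    return [RawRecord("stripe", "cus_1", {"email": "a@acme.com", "mrr": 100})]

def resolve(records: list[RawRecord]) -> dict[str, list[RawRecord]]:
    # Stage 2: group records that refer to the same entity (keyed by email here).
    entities: dict[str, list[RawRecord]] = {}
    for r in records:
        entities.setdefault(r.payload["email"], []).append(r)
    return entities

def compute_metrics(entities: dict[str, list[RawRecord]]) -> dict[str, float]:
    # Stage 3: canonical metrics from resolved entities.
    mrr = sum(recs[0].payload["mrr"] for recs in entities.values())
    return {"mrr": mrr, "customer_count": len(entities)}

def detect_anomalies(metrics: dict[str, float]) -> list[Insight]:
    # Stage 4: flag deviations (a trivial threshold rule for the sketch).
    if metrics["mrr"] == 0:
        return [Insight(1.0, 0.9, "MRR dropped to zero")]
    return []

def deliver(insights: list[Insight]) -> list[str]:
    # Stage 5: format messages for a channel such as Slack.
    return [f"[sev {i.severity:.2f}] {i.explanation}" for i in insights]

print(compute_metrics(resolve(ingest())))
```

Each function's output corresponds to one of the auditable artifacts listed above (raw_records, canonical_entities, metric_snapshots, insights, Slack messages).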
Technology Stack
API
FastAPI (Python 3.12)
Task Queue
Celery + Redis
Database
PostgreSQL 16
Frontend
Next.js 16 + React 19
LLM
DeepSeek via LiteLLM
Monitoring
Prometheus + Grafana
Container
Docker Compose
CI/CD
GitHub Actions
Installation
The open-source framework is distributed as a Python package:
pip install vesh-agents

Optional Extras
pip install vesh-agents[stripe] # Stripe connector
pip install vesh-agents[postgres] # PostgreSQL connector
pip install vesh-agents[all]      # All connectors

Python 3.10 or higher is required.
SDK Quick Start
Get from zero to revenue insights in three steps.
Install
Install the package and optional connectors for your data source.
Analyze
Run the CLI to analyze a CSV file, Stripe account, or PostgreSQL database.
Explore
Use the SDK in Python for natural language analysis with the revenue orchestrator.
vesh analyze csv data.csv

import asyncio
from agents import Runner
from vesh_agents.verticals.revenue import create_revenue_orchestrator

async def main():
    orchestrator = create_revenue_orchestrator(model="litellm/deepseek/deepseek-chat")
    result = await Runner.run(
        orchestrator,
        "Analyze the revenue data in data.csv. Compute all SaaS metrics and identify any anomalies.",
    )
    print(result.final_output)

asyncio.run(main())

Agents
Six specialized AI agents work together in a pipeline. Each has a distinct role and set of tools.
| Agent | Role | Tools |
|---|---|---|
| DataConnector | Extracts data from CSV, Stripe, PostgreSQL | import_csv, extract_stripe, extract_postgres |
| EntityResolver | Matches records across sources to identify same entities | resolve_entities |
| MetricComputer | Computes SaaS metrics (MRR, churn, ARPU, NRR, etc.) | compute_saas_metrics, list_available_metrics |
| AnomalyDetector | Finds statistical anomalies in metric time series | detect_anomalies |
| InsightReasoner | Explains WHY anomalies occurred with causal reasoning | explain_anomaly |
| Orchestrator | Coordinates the team of specialized agents | handoffs to all agents |
CLI Reference
The vesh CLI provides commands for data analysis, interactive AI chat, and MCP server management. Requires Python ≥ 3.11.
Data Analysis
vesh analyze csv <file>
Analyze a CSV file. Runs the full pipeline: extract → resolve entities → compute metrics → detect anomalies.
vesh analyze stripe --api-key <key>
Analyze live Stripe data. Connects read-only via the Stripe API.
vesh analyze postgres --host <host> --database <db> --user <user> --password <pw>
Analyze data from a PostgreSQL database.
AI & Chat
vesh run "<question>"
Run a natural-language analysis. Supports --source and --model flags.
vesh chat
Open an interactive AI chat for revenue analysis.
vesh setup
One-time setup: installs OpenCode and writes opencode.json.
Cloud & Integration
vesh login <key>
Authenticate with Vesh AI cloud. Saves the API key to ~/.vesh/config.
vesh mcp serve
Start the MCP server (stdio transport) for Cursor, Claude Desktop, and other MCP clients.
BYOM Configuration
Bring Your Own Model: the framework uses LiteLLM for model abstraction, so any provider LiteLLM supports can be used. Example model strings:
litellm/deepseek/deepseek-chat
litellm/anthropic/claude-sonnet-4-20250514
litellm/openai/gpt-4o

Usage
from vesh_agents.verticals.revenue import create_revenue_orchestrator
orchestrator = create_revenue_orchestrator(
    model="litellm/anthropic/claude-sonnet-4-20250514"
)

Quick Start
Get from zero to first insight in three steps. Setup takes under 30 minutes; your first automated insight arrives within 3 hours.
Create a connection
Provide read-only credentials. Vesh encrypts them with AES-256 and never stores raw passwords.
curl -X POST https://api.veshai.com/api/v1/connections \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Tenant-Id: $TENANT_ID" \
  -d '{
    "name": "Production Stripe",
    "source_type": "stripe",
    "credentials": { "api_key": "rk_live_..." }
  }'

Trigger the pipeline
The pipeline runs automatically, or trigger it manually.
curl -X POST https://api.veshai.com/api/v1/pipeline/run \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Tenant-Id: $TENANT_ID"

Receive insights in Slack
Configure your Slack workspace. Daily briefs and anomaly alerts start flowing automatically.
Data Connectors
Connectors are read-only integrations. All credentials are encrypted with AES-256. Vesh never writes to your source systems.
Stripe
Status: Live
Customers, subscriptions, invoices, charges, refunds
Auth: Restricted API key (read-only)
PostgreSQL
Status: Live
Any table or view via SQL query
Auth: Connection string (read-only user)
MySQL
Status: Live
Any table or view via SQL query
Auth: Connection string (read-only user)
HubSpot
Status: Roadmap
Contacts, companies, deals
Auth: OAuth 2.0 or API key
Salesforce
Status: Roadmap
Accounts, opportunities, contacts
Auth: OAuth 2.0
QuickBooks
Status: Roadmap
Invoices, payments, customers
Auth: OAuth 2.0
Entity Resolution
The same customer appears differently across systems. Vesh resolves these into a single canonical entity graph.
Blocking
Generates candidate pairs by partitioning records into blocks based on shared attributes (email domain, company name prefix). Reduces the O(n²) comparison space to O(n·k).
Scoring
Each candidate pair is scored using weighted field-level similarities: exact match, fuzzy string (Jaro-Winkler), email normalization, and domain matching. Scores range from 0 to 1.
Clustering
Pairs above the confidence threshold (default 0.5) are clustered using connected components. Each cluster becomes a canonical entity.
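A minimal sketch of the three steps above, assuming simple dict records: exact-match checks stand in for the Jaro-Winkler and normalization scorers, and connected components are computed with a small union-find. Names and weights are illustrative, not Vesh's implementation.

```python
from collections import defaultdict
from itertools import combinations

def block(records):
    """Blocking: only compare records that share an email domain."""
    blocks = defaultdict(list)
    for r in records:
        blocks[r["email"].split("@")[1].lower()].append(r)
    return [pair for grp in blocks.values() for pair in combinations(grp, 2)]

def score(a, b):
    """Scoring: weighted field similarities in [0, 1].

    Exact matches stand in for fuzzy scorers (e.g. Jaro-Winkler) here."""
    s = 0.0
    s += 0.6 * (a["email"].lower() == b["email"].lower())
    s += 0.4 * (a["name"].lower() == b["name"].lower())
    return s

def cluster(records, pairs, threshold=0.5):
    """Clustering: connected components over pairs scoring above the threshold."""
    parent = {id(r): id(r) for r in records}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in pairs:
        if score(a, b) >= threshold:
            parent[find(id(a))] = find(id(b))  # union
    groups = defaultdict(list)
    for r in records:
        groups[find(id(r))].append(r)
    return list(groups.values())

records = [
    {"source": "stripe", "email": "jane@acme.com", "name": "Acme Corp"},
    {"source": "crm", "email": "JANE@acme.com", "name": "acme corp"},
    {"source": "crm", "email": "bob@other.io", "name": "Other Inc"},
]
clusters = cluster(records, block(records))  # two clusters: Acme (merged), Other
```

Each resulting cluster becomes one canonical entity in the next step.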
Canonical Computation
A canonical entity record is computed by merging fields from all source records. Conflicts are resolved by source priority and recency.
{
  "entity_id": "ent_00472",
  "canonical_name": "Acme Corp",
  "links": [
    { "source": "stripe", "external_id": "cus_NhJ8kLm", "confidence": 0.98 },
    { "source": "postgres", "external_id": "user_4871", "confidence": 0.91 }
  ],
  "computed_fields": { "mrr": 4230, "plan": "growth", "nrr": 0.89 }
}

Metric Engine
Metrics are computed from canonical entity data using a formal metric ontology — ensuring every team sees the same numbers, computed the same way.
Built-in Metrics
MRR
Net New MRR
Churn MRR
Expansion MRR
Contraction MRR
Logo Churn
Net Revenue Retention
ARPU
Customer Count
Active Entities
Metrics are computed daily and stored as time-series snapshots with confidence scores based on entity resolution quality.
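As a sketch, a daily snapshot of MRR, customer count, and ARPU could be derived from canonical entities like this. The field names (`mrr`, `active`) are stand-ins, not Vesh's actual schema.

```python
def compute_snapshot(entities: list[dict]) -> dict:
    """Compute a daily metric snapshot from canonical entity records.

    Assumes each entity carries a monthly-recurring-revenue field and an
    active flag; real snapshots would also carry date and confidence."""
    active = [e for e in entities if e["active"]]
    mrr = sum(e["mrr"] for e in active)
    return {
        "mrr": mrr,
        "customer_count": len(active),
        "arpu": mrr / len(active) if active else 0.0,
    }

snapshot = compute_snapshot([
    {"mrr": 4230, "active": True},
    {"mrr": 990, "active": True},
    {"mrr": 0, "active": False},   # churned entity excluded from ARPU
])
# snapshot == {"mrr": 5220, "customer_count": 2, "arpu": 2610.0}
```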
Anomaly Detection
The anomaly detection engine runs after metric computation and identifies statistically significant deviations, then generates causal explanations.
Z-Score
Compares today's value against a rolling 30-day distribution. Triggers at |z| > 2.0 by default.
Rate of Change
Detects acceleration or deceleration in metric trends. Triggers when day-over-day change exceeds 2× the rolling standard deviation.
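The two statistical rules above can be sketched with the standard library. The thresholds mirror the stated defaults (|z| > 2.0, day-over-day change beyond 2× the rolling standard deviation), but the function names are illustrative.

```python
import statistics

def zscore_anomaly(history: list[float], today: float, threshold: float = 2.0):
    """Z-score rule: compare today's value against the rolling window."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    z = (today - mean) / stdev if stdev else 0.0
    return abs(z) > threshold, z

def rate_of_change_anomaly(history: list[float], today: float, factor: float = 2.0):
    """Rate-of-change rule: flag when the day-over-day delta exceeds
    `factor` times the rolling standard deviation of past deltas."""
    deltas = [b - a for a, b in zip(history, history[1:])]
    stdev = statistics.stdev(deltas)
    delta = today - history[-1]
    return abs(delta) > factor * stdev, delta

history = [100, 101, 99, 102, 100, 98, 101, 100, 99, 102]
flagged, z = zscore_anomaly(history, 120)   # a jump to 120 triggers both rules
```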
Seasonal Decomposition
Accounts for weekly and monthly seasonality patterns to avoid false positives on expected fluctuations.
LLM Causal Analysis
When a statistical anomaly is detected, the LLM examines correlated metrics to generate a causal hypothesis with an attached confidence score.
{
  "insight_id": "ins_8f3a2b",
  "severity": 0.87,
  "confidence": 0.94,
  "metric": "churn_mrr",
  "headline": "Enterprise churn spiked 2.4× vs 30-day average",
  "explanation": "3 enterprise accounts using Feature X < 3×/week churned simultaneously."
}

Delivery Channels
Vesh delivers insights proactively. Intelligence arrives where your team works — no dashboards to check.
Daily Brief
Every day at 8:00 AM
Key metrics summary, notable changes, and any anomalies. Rich Slack message with inline charts.
Anomaly Alerts
Real-time
Triggered immediately when the detection engine identifies a statistically significant deviation.
Weekly Digest
Mondays at 9:00 AM
Synthesis of weekly trends, momentum indicators, and emerging risks.
Thread Follow-up
On demand
Reply in a Slack thread to ask follow-up questions. Vesh responds with context and drill-downs.
API Reference
REST API with JSON request/response bodies. All endpoints require authentication via JWT bearer token or API key.
JWT Bearer Token
Obtained from POST /api/v1/auth/login. Recommended for UI flows.
API Key
Include as X-API-Key: <key> header. Recommended for CI/CD.
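A sketch of both call styles with curl. The login request body fields (`email`, `password`) are assumptions based on the login endpoint described below; substitute your own tenant and key values.

```shell
# JWT flow (UI): log in, then call with a Bearer token.
TOKEN=$(curl -s -X POST https://api.veshai.com/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "..."}' | jq -r .access_token)

curl -H "Authorization: Bearer $TOKEN" \
     -H "X-Tenant-Id: $TENANT_ID" \
     https://api.veshai.com/api/v1/metrics

# API key flow (CI/CD): send the key in the X-API-Key header.
curl -H "X-API-Key: $VESH_API_KEY" \
     -H "X-Tenant-Id: $TENANT_ID" \
     https://api.veshai.com/api/v1/insights
```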
Core Endpoints
POST /api/v1/auth/login
Authenticate and receive a JWT token
Auth: None (public)
{ "access_token": "eyJ...", "token_type": "bearer" }
GET /api/v1/connections
List all data source connections for the tenant
Auth: JWT or API Key
POST /api/v1/connections
Create a new data source connection
Auth: JWT or API Key
GET /api/v1/entities
List resolved canonical entities with pagination
Auth: JWT or API Key
GET /api/v1/metrics
List metric definitions and latest values
Auth: JWT or API Key
GET /api/v1/metrics/{id}/history
Get a metric's time series (daily snapshots)
Auth: JWT or API Key
GET /api/v1/insights
List anomaly insights with severity and confidence
Auth: JWT or API Key
POST /api/v1/pipeline/run
Trigger the full pipeline (ingest → resolve → compute → detect)
Auth: JWT or API Key
GET /api/v1/health
Service health check
Auth: None (public)
{ "status": "healthy", "version": "1.0.0" }

Security & Compliance
Every layer — from credential storage to API access to data delivery — is designed with defense in depth.
Encryption at Rest
All credentials encrypted with AES-256 using per-tenant keys. Database encryption via PostgreSQL TDE.
Encryption in Transit
All API communication over TLS 1.3. Internal service traffic is confined to an isolated Docker network.
Read-Only Access
Connectors use read-only database users and restricted API keys. Vesh never writes to source systems.
Tenant Isolation
Strict tenant isolation at the database level. Every query is scoped by tenant_id.
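As an illustration of that scoping rule, every read can be forced through a tenant-parameterized query. The table and column names below are stand-ins (using an in-memory SQLite database), not Vesh's actual schema.

```python
import sqlite3

# Build a tiny two-tenant table to demonstrate scoping.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metric_snapshots (tenant_id TEXT, metric_id TEXT, value REAL)")
conn.executemany(
    "INSERT INTO metric_snapshots VALUES (?, ?, ?)",
    [("t1", "mrr", 5220.0), ("t2", "mrr", 910.0)],
)

def snapshots_for(tenant_id: str):
    """Every query is parameterized by tenant_id; there is no unscoped read path."""
    return conn.execute(
        "SELECT metric_id, value FROM metric_snapshots WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchall()

print(snapshots_for("t1"))  # [('mrr', 5220.0)]
```

Centralizing the filter in one helper (or a query-layer hook) means a missing tenant clause is a code-review error in one place rather than a latent leak scattered across endpoints.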
Authentication
JWT-based authentication with bcrypt password hashing. API key support for programmatic access.
Audit Logging
All API requests logged with timestamp, tenant, user, action, and resource. Full lineage trail for every insight.
Deployment Model
Vesh AI deploys as a set of Docker containers orchestrated by Docker Compose. The standard deployment includes seven services.
services:
  api:        # FastAPI application server
  worker:     # Celery worker for async tasks
  beat:       # Celery Beat scheduler (daily briefs, pipeline)
  db:         # PostgreSQL 16 database
  redis:      # Redis 7 (task queue + caching)
  prometheus: # Metrics collection
  grafana:    # Operational dashboards

Managed (Recommended)
Vesh hosts and manages the infrastructure. Zero ops overhead.
Self-Hosted
Deploy in your own infrastructure via Docker Compose on any Linux server.
Hybrid
Data stays in your infrastructure. Vesh processing runs in our managed environment.
Ready to evaluate?
Start a 14-day pilot with your own data. See entity resolution, metric computation, and anomaly detection working on real records.