Enterprise AI Platformv1.0

WaddleAIProxy Platform

Enterprise-grade AI proxy powered by MarchProxy AILB, unifying 7 providers — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Cohere, and Ollama — behind a single OpenAI-compatible API with VS Code integration, OpenWebUI interface, intelligent load balancing, automatic failover, and comprehensive token management.

Get Started View Documentation

99.9%

Uptime

100%

Security Scanned

50ms

Avg Latency

Terminal

$pip install waddleai

✓ Installing WaddleAI...

# Start with Docker Compose

$docker-compose -f docker-compose.testing.yml up

✓ WaddleAI + OpenWebUI running

# Use in VS Code Chat

@waddleai Help me write a REST API

✓ Context-aware AI assistance

# Or use OpenAI client

client = OpenAI(

base_url="http://localhost:8000/v1"

)

VS Code Ready

OpenWebUI Included

OpenAI Compatible

Why Choose WaddleAI?

Enterprise-grade AI proxy that provides OpenAI-compatible APIs with advanced routing, security, and management capabilities for organizations of all sizes.

MarchProxy AILB Sync

Auto-sync provider routes to MarchProxy AILB for intelligent load balancing across OpenAI, Anthropic, Gemini, Bedrock, Azure, Cohere, and Ollama.

Model Aliases

Friendly model aliases like chatgpt → gpt-4o, claude → claude-3-5-sonnet-latest. Simplify requests while routing to the right provider.

Health Monitoring & Failover

Continuous health checks across all 7 providers with automatic failover. Unhealthy endpoints are bypassed seamlessly.

Usage Tracking

LiteLLM-style token counting with per-request cost estimation. Track prompt, completion, and total tokens across every provider.

Budget Enforcement

Set daily and monthly spending limits per key or organization. Automatic cutoff when budgets are reached with configurable alerts.

Virtual Key Management

Issue virtual API keys with granular permissions, model access restrictions, and per-key budget caps for fine-grained control.

Rate Limiting

Configurable RPM and TPM limits per virtual key. Protect upstream providers from abuse while ensuring fair resource allocation.

Multi-Tenancy

Full organization-based isolation with separate keys, budgets, and usage tracking. Each tenant operates independently.

RBAC Access Control

Role-based access with Admin, Maintainer, and Viewer roles. Control who can manage keys, view usage, or modify configurations.

Audit Logging

Comprehensive audit trail for every API request, key operation, and configuration change. Full accountability and compliance.

Webhook Events

Receive real-time webhook events from MarchProxy AILB for provider health changes, budget alerts, and rate limit triggers.

Advanced Security

Prompt injection detection, jailbreak prevention, and security scanning layered on top of MarchProxy AILB routing.

How It Works

Simple integration with powerful features under the hood

Deploy WaddleAI

Set up WaddleAI proxy and management servers in your infrastructure using Docker or Kubernetes.

Configure Providers

Connect your OpenAI, Anthropic, and Ollama providers through the management interface.

Start Building

Use in VS Code with @waddleai, OpenWebUI for testing, or the OpenAI-compatible API in applications.

Multiple Ways to Integrate

Choose the integration method that works best for your workflow

VS Code Extension

Native chat participant integration with full workspace context awareness and streaming responses.

Learn More →

OpenWebUI

Modern web interface for testing models, managing conversations, and exploring AI capabilities.

Learn More →

OpenAI API

Drop-in replacement for OpenAI API with enhanced security, routing, and enterprise features.

Learn More →

Ready to Get Started?

Deploy WaddleAI in minutes and start managing your AI infrastructure today.

View Documentation GitHub Repository

How WaddleAI Processes Requests

Interactive dataflow showing how requests move through WaddleAI's architecture

Choose Integration Method

User Input

Authentication & Security

Intelligent Routing

LLM Providers

Response Processing

User Response

VS Code Extension

@waddleai chat participant with context awareness

< 50ms

Avg Latency

100%

Security Scanned

99.9%

Uptime SLA

10K+

Requests/Min

Deployment Architecture Scenarios

Choose your deployment strategy based on scale, complexity, and requirements

WaddleAI Proxy

Management Server

OpenWebUI

PostgreSQL

Redis

Development Setup

Perfect for local development and testing

Single-command deployment
All services included
Easy configuration

Resource Requirements

Minimal hardware requirements

8GB RAM minimum
4 CPU cores
100GB storage

Use Cases

Ideal scenarios for Docker deployment

Local development
Small team testing
Proof of concept

WaddleAIProxy Platform

Why Choose WaddleAI?

MarchProxy AILB Sync

Model Aliases

Health Monitoring & Failover

Usage Tracking

Budget Enforcement

Virtual Key Management

Rate Limiting

Multi-Tenancy

RBAC Access Control

Audit Logging

Webhook Events

Advanced Security

How It Works

Deploy WaddleAI

Configure Providers

Start Building

Multiple Ways to Integrate

VS Code Extension

OpenWebUI

OpenAI API

Ready to Get Started?

How WaddleAI Processes Requests

Choose Integration Method

User Input

Authentication & Security

Intelligent Routing

LLM Providers

Response Processing

User Response

VS Code Extension

Deployment Architecture Scenarios

Docker Compose

Kubernetes

Cloud Native

WaddleAI Proxy

Management Server

OpenWebUI

PostgreSQL

Redis

Development Setup

Resource Requirements

Use Cases