Enterprise AI Platform v1.0

WaddleAIProxy Platform

Enterprise-grade AI proxy powered by MarchProxy AILB, unifying 7 providers — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Cohere, and Ollama — behind a single OpenAI-compatible API with VS Code integration, OpenWebUI interface, intelligent load balancing, automatic failover, and comprehensive token management.

99.9%
Uptime
100%
Security Scanned
< 50ms
Avg Latency
Terminal
$ pip install waddleai
✓ Installing WaddleAI...
# Start with Docker Compose
$ docker-compose -f docker-compose.testing.yml up
✓ WaddleAI + OpenWebUI running
# Use in VS Code Chat
@waddleai Help me write a REST API
✓ Context-aware AI assistance
# Or use OpenAI client
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-waddleai-key"
)
VS Code Ready
OpenWebUI Included
OpenAI Compatible

Why Choose WaddleAI?

Enterprise-grade AI proxy that provides OpenAI-compatible APIs with advanced routing, security, and management capabilities for organizations of all sizes.

MarchProxy AILB Sync

Auto-sync provider routes to MarchProxy AILB for intelligent load balancing across OpenAI, Anthropic, Gemini, Bedrock, Azure, Cohere, and Ollama.

Model Aliases

Friendly model aliases such as chatgpt → gpt-4o and claude → claude-3-5-sonnet-latest simplify requests while routing them to the right provider.
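Alias resolution can be pictured as a simple lookup before routing. The table below is a sketch for illustration only; WaddleAI's actual alias list and resolution logic may differ:

```python
# Hypothetical alias table -- the real WaddleAI configuration may differ.
MODEL_ALIASES = {
    "chatgpt": "gpt-4o",
    "claude": "claude-3-5-sonnet-latest",
}

def resolve_model(requested: str) -> str:
    """Map a friendly alias to its canonical provider model name.

    Unknown names pass through unchanged, so canonical model IDs
    keep working alongside aliases."""
    return MODEL_ALIASES.get(requested, requested)
```

Because unrecognized names pass through untouched, clients can mix aliases and canonical model IDs in the same deployment.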

Health Monitoring & Failover

Continuous health checks across all 7 providers with automatic failover. Unhealthy endpoints are bypassed seamlessly.
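The failover behavior can be sketched as a provider pool that tracks health flags and only routes to healthy endpoints. This is a minimal illustration of the idea, not WaddleAI's internal implementation:

```python
import random

class ProviderPool:
    """Route requests only to providers currently marked healthy.

    Health flags would be updated by a background health-check loop;
    here they are toggled manually for illustration."""

    def __init__(self, providers):
        self.health = {p: True for p in providers}

    def mark(self, provider, healthy):
        self.health[provider] = healthy

    def pick(self):
        healthy = [p for p, ok in self.health.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy providers available")
        return random.choice(healthy)
```

When a health check fails, the endpoint simply drops out of the candidate set, so callers never see the outage.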

Usage Tracking

LiteLLM-style token counting with per-request cost estimation. Track prompt, completion, and total tokens across every provider.
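Per-request cost estimation boils down to multiplying token counts by per-model rates. The prices below are illustrative placeholders; real rates vary by provider and model:

```python
from dataclasses import dataclass

# Illustrative per-1K-token prices (prompt, completion) -- not real rates.
PRICING = {"gpt-4o": (0.0025, 0.01)}

@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def cost(self, model: str) -> float:
        """Estimate request cost in USD from the model's per-1K rates."""
        prompt_rate, completion_rate = PRICING[model]
        return (self.prompt_tokens / 1000 * prompt_rate
                + self.completion_tokens / 1000 * completion_rate)
```

Accumulating these records per key and per organization is what drives the budget and reporting features below.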

Budget Enforcement

Set daily and monthly spending limits per key or organization. Automatic cutoff when budgets are reached with configurable alerts.
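The cutoff logic amounts to rejecting a request whose estimated cost would push spend past the limit. A minimal sketch, assuming a single daily limit per key:

```python
class Budget:
    """Track spend against a daily limit and cut off once exceeded."""

    def __init__(self, daily_limit_usd: float):
        self.limit = daily_limit_usd
        self.spent = 0.0

    def charge(self, amount: float) -> None:
        """Record spend, refusing any charge that would breach the limit."""
        if self.spent + amount > self.limit:
            raise PermissionError("daily budget exceeded")
        self.spent += amount
```

In practice the same check would also run against the monthly limit and fire the configurable alerts mentioned above.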

Virtual Key Management

Issue virtual API keys with granular permissions, model access restrictions, and per-key budget caps for fine-grained control.
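A virtual key is essentially a server-side record that pairs a random token with its permissions, so the upstream provider keys never leave the proxy. The field names below are assumptions for illustration:

```python
import secrets

def issue_key(org: str, allowed_models, budget_usd: float) -> dict:
    """Create a virtual key record; upstream provider keys stay server-side.

    The "wk-" prefix and field names are hypothetical, not WaddleAI's
    actual key format."""
    return {
        "key": "wk-" + secrets.token_urlsafe(24),
        "org": org,
        "allowed_models": set(allowed_models),
        "budget_usd": budget_usd,
    }

def authorize(record: dict, model: str) -> bool:
    """Check whether this key may call the requested model."""
    return model in record["allowed_models"]
```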

Rate Limiting

Configurable RPM and TPM limits per virtual key. Protect upstream providers from abuse while ensuring fair resource allocation.

Multi-Tenancy

Full organization-based isolation with separate keys, budgets, and usage tracking. Each tenant operates independently.

RBAC Access Control

Role-based access with Admin, Maintainer, and Viewer roles. Control who can manage keys, view usage, or modify configurations.
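The three roles map naturally onto a permission matrix. The action names below are assumed for illustration; the real permission set may be more granular:

```python
# Hypothetical permission matrix for the three roles named above.
PERMISSIONS = {
    "admin":      {"manage_keys", "view_usage", "edit_config"},
    "maintainer": {"manage_keys", "view_usage"},
    "viewer":     {"view_usage"},
}

def can(role: str, action: str) -> bool:
    """Return True if the role is granted the action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```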

Audit Logging

Comprehensive audit trail for every API request, key operation, and configuration change. Full accountability and compliance.

Webhook Events

Receive real-time webhook events from MarchProxy AILB for provider health changes, budget alerts, and rate limit triggers.
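A webhook receiver should verify the payload's signature before acting on it. The HMAC-SHA256 scheme below is a common convention used here for illustration; MarchProxy AILB's actual signing scheme and header names may differ:

```python
import hashlib
import hmac
import json

def verify_and_parse(body: bytes, signature: str, secret: bytes) -> dict:
    """Verify an HMAC-SHA256 hex signature, then parse the JSON payload.

    The signing scheme is an assumption for illustration, not
    MarchProxy AILB's documented format."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("webhook signature mismatch")
    return json.loads(body)
```

Using `hmac.compare_digest` rather than `==` avoids leaking the expected signature through timing differences.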

Advanced Security

Prompt injection detection, jailbreak prevention, and security scanning layered on top of MarchProxy AILB routing.
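At its simplest, prompt injection screening is pattern matching against known attack phrasings before a request reaches the provider. The patterns below are deliberately simple examples; a production scanner like WaddleAI's would go well beyond this:

```python
import re

# A few illustrative heuristics -- not WaddleAI's actual rule set.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
    r"you are now .*unfiltered",
]

def scan_prompt(text: str) -> bool:
    """Return True if the prompt matches a known injection pattern."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)
```

Flagged prompts can then be blocked, logged to the audit trail, or routed for review depending on policy.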

How It Works

Simple integration with powerful features under the hood

1

Deploy WaddleAI

Set up WaddleAI proxy and management servers in your infrastructure using Docker or Kubernetes.

2

Configure Providers

Connect providers such as OpenAI, Anthropic, Google Gemini, AWS Bedrock, and Ollama through the management interface.

3

Start Building

Use in VS Code with @waddleai, OpenWebUI for testing, or the OpenAI-compatible API in applications.

Multiple Ways to Integrate

Choose the integration method that works best for your workflow

VS Code Extension

Native chat participant integration with full workspace context awareness and streaming responses.

Learn More →

OpenWebUI

Modern web interface for testing models, managing conversations, and exploring AI capabilities.

Learn More →

OpenAI API

Drop-in replacement for OpenAI API with enhanced security, routing, and enterprise features.

Learn More →

Ready to Get Started?

Deploy WaddleAI in minutes and start managing your AI infrastructure today.

How WaddleAI Processes Requests

Interactive dataflow showing how requests move through WaddleAI's architecture

Choose Integration Method

User Input

Authentication & Security

Intelligent Routing

LLM Providers

Response Processing

User Response

VS Code Extension

@waddleai chat participant with context awareness

< 50ms
Avg Latency
100%
Security Scanned
99.9%
Uptime SLA
10K+
Requests/Min

Deployment Architecture Scenarios

Choose your deployment strategy based on scale, complexity, and requirements

WaddleAI Proxy

Management Server

OpenWebUI

PostgreSQL

Redis

Development Setup

Perfect for local development and testing

  • Single-command deployment
  • All services included
  • Easy configuration

Resource Requirements

Minimal hardware requirements

  • 8GB RAM minimum
  • 4 CPU cores
  • 100GB storage

Use Cases

Ideal scenarios for Docker deployment

  • Local development
  • Small team testing
  • Proof of concept