WaddleAIProxy Platform
Enterprise-grade AI proxy powered by MarchProxy AILB, unifying 7 providers — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Cohere, and Ollama — behind a single OpenAI-compatible API with VS Code integration, OpenWebUI interface, intelligent load balancing, automatic failover, and comprehensive token management.
Why Choose WaddleAI?
Enterprise-grade AI proxy that provides OpenAI-compatible APIs with advanced routing, security, and management capabilities for organizations of all sizes.
MarchProxy AILB Sync
Auto-sync provider routes to MarchProxy AILB for intelligent load balancing across OpenAI, Anthropic, Gemini, Bedrock, Azure, Cohere, and Ollama.
Model Aliases
Friendly model aliases like chatgpt → gpt-4o, claude → claude-3-5-sonnet-latest. Simplify requests while routing to the right provider.
Health Monitoring & Failover
Continuous health checks across all 7 providers with automatic failover. Unhealthy endpoints are bypassed seamlessly.
Usage Tracking
LiteLLM-style token counting with per-request cost estimation. Track prompt, completion, and total tokens across every provider.
Budget Enforcement
Set daily and monthly spending limits per key or organization. Automatic cutoff when budgets are reached with configurable alerts.
Virtual Key Management
Issue virtual API keys with granular permissions, model access restrictions, and per-key budget caps for fine-grained control.
Rate Limiting
Configurable RPM and TPM limits per virtual key. Protect upstream providers from abuse while ensuring fair resource allocation.
Multi-Tenancy
Full organization-based isolation with separate keys, budgets, and usage tracking. Each tenant operates independently.
RBAC Access Control
Role-based access with Admin, Maintainer, and Viewer roles. Control who can manage keys, view usage, or modify configurations.
Audit Logging
Comprehensive audit trail for every API request, key operation, and configuration change. Full accountability and compliance.
Webhook Events
Receive real-time webhook events from MarchProxy AILB for provider health changes, budget alerts, and rate limit triggers.
Advanced Security
Prompt injection detection, jailbreak prevention, and security scanning layered on top of MarchProxy AILB routing.
How It Works
Simple integration with powerful features under the hood
Deploy WaddleAI
Set up WaddleAI proxy and management servers in your infrastructure using Docker or Kubernetes.
Configure Providers
Connect your OpenAI, Anthropic, and Ollama providers through the management interface.
Start Building
Use in VS Code with @waddleai, OpenWebUI for testing, or the OpenAI-compatible API in applications.
Multiple Ways to Integrate
Choose the integration method that works best for your workflow
VS Code Extension
Native chat participant integration with full workspace context awareness and streaming responses.
Learn More →OpenWebUI
Modern web interface for testing models, managing conversations, and exploring AI capabilities.
Learn More →OpenAI API
Drop-in replacement for OpenAI API with enhanced security, routing, and enterprise features.
Learn More →Ready to Get Started?
Deploy WaddleAI in minutes and start managing your AI infrastructure today.
How WaddleAI Processes Requests
Interactive dataflow showing how requests move through WaddleAI's architecture
Choose Integration Method
User Input
Authentication & Security
Intelligent Routing
LLM Providers
Response Processing
User Response
VS Code Extension
@waddleai chat participant with context awareness
Deployment Architecture Scenarios
Choose your deployment strategy based on scale, complexity, and requirements
WaddleAI Proxy
Management Server
OpenWebUI
PostgreSQL
Redis
Development Setup
Perfect for local development and testing
- Single-command deployment
- All services included
- Easy configuration
Resource Requirements
Minimal hardware requirements
- 8GB RAM minimum
- 4 CPU cores
- 100GB storage
Use Cases
Ideal scenarios for Docker deployment
- Local development
- Small team testing
- Proof of concept