
Important

Vibe coding experiment.

AI Proxy

A lightweight proxy for LLM interactions with basic guardrails, logging, and metrics.

Features

  • Single LLM provider support (OpenAI)
  • Basic guardrail for banned words
  • Logging of requests and responses
  • Prometheus metrics
  • Config-driven setup
  • Docker deployment
  • Monitoring with Prometheus and Grafana

Configuration

The proxy can be configured using a YAML file or environment variables:

Config File (config/config.yaml)

server:
  port: 8080

llm:
  url: "https://api.openai.com/v1/chat/completions"
  api_key: "YOUR_OPENAI_API_KEY"

guardrails:
  banned_words:
    - "bomb"
    - "attack"

Environment Variables

  • SERVER_PORT: Server port (default: 8080)
  • LLM_URL: LLM API endpoint URL
  • LLM_API_KEY: LLM API key
  • BANNED_WORDS: Comma-separated list of banned words
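
For example, the binary can be started with configuration supplied entirely through the environment (assuming the proxy falls back to environment variables when no config file is given):

SERVER_PORT=8080 \
LLM_URL="https://api.openai.com/v1/chat/completions" \
LLM_API_KEY=your_openai_api_key \
BANNED_WORDS="bomb,attack" \
./ai-proxy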

API

Query Endpoint

POST /v1/query

Request:

{
  "prompt": "Your prompt to the LLM",
  "model_params": {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
    "max_tokens": 256
  }
}

Response:

{
  "completion": "LLM response text..."
}
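
For illustration, a minimal Go client for this endpoint could look like the sketch below. The request and response structs mirror the JSON shapes above and the /v1/query path matches the curl example later in this README; error handling for blocked or failed requests is omitted, since that response format is not documented here.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type QueryRequest struct {
	Prompt      string         `json:"prompt"`
	ModelParams map[string]any `json:"model_params,omitempty"`
}

type QueryResponse struct {
	Completion string `json:"completion"`
}

func main() {
	// Build the request body documented above.
	reqBody, err := json.Marshal(QueryRequest{
		Prompt: "Tell me a joke",
		ModelParams: map[string]any{
			"model":       "gpt-3.5-turbo",
			"temperature": 0.7,
			"max_tokens":  256,
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.Post("http://localhost:8080/v1/query", "application/json", bytes.NewReader(reqBody))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Decode the documented {"completion": "..."} response.
	var out QueryResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Println(out.Completion)
}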

Metrics Endpoint

GET /metrics

Returns Prometheus-formatted metrics including:

  • llm_requests_total: Total number of LLM requests processed
  • llm_errors_total: Total number of errors from LLM calls
  • llm_tokens_total: Total number of tokens used in LLM calls
  • guardrail_blocks_total: Total number of requests blocked by guardrails
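
As a sketch, counters with these names are typically defined with the prometheus/client_golang library roughly as follows. The package layout and use of promauto are assumptions rather than the proxy's actual source; only the metric names and help text come from the list above.

package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Counters corresponding to the metric names exposed on /metrics.
var (
	LLMRequestsTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "llm_requests_total",
		Help: "Total number of LLM requests processed",
	})
	LLMErrorsTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "llm_errors_total",
		Help: "Total number of errors from LLM calls",
	})
	LLMTokensTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "llm_tokens_total",
		Help: "Total number of tokens used in LLM calls",
	})
	GuardrailBlocksTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "guardrail_blocks_total",
		Help: "Total number of requests blocked by guardrails",
	})
)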

Health Check

GET /health

Running Locally

# Build and run
go build -o ai-proxy ./cmd/server
./ai-proxy --config config/config.yaml

Running with Docker

# Build Docker image
docker build -t ai-proxy:0.1 .

# Run with configuration passed via environment variables
docker run -p 8080:8080 \
  -e LLM_API_KEY=your_openai_api_key \
  ai-proxy:0.1

# Or mount a custom config file
docker run -p 8080:8080 \
  -v $(pwd)/config/config.yaml:/app/config/config.yaml \
  ai-proxy:0.1

Monitoring Setup

The project includes a complete monitoring stack with Prometheus and Grafana.
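
Prometheus collects the proxy's metrics through a scrape job pointing at the /metrics endpoint; such a job looks roughly like the following (the job name and target are assumptions based on the ports shown in this README, not necessarily the repository's actual prometheus.yml):

scrape_configs:
  - job_name: "ai-proxy"
    scrape_interval: 15s
    static_configs:
      - targets: ["ai-proxy:8080"]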

Running with Monitoring

# Start the entire stack (AI Proxy, Prometheus, and Grafana)
docker-compose up -d

# Access the services:
# - AI Proxy: http://localhost:8080
# - Prometheus: http://localhost:9090
# - Grafana: http://localhost:3000 (login with admin/admin)

# Stop the services
docker-compose down

Example Usage

# Send a query
curl -X POST http://localhost:8080/v1/query \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Tell me a joke", "model_params": {"temperature": 0.7}}'

# Check metrics
curl http://localhost:8080/metrics

Potential Enhancements

This MVP focuses on core functionality. Possible future enhancements include:

  1. Enhanced Guardrails

    • More sophisticated content filtering options
    • Support for custom filtering rules
  2. Additional LLM Support

    • Integration with other LLM providers
  3. Simple Authentication

    • Basic rate limiting
  4. Performance Improvements

    • Optional caching for common queries
    • Optimizations for high-traffic scenarios
