Configuration

Environment variables and configuration options

Configure the LLM Gateway API using environment variables in a .env file.

Environment Variables

Provider API Keys

Configure API keys for the LLM providers you want to use; keys are only required for the providers you actually plan to call.

OpenAI

OPENAI_API_KEY=sk-proj-...

Get your API key from OpenAI Platform.

Groq

GROQ_API_KEY=gsk_...

Get your API key from Groq Console.

Google

GOOGLE_API_KEY=AIza...

Get your API key from Google AI Studio.

Local LLM Server

For OpenAI-compatible local LLM servers:

STIP_API_KEY=your_local_api_key  # Optional, if your server requires auth
LOCAL_LLM_API_URL=http://localhost:8001  # Default: http://localhost:8000

The local provider uses the OpenAI client library but points to a custom base URL. Your local server must be compatible with the OpenAI API format.
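To illustrate what "OpenAI API format" means in practice, here is a minimal sketch of the request shape such a server must accept. It is an assumption based on the standard OpenAI convention (the /v1/chat/completions path and Bearer authorization), not code taken from the gateway itself:

```python
import json

def build_chat_request(base_url, model, messages, api_key=None):
    """Build an OpenAI-style chat completion request.

    Illustrative sketch: the /v1/chat/completions path and the Bearer
    header follow the OpenAI API convention."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    headers = {"Content-Type": "application/json"}
    if api_key:  # STIP_API_KEY is optional for local servers
        headers["Authorization"] = "Bearer " + api_key
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return url, headers, body

url, headers, body = build_chat_request(
    "http://localhost:8001", "your-model-name",
    [{"role": "user", "content": "Hello"}], api_key="your_local_api_key")
```

Any OpenAI-compatible server should accept a POST of `body` to `url` with these headers.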

Authentication

API Key Authentication

Enable API key authentication by setting a secret token:

SECRET_TOKEN=your_secure_secret_token

When set, all requests must include the X-API-Key header with this token.

Without authentication:

# SECRET_TOKEN not set or empty
# All requests are allowed without authentication

With authentication:

SECRET_TOKEN=my-secret-123
# All requests require: X-API-Key: my-secret-123

If SECRET_TOKEN is not set, the API will be accessible without authentication. Always set this in production environments.
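The rule above can be sketched as a small check. This is an assumed implementation of the documented behavior, not the gateway's source; `check_api_key` is a hypothetical name:

```python
import secrets

def check_api_key(secret_token, provided_key):
    """Sketch of the documented rule: when SECRET_TOKEN is unset or empty,
    every request passes; otherwise the X-API-Key header must match it."""
    if not secret_token:
        # SECRET_TOKEN not set or empty -> no authentication required
        return True
    if provided_key is None:
        # SECRET_TOKEN set but no X-API-Key header supplied
        return False
    # compare_digest avoids leaking token contents via timing differences
    return secrets.compare_digest(provided_key, secret_token)
```

Using a constant-time comparison such as `secrets.compare_digest` is good practice for any secret check, whatever the actual implementation does.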

Docker Configuration

Development Mode

Use docker-compose.dev.yaml for local development:

services:
  llm-gateway:
    image: llm-gateway:latest
    ports:
      - "8000:8000"
    volumes:
      - ./logs:/app/logs
    env_file:
      - .env

Start with:

docker-compose -f docker-compose.dev.yaml up

Port Configuration

To change the port, modify the port mapping:

ports:
  - "8080:8000"  # External:Internal

Volume Mounts

Logs are written to the logs/ directory, which is mounted as a volume:

volumes:
  - ./logs:/app/logs  # Host:Container

Model Configuration

Default Settings

Default values are applied when not specified in requests:

  • Temperature: 0.2 (low randomness, near-deterministic output)
  • Max Tokens: 4096

Model Selection

Models are specified in the format provider/model_name:

{
  "model": "openai/gpt-4o"
}

Popular models:

  • OpenAI: openai/gpt-4o, openai/gpt-3.5-turbo
  • Groq: groq/llama-3.3-70b-versatile, groq/mixtral-8x7b-32768
  • Google: google/gemini-pro
  • Local: local/your-model-name
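Splitting the identifier into its two parts can be sketched as follows. The parsing logic here is illustrative, assuming the first slash is the separator; `parse_model` is a hypothetical name, not a function exported by the gateway:

```python
def parse_model(model):
    """Split a "provider/model_name" identifier into its two parts.

    Partitions on the first slash only, so model names that themselves
    contain slashes are preserved intact."""
    provider, _, name = model.partition("/")
    if not provider or not name:
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    return provider, name

parse_model("openai/gpt-4o")  # ("openai", "gpt-4o")
```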

Logging

Log Location

Logs are written to logs/llm.log in the project directory.

View Logs

# Real-time log monitoring
tail -f logs/llm.log

# View recent logs
tail -n 100 logs/llm.log

Log Format

Logs include:

  • Timestamp
  • Log level (INFO, ERROR, etc.)
  • Component name
  • Message

Example:

2024-02-13 10:30:45 INFO [llm_service] LLMService initialized
2024-02-13 10:30:50 ERROR [llm_service] ModelHTTPError during run: Invalid API key
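A formatter producing this layout can be reproduced with the standard logging module. The handler wiring below is an illustration only (the service writes to logs/llm.log via a file handler), and is not its actual logging configuration:

```python
import logging

# Formatter reproducing the documented layout:
# 2024-02-13 10:30:45 INFO [llm_service] LLMService initialized
formatter = logging.Formatter(
    fmt="%(asctime)s %(levelname)s [%(name)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
handler = logging.StreamHandler()  # the service uses a file handler instead
handler.setFormatter(formatter)
logger = logging.getLogger("llm_service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("LLMService initialized")
```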

Performance Tuning

Agent Caching

The service uses LRU caching for agent instances (max 32 cached agents). Agents are cached based on:

  • Provider
  • Model name
  • System prompt
  • Temperature
  • Max tokens
  • Output type

This improves performance for repeated requests with the same configuration.
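The caching behavior can be sketched with `functools.lru_cache`. The factory below is a hypothetical stand-in for the service's agent constructor (the dict takes the place of a real agent instance); the six-field cache key and the size of 32 follow the description above:

```python
from functools import lru_cache

@lru_cache(maxsize=32)  # matches the documented cache size
def get_agent(provider, model_name, system_prompt,
              temperature, max_tokens, output_type):
    """Hypothetical agent factory. All six fields form the cache key,
    so each must be hashable."""
    return {"provider": provider, "model": model_name,
            "system_prompt": system_prompt, "temperature": temperature,
            "max_tokens": max_tokens, "output_type": output_type}

a = get_agent("openai", "gpt-4o", "You are helpful.", 0.2, 4096, "text")
b = get_agent("openai", "gpt-4o", "You are helpful.", 0.2, 4096, "text")
# a is b: the second call is served from the cache, not rebuilt
```

Changing any one field (for example, the temperature) produces a separate cache entry, which is why repeated requests with an identical configuration are the ones that benefit.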

Retries

All LLM requests automatically retry up to 3 times on failure before returning an error.
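A generic wrapper illustrating this behavior might look like the sketch below. The 3-attempt limit matches the documentation; the linear backoff between attempts is an assumption, not taken from the service's source:

```python
import time

def call_with_retries(fn, max_attempts=3, backoff=0.5):
    """Call fn, retrying up to max_attempts times on any exception.

    Re-raises the last error once all attempts are exhausted. The
    backoff schedule here is illustrative only."""
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < max_attempts:
                time.sleep(backoff * attempt)  # wait before retrying
    raise last_exc
```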

Environment-Specific Configuration

The repository may include additional Docker Compose files for different environments (staging, production). Check the root directory for files like docker-compose.staging.yaml.

Development

docker-compose -f docker-compose.dev.yaml up

Staging (if available)

docker-compose -f docker-compose.staging.yaml up

Security Best Practices

  1. Never commit .env files to version control
  2. Use strong SECRET_TOKEN values in production
  3. Rotate API keys regularly
  4. Restrict network access to the API in production
  5. Monitor logs for unusual activity
  6. Use HTTPS in production deployments
