AI Extension

Comprehensive AI platform with LLM integration, agents, inference engine, and model management

The AI extension is Forge's most comprehensive extension: a complete AI platform with LLM integration, AI agents, a high-performance inference engine, model management, and advanced features such as streaming, monitoring, and smart caching.

The AI extension supports multiple LLM providers, including OpenAI, Anthropic, Azure OpenAI, Ollama, and HuggingFace, through a unified interface.

Features

LLM Integration

  • Multiple Providers: OpenAI, Anthropic, Azure OpenAI, Ollama, HuggingFace
  • Unified Interface: Consistent API across all providers
  • Streaming Support: Real-time response streaming
  • Function Calling: Tool use and function execution
  • Vision Models: Image understanding and analysis

AI Agents

  • Agent Framework: Build autonomous AI agents
  • Tool Integration: Connect agents to external tools
  • Memory Management: Persistent agent memory
  • Multi-Agent Systems: Coordinate multiple agents
  • Workflow Orchestration: Complex agent workflows

Inference Engine

  • High Performance: Optimized for production workloads
  • Model Formats: ONNX, PyTorch, TensorFlow support
  • GPU Acceleration: CUDA and Metal support
  • Batch Processing: Efficient batch inference
  • Model Serving: REST and gRPC endpoints

Model Management

  • Model Registry: Centralized model storage
  • Version Control: Model versioning and rollback
  • A/B Testing: Compare model performance
  • Auto-scaling: Dynamic model scaling
  • Health Monitoring: Model performance tracking

Installation

go get github.com/xraph/forge/extensions/ai

Configuration

extensions:
  ai:
    # LLM Configuration
    llm:
      default_provider: "openai"
      providers:
        openai:
          api_key: "${OPENAI_API_KEY}"
          model: "gpt-4"
          max_tokens: 4096
          temperature: 0.7
        anthropic:
          api_key: "${ANTHROPIC_API_KEY}"
          model: "claude-3-sonnet-20240229"
          max_tokens: 4096
        azure:
          api_key: "${AZURE_OPENAI_API_KEY}"
          endpoint: "${AZURE_OPENAI_ENDPOINT}"
          deployment: "gpt-4"
        ollama:
          endpoint: "http://localhost:11434"
          model: "llama2"
    
    # Inference Engine
    inference:
      enabled: true
      gpu_enabled: true
      max_batch_size: 32
      timeout: "30s"
      models_path: "./models"
    
    # Agents
    agents:
      enabled: true
      max_agents: 100
      memory_backend: "redis"
      tools_enabled: true
    
    # Caching
    cache:
      enabled: true
      backend: "redis"
      ttl: "1h"
      max_size: "1GB"
    
    # Monitoring
    monitoring:
      enabled: true
      metrics_interval: "10s"
      log_requests: true
      trace_enabled: true

Environment Variables

# LLM Provider Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://..."

# AI Extension Config
export AI_DEFAULT_PROVIDER="openai"
export AI_INFERENCE_ENABLED="true"
export AI_GPU_ENABLED="true"
export AI_CACHE_ENABLED="true"
export AI_MONITORING_ENABLED="true"

Registering the Extension

package main

import (
    "github.com/xraph/forge"
    "github.com/xraph/forge/extensions/ai"
)

func main() {
    app := forge.New()

    // Configure AI extension
    aiExt := ai.New(ai.Config{
        LLM: ai.LLMConfig{
            DefaultProvider: "openai",
            Providers: map[string]ai.ProviderConfig{
                "openai": {
                    APIKey: "sk-...",
                    Model:  "gpt-4",
                    MaxTokens: 4096,
                    Temperature: 0.7,
                },
            },
        },
        Inference: ai.InferenceConfig{
            Enabled: true,
            GPUEnabled: true,
            MaxBatchSize: 32,
        },
        Agents: ai.AgentsConfig{
            Enabled: true,
            MaxAgents: 100,
            MemoryBackend: "redis",
        },
    })

    app.RegisterExtension(aiExt)
    app.Run()
}

Usage Examples

Basic LLM Usage

func chatHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    var req struct {
        Message string `json:"message"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }
    
    response, err := client.Chat(c.Context(), ai.ChatRequest{
        Messages: []ai.Message{
            {Role: "user", Content: req.Message},
        },
        Model: "gpt-4",
        MaxTokens: 1000,
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, map[string]string{
        "response": response.Content,
    })
}

Streaming Responses

func streamChatHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    // Set up SSE headers
    c.Response().Header().Set("Content-Type", "text/event-stream")
    c.Response().Header().Set("Cache-Control", "no-cache")
    c.Response().Header().Set("Connection", "keep-alive")
    
    stream, err := client.ChatStream(c.Context(), ai.ChatRequest{
        Messages: []ai.Message{
            {Role: "user", Content: "Tell me a story"},
        },
        Model: "gpt-4",
        Stream: true,
    })
    if err != nil {
        return err
    }
    defer stream.Close()
    
    for chunk := range stream.Channel() {
        if chunk.Error != nil {
            break
        }
        
        data := fmt.Sprintf("data: %s\n\n", chunk.Content)
        if _, err := c.Response().Write([]byte(data)); err != nil {
            break
        }
        c.Response().Flush()
    }
    
    return nil
}

Function Calling

func functionCallHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    // Define available functions
    tools := []ai.Tool{
        {
            Name: "get_weather",
            Description: "Get current weather for a location",
            Parameters: ai.ToolParameters{
                Type: "object",
                Properties: map[string]ai.Property{
                    "location": {
                        Type: "string",
                        Description: "City name",
                    },
                },
                Required: []string{"location"},
            },
            Handler: func(args map[string]interface{}) (interface{}, error) {
                location := args["location"].(string)
                // Call weather API
                return map[string]interface{}{
                    "temperature": 22,
                    "condition": "sunny",
                    "location": location,
                }, nil
            },
        },
    }
    
    response, err := client.Chat(c.Context(), ai.ChatRequest{
        Messages: []ai.Message{
            {Role: "user", Content: "What's the weather in San Francisco?"},
        },
        Tools: tools,
        ToolChoice: "auto",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, response)
}

Vision Models

func visionHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    // Handle file upload
    file, err := c.FormFile("image")
    if err != nil {
        return c.JSON(400, map[string]string{"error": "No image provided"})
    }
    
    src, err := file.Open()
    if err != nil {
        return c.JSON(500, map[string]string{"error": "Failed to open image"})
    }
    defer src.Close()
    
    // Read the image data (encoded to base64 below)
    imageData, err := io.ReadAll(src)
    if err != nil {
        return c.JSON(500, map[string]string{"error": "Failed to read image"})
    }
    
    response, err := client.Chat(c.Context(), ai.ChatRequest{
        Messages: []ai.Message{
            {
                Role: "user",
                Content: "What do you see in this image?",
                Images: []ai.Image{
                    {
                        Data: base64.StdEncoding.EncodeToString(imageData),
                        MimeType: file.Header.Get("Content-Type"),
                    },
                },
            },
        },
        Model: "gpt-4-vision-preview",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, map[string]string{
        "description": response.Content,
    })
}

AI Agents

func createAgentHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    agent, err := client.CreateAgent(c.Context(), ai.AgentConfig{
        Name: "customer-support",
        Description: "Customer support assistant",
        Instructions: `You are a helpful customer support agent. 
                      Be polite, professional, and try to resolve issues quickly.`,
        Model: "gpt-4",
        Memory: ai.MemoryConfig{
            Type: "conversation",
            MaxMessages: 50,
        },
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(201, map[string]string{
        "agent_id": agent.ID,
        "status": "created",
    })
}

func chatWithAgentHandler(c forge.Context) error {
    client := forge.GetAI(c)
    agentID := c.Param("agent_id")
    
    var req struct {
        Message string `json:"message"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }
    
    response, err := client.ChatWithAgent(c.Context(), agentID, req.Message)
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, map[string]string{
        "response": response.Content,
        "agent_id": agentID,
    })
}

Agents with Tools

func createToolAgentHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    // Define tools for the agent
    tools := []ai.Tool{
        {
            Name: "search_knowledge_base",
            Description: "Search the company knowledge base",
            Handler: func(args map[string]interface{}) (interface{}, error) {
                query := args["query"].(string)
                // Search knowledge base
                return searchKnowledgeBase(query), nil
            },
        },
        {
            Name: "create_ticket",
            Description: "Create a support ticket",
            Handler: func(args map[string]interface{}) (interface{}, error) {
                title := args["title"].(string)
                description := args["description"].(string)
                // Create ticket in system
                return createSupportTicket(title, description), nil
            },
        },
    }
    
    agent, err := client.CreateAgent(c.Context(), ai.AgentConfig{
        Name: "support-agent-pro",
        Description: "Advanced support agent with tools",
        Instructions: `You are an advanced customer support agent with access to tools.
                      Use the knowledge base to find answers and create tickets when needed.`,
        Model: "gpt-4",
        Tools: tools,
        Memory: ai.MemoryConfig{
            Type: "persistent",
            Backend: "redis",
        },
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(201, agent)
}

Multi-Agent Teams

func createAgentTeamHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    // Create coordinator agent
    coordinator, err := client.CreateAgent(c.Context(), ai.AgentConfig{
        Name: "coordinator",
        Description: "Coordinates other agents",
        Instructions: `You coordinate between specialist agents.
                      Route requests to the appropriate specialist.`,
        Model: "gpt-4",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    // Create specialist agents
    techAgent, err := client.CreateAgent(c.Context(), ai.AgentConfig{
        Name: "tech-specialist",
        Description: "Technical support specialist",
        Instructions: "You handle technical issues and troubleshooting.",
        Model: "gpt-4",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    billingAgent, err := client.CreateAgent(c.Context(), ai.AgentConfig{
        Name: "billing-specialist",
        Description: "Billing and account specialist",
        Instructions: "You handle billing questions and account issues.",
        Model: "gpt-4",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    // Create agent team
    team, err := client.CreateAgentTeam(c.Context(), ai.AgentTeamConfig{
        Name: "support-team",
        Coordinator: coordinator.ID,
        Agents: []string{techAgent.ID, billingAgent.ID},
        Workflow: ai.WorkflowConfig{
            Type: "router",
            Rules: []ai.RoutingRule{
                {
                    Condition: "contains(message, 'technical') || contains(message, 'bug')",
                    Target: techAgent.ID,
                },
                {
                    Condition: "contains(message, 'billing') || contains(message, 'payment')",
                    Target: billingAgent.ID,
                },
            },
        },
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(201, team)
}

Inference Engine

func loadModelHandler(c forge.Context) error {
    client := forge.GetAI(c)
    
    var req struct {
        ModelPath string `json:"model_path"`
        ModelType string `json:"model_type"` // "onnx", "pytorch", "tensorflow"
        Name      string `json:"name"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }
    
    model, err := client.LoadModel(c.Context(), ai.ModelConfig{
        Name: req.Name,
        Path: req.ModelPath,
        Type: req.ModelType,
        Device: "gpu", // or "cpu"
        BatchSize: 32,
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, map[string]interface{}{
        "model_id": model.ID,
        "status": "loaded",
        "device": model.Device,
    })
}

Batch Inference

func batchInferenceHandler(c forge.Context) error {
    client := forge.GetAI(c)
    modelID := c.Param("model_id")
    
    var req struct {
        Inputs []map[string]interface{} `json:"inputs"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }
    
    results, err := client.BatchInference(c.Context(), ai.BatchRequest{
        ModelID: modelID,
        Inputs:  req.Inputs,
        Options: ai.InferenceOptions{
            Timeout: time.Second * 30,
            Priority: "normal",
        },
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    
    return c.JSON(200, map[string]interface{}{
        "results": results,
        "count": len(results),
    })
}

Model Serving

func setupModelServing() {
    app := forge.New()
    client := forge.GetAI(app)
    
    // Serve model via REST API
    app.POST("/models/:model_id/predict", func(c forge.Context) error {
        modelID := c.Param("model_id")
        
        var input map[string]interface{}
        if err := c.Bind(&input); err != nil {
            return c.JSON(400, map[string]string{"error": "Invalid input"})
        }
        
        result, err := client.Predict(c.Context(), ai.PredictRequest{
            ModelID: modelID,
            Input:   input,
        })
        if err != nil {
            return c.JSON(500, map[string]string{"error": err.Error()})
        }
        
        return c.JSON(200, result)
    })
    
    // Model health check
    app.GET("/models/:model_id/health", func(c forge.Context) error {
        modelID := c.Param("model_id")
        
        health, err := client.ModelHealth(c.Context(), modelID)
        if err != nil {
            return c.JSON(500, map[string]string{"error": err.Error()})
        }
        
        return c.JSON(200, health)
    })
}

Advanced Features

Smart Caching

The AI extension includes intelligent caching to reduce costs and improve performance:

// Configure caching
client := forge.GetAI(c)

// Cached chat completion
response, err := client.ChatCached(c.Context(), ai.ChatRequest{
    Messages: []ai.Message{
        {Role: "user", Content: "What is the capital of France?"},
    },
    CacheOptions: ai.CacheOptions{
        TTL: time.Hour,
        Key: "geography:france:capital",
    },
})

Load Balancing

Distribute requests across multiple providers:

// Configure load balancing
aiExt := ai.New(ai.Config{
    LLM: ai.LLMConfig{
        LoadBalancing: ai.LoadBalancingConfig{
            Strategy: "round_robin", // or "weighted", "least_latency"
            Providers: []string{"openai", "anthropic", "azure"},
            Weights: map[string]int{
                "openai": 50,
                "anthropic": 30,
                "azure": 20,
            },
        },
    },
})

Monitoring and Observability

// Get AI metrics
app.GET("/ai/metrics", func(c forge.Context) error {
    ai := forge.GetAI(c)
    
    metrics := ai.GetMetrics()
    return c.JSON(200, metrics)
})

// AI health check
app.GET("/ai/health", func(c forge.Context) error {
    ai := forge.GetAI(c)
    
    health := ai.Health(c.Context())
    return c.JSON(200, health)
})

Best Practices

Performance Optimization

  • Use caching for repeated queries
  • Implement request batching for inference
  • Choose appropriate models for your use case (see the routing sketch after this list)
  • Monitor token usage and costs
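
As an example of model selection, the sketch below routes short, simple prompts to a cheaper model and reserves a larger model for everything else. It is a minimal sketch: the pickModel helper, the length heuristic, and the model names are illustrative assumptions, and only the Chat API shown earlier is used.

// pickModel routes short, simple prompts to a cheaper model and reserves the
// larger model for complex requests. The threshold and model names are
// illustrative assumptions, not Forge defaults.
func pickModel(prompt string) string {
    if len(prompt) < 200 && !strings.Contains(prompt, "\n") {
        return "gpt-3.5-turbo" // cheaper and faster for simple prompts
    }
    return "gpt-4" // higher quality for complex prompts
}

func routedChatHandler(c forge.Context) error {
    client := forge.GetAI(c)

    var req struct {
        Message string `json:"message"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }

    response, err := client.Chat(c.Context(), ai.ChatRequest{
        Messages:  []ai.Message{{Role: "user", Content: req.Message}},
        Model:     pickModel(req.Message),
        MaxTokens: 500, // cap output tokens to keep per-request cost bounded
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    return c.JSON(200, map[string]string{"response": response.Content})
}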

Security

  • Secure API keys with environment variables
  • Implement rate limiting for AI endpoints (see the sketch after this list)
  • Validate and sanitize user inputs
  • Monitor for prompt injection attempts
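
A minimal sketch of rate limiting plus basic input validation for an AI endpoint, assuming golang.org/x/time/rate for the token bucket; the limits, prompt-length cap, and handler name are illustrative rather than part of the extension:

// aiLimiter is a global token bucket for AI endpoints: 5 requests per second
// with a burst of 10. Tune the numbers, or key a limiter per user, for your
// workload.
var aiLimiter = rate.NewLimiter(rate.Limit(5), 10)

const maxPromptLength = 4000 // reject oversized prompts before they reach the LLM

func rateLimitedChatHandler(c forge.Context) error {
    if !aiLimiter.Allow() {
        return c.JSON(429, map[string]string{"error": "Rate limit exceeded, try again later"})
    }

    var req struct {
        Message string `json:"message"`
    }
    if err := c.Bind(&req); err != nil || req.Message == "" {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }
    if len(req.Message) > maxPromptLength {
        return c.JSON(400, map[string]string{"error": "Message too long"})
    }

    client := forge.GetAI(c)
    response, err := client.Chat(c.Context(), ai.ChatRequest{
        Messages: []ai.Message{{Role: "user", Content: req.Message}},
        Model:    "gpt-4",
    })
    if err != nil {
        return c.JSON(500, map[string]string{"error": err.Error()})
    }
    return c.JSON(200, map[string]string{"response": response.Content})
}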

Cost Management

  • Use caching to reduce API calls
  • Choose cost-effective models when possible
  • Implement usage quotas and limits (see the quota sketch after this list)
  • Monitor spending with built-in metrics
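
A minimal in-memory sketch of a per-user daily quota; in production you would back it with Redis or the cache backend configured above, and the type and limits here are assumptions for illustration:

// quotaTracker enforces a simple per-user daily request quota.
type quotaTracker struct {
    mu     sync.Mutex
    day    string
    counts map[string]int
    limit  int
}

func newQuotaTracker(limit int) *quotaTracker {
    return &quotaTracker{counts: make(map[string]int), limit: limit}
}

// Allow reports whether the user still has quota left today and records the request.
func (q *quotaTracker) Allow(userID string) bool {
    q.mu.Lock()
    defer q.mu.Unlock()

    today := time.Now().Format("2006-01-02")
    if q.day != today { // new day: reset all counters
        q.day = today
        q.counts = make(map[string]int)
    }
    if q.counts[userID] >= q.limit {
        return false
    }
    q.counts[userID]++
    return true
}

Call Allow(userID) in your handlers before invoking the LLM and return 429 when it reports the quota is exhausted.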

Error Handling

  • Implement fallback providers (see the fallback sketch after this list)
  • Handle rate limits gracefully
  • Provide meaningful error messages
  • Log errors for debugging
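
A sketch of provider fallback with a simple retry delay, assuming the fallback can be expressed as a second ChatRequest with a different model (which also switches providers if your configuration maps those models to different providers):

func resilientChatHandler(c forge.Context) error {
    client := forge.GetAI(c)

    var req struct {
        Message string `json:"message"`
    }
    if err := c.Bind(&req); err != nil {
        return c.JSON(400, map[string]string{"error": "Invalid request"})
    }

    messages := []ai.Message{{Role: "user", Content: req.Message}}

    // Primary request first, then a fallback; the second model is an illustrative choice.
    attempts := []ai.ChatRequest{
        {Messages: messages, Model: "gpt-4"},
        {Messages: messages, Model: "claude-3-sonnet-20240229"},
    }

    for i, attempt := range attempts {
        if i > 0 {
            time.Sleep(time.Second) // brief pause before the fallback; tune for provider rate limits
        }
        response, err := client.Chat(c.Context(), attempt)
        if err == nil {
            return c.JSON(200, map[string]string{"response": response.Content})
        }
        log.Printf("chat attempt %d failed: %v", i+1, err) // keep failures in the logs for debugging
    }

    // Return a meaningful message instead of surfacing the raw provider error.
    return c.JSON(503, map[string]string{"error": "AI service is temporarily unavailable, please try again"})
}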

Troubleshooting

Common Issues

API Key Issues

// Check provider configuration
health := ai.ProviderHealth(c.Context(), "openai")
if !health.Healthy {
    log.Printf("Provider issue: %s", health.Message)
}

Model Loading Failures

// Verify model path and format
models := ai.ListModels(c.Context())
for _, model := range models {
    log.Printf("Model: %s, Status: %s", model.Name, model.Status)
}

Memory Issues

// Monitor memory usage
stats := ai.GetStats()
log.Printf("Memory usage: %d MB", stats.MemoryUsage/1024/1024)

The AI extension requires significant computational resources. Ensure adequate memory and CPU/GPU resources for optimal performance.

Next Steps

Setup: Configure your AI providers and test basic functionality

Integration: Integrate AI features into your application

Optimization: Implement caching and load balancing

Monitoring: Set up comprehensive monitoring and alerting

Scaling: Scale your AI infrastructure for production
