AI Extension
Comprehensive AI platform with LLM integration, agents, inference engine, and model management
The AI extension is Forge's most comprehensive extension, providing a complete AI platform with LLM integration, AI agents, a high-performance inference engine, model management, and advanced features such as streaming, monitoring, and smart caching.
The AI extension supports multiple LLM providers, including OpenAI, Anthropic, Azure OpenAI, Ollama, and HuggingFace, through a unified interface.
Features
LLM Integration
- Multiple Providers: OpenAI, Anthropic, Azure OpenAI, Ollama, HuggingFace
- Unified Interface: Consistent API across all providers
- Streaming Support: Real-time response streaming
- Function Calling: Tool use and function execution
- Vision Models: Image understanding and analysis
AI Agents
- Agent Framework: Build autonomous AI agents
- Tool Integration: Connect agents to external tools
- Memory Management: Persistent agent memory
- Multi-Agent Systems: Coordinate multiple agents
- Workflow Orchestration: Complex agent workflows
Inference Engine
- High Performance: Optimized for production workloads
- Model Formats: ONNX, PyTorch, TensorFlow support
- GPU Acceleration: CUDA and Metal support
- Batch Processing: Efficient batch inference
- Model Serving: REST and gRPC endpoints
Model Management
- Model Registry: Centralized model storage
- Version Control: Model versioning and rollback
- A/B Testing: Compare model performance
- Auto-scaling: Dynamic model scaling
- Health Monitoring: Model performance tracking
Installation
go get github.com/xraph/forge/extensions/ai
Configuration
extensions:
  ai:
    # LLM Configuration
    llm:
      default_provider: "openai"
      providers:
        openai:
          api_key: "${OPENAI_API_KEY}"
          model: "gpt-4"
          max_tokens: 4096
          temperature: 0.7
        anthropic:
          api_key: "${ANTHROPIC_API_KEY}"
          model: "claude-3-sonnet-20240229"
          max_tokens: 4096
        azure:
          api_key: "${AZURE_OPENAI_API_KEY}"
          endpoint: "${AZURE_OPENAI_ENDPOINT}"
          deployment: "gpt-4"
        ollama:
          endpoint: "http://localhost:11434"
          model: "llama2"

    # Inference Engine
    inference:
      enabled: true
      gpu_enabled: true
      max_batch_size: 32
      timeout: "30s"
      models_path: "./models"

    # Agents
    agents:
      enabled: true
      max_agents: 100
      memory_backend: "redis"
      tools_enabled: true

    # Caching
    cache:
      enabled: true
      backend: "redis"
      ttl: "1h"
      max_size: "1GB"

    # Monitoring
    monitoring:
      enabled: true
      metrics_interval: "10s"
      log_requests: true
      trace_enabled: true
Environment Variables
# LLM Provider Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://..."
# AI Extension Config
export AI_DEFAULT_PROVIDER="openai"
export AI_INFERENCE_ENABLED="true"
export AI_GPU_ENABLED="true"
export AI_CACHE_ENABLED="true"
export AI_MONITORING_ENABLED="true"
Quick Start
package main
import (
"github.com/xraph/forge"
"github.com/xraph/forge/extensions/ai"
)
func main() {
app := forge.New()
// Configure AI extension
aiExt := ai.New(ai.Config{
LLM: ai.LLMConfig{
DefaultProvider: "openai",
Providers: map[string]ai.ProviderConfig{
"openai": {
APIKey: "sk-...",
Model: "gpt-4",
MaxTokens: 4096,
Temperature: 0.7,
},
},
},
Inference: ai.InferenceConfig{
Enabled: true,
GPUEnabled: true,
MaxBatchSize: 32,
},
Agents: ai.AgentsConfig{
Enabled: true,
MaxAgents: 100,
MemoryBackend: "redis",
},
})
app.RegisterExtension(aiExt)
app.Run()
}
Usage Examples
Basic LLM Usage
func chatHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
var req struct {
Message string `json:"message"`
}
if err := c.Bind(&req); err != nil {
return c.JSON(400, map[string]string{"error": "Invalid request"})
}
response, err := aiClient.Chat(c.Context(), ai.ChatRequest{
Messages: []ai.Message{
{Role: "user", Content: req.Message},
},
Model: "gpt-4",
MaxTokens: 1000,
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, map[string]string{
"response": response.Content,
})
}
Streaming Responses
func streamChatHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
// Set up SSE headers
c.Response().Header().Set("Content-Type", "text/event-stream")
c.Response().Header().Set("Cache-Control", "no-cache")
c.Response().Header().Set("Connection", "keep-alive")
stream, err := aiClient.ChatStream(c.Context(), ai.ChatRequest{
Messages: []ai.Message{
{Role: "user", Content: "Tell me a story"},
},
Model: "gpt-4",
Stream: true,
})
if err != nil {
return err
}
defer stream.Close()
for chunk := range stream.Channel() {
if chunk.Error != nil {
break
}
data := fmt.Sprintf("data: %s\n\n", chunk.Content)
if _, err := c.Response().Write([]byte(data)); err != nil {
break
}
c.Response().Flush()
}
return nil
}
Function Calling
func functionCallHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
// Define available functions
tools := []ai.Tool{
{
Name: "get_weather",
Description: "Get current weather for a location",
Parameters: ai.ToolParameters{
Type: "object",
Properties: map[string]ai.Property{
"location": {
Type: "string",
Description: "City name",
},
},
Required: []string{"location"},
},
Handler: func(args map[string]interface{}) (interface{}, error) {
location := args["location"].(string)
// Call weather API
return map[string]interface{}{
"temperature": 22,
"condition": "sunny",
"location": location,
}, nil
},
},
}
response, err := aiClient.Chat(c.Context(), ai.ChatRequest{
Messages: []ai.Message{
{Role: "user", Content: "What's the weather in San Francisco?"},
},
Tools: tools,
ToolChoice: "auto",
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, response)
}
Vision Models
func visionHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
// Handle file upload
file, err := c.FormFile("image")
if err != nil {
return c.JSON(400, map[string]string{"error": "No image provided"})
}
src, err := file.Open()
if err != nil {
return c.JSON(500, map[string]string{"error": "Failed to open image"})
}
defer src.Close()
// Convert to base64
imageData, err := io.ReadAll(src)
if err != nil {
return c.JSON(500, map[string]string{"error": "Failed to read image"})
}
response, err := aiClient.Chat(c.Context(), ai.ChatRequest{
Messages: []ai.Message{
{
Role: "user",
Content: "What do you see in this image?",
Images: []ai.Image{
{
Data: base64.StdEncoding.EncodeToString(imageData),
MimeType: file.Header.Get("Content-Type"),
},
},
},
},
Model: "gpt-4-vision-preview",
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, map[string]string{
"description": response.Content,
})
}
AI Agents
func createAgentHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
agent, err := aiClient.CreateAgent(c.Context(), ai.AgentConfig{
Name: "customer-support",
Description: "Customer support assistant",
Instructions: `You are a helpful customer support agent.
Be polite, professional, and try to resolve issues quickly.`,
Model: "gpt-4",
Memory: ai.MemoryConfig{
Type: "conversation",
MaxMessages: 50,
},
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(201, map[string]string{
"agent_id": agent.ID,
"status": "created",
})
}
func chatWithAgentHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
agentID := c.Param("agent_id")
var req struct {
Message string `json:"message"`
}
if err := c.Bind(&req); err != nil {
return c.JSON(400, map[string]string{"error": "Invalid request"})
}
response, err := aiClient.ChatWithAgent(c.Context(), agentID, req.Message)
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, map[string]string{
"response": response.Content,
"agent_id": agentID,
})
}
Agents with Tools
func createToolAgentHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
// Define tools for the agent
tools := []ai.Tool{
{
Name: "search_knowledge_base",
Description: "Search the company knowledge base",
Handler: func(args map[string]interface{}) (interface{}, error) {
query := args["query"].(string)
// Search knowledge base
return searchKnowledgeBase(query), nil
},
},
{
Name: "create_ticket",
Description: "Create a support ticket",
Handler: func(args map[string]interface{}) (interface{}, error) {
title := args["title"].(string)
description := args["description"].(string)
// Create ticket in system
return createSupportTicket(title, description), nil
},
},
}
agent, err := aiClient.CreateAgent(c.Context(), ai.AgentConfig{
Name: "support-agent-pro",
Description: "Advanced support agent with tools",
Instructions: `You are an advanced customer support agent with access to tools.
Use the knowledge base to find answers and create tickets when needed.`,
Model: "gpt-4",
Tools: tools,
Memory: ai.MemoryConfig{
Type: "persistent",
Backend: "redis",
},
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(201, agent)
}
Multi-Agent Teams
func createAgentTeamHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
// Create coordinator agent
coordinator, err := aiClient.CreateAgent(c.Context(), ai.AgentConfig{
Name: "coordinator",
Description: "Coordinates other agents",
Instructions: `You coordinate between specialist agents.
Route requests to the appropriate specialist.`,
Model: "gpt-4",
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
// Create specialist agents
techAgent, err := aiClient.CreateAgent(c.Context(), ai.AgentConfig{
Name: "tech-specialist",
Description: "Technical support specialist",
Instructions: "You handle technical issues and troubleshooting.",
Model: "gpt-4",
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
billingAgent, err := aiClient.CreateAgent(c.Context(), ai.AgentConfig{
Name: "billing-specialist",
Description: "Billing and account specialist",
Instructions: "You handle billing questions and account issues.",
Model: "gpt-4",
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
// Create agent team
team, err := aiClient.CreateAgentTeam(c.Context(), ai.AgentTeamConfig{
Name: "support-team",
Coordinator: coordinator.ID,
Agents: []string{techAgent.ID, billingAgent.ID},
Workflow: ai.WorkflowConfig{
Type: "router",
Rules: []ai.RoutingRule{
{
Condition: "contains(message, 'technical') || contains(message, 'bug')",
Target: techAgent.ID,
},
{
Condition: "contains(message, 'billing') || contains(message, 'payment')",
Target: billingAgent.ID,
},
},
},
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(201, team)
}
Inference Engine
func loadModelHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
var req struct {
ModelPath string `json:"model_path"`
ModelType string `json:"model_type"` // "onnx", "pytorch", "tensorflow"
Name string `json:"name"`
}
if err := c.Bind(&req); err != nil {
return c.JSON(400, map[string]string{"error": "Invalid request"})
}
model, err := aiClient.LoadModel(c.Context(), ai.ModelConfig{
Name: req.Name,
Path: req.ModelPath,
Type: req.ModelType,
Device: "gpu", // or "cpu"
BatchSize: 32,
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, map[string]interface{}{
"model_id": model.ID,
"status": "loaded",
"device": model.Device,
})
}
Batch Inference
func batchInferenceHandler(c forge.Context) error {
aiClient := forge.GetAI(c)
modelID := c.Param("model_id")
var req struct {
Inputs []map[string]interface{} `json:"inputs"`
}
if err := c.Bind(&req); err != nil {
return c.JSON(400, map[string]string{"error": "Invalid request"})
}
results, err := aiClient.BatchInference(c.Context(), ai.BatchRequest{
ModelID: modelID,
Inputs: req.Inputs,
Options: ai.InferenceOptions{
Timeout: time.Second * 30,
Priority: "normal",
},
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, map[string]interface{}{
"results": results,
"count": len(results),
})
}
Model Serving
func setupModelServing() {
app := forge.New()
aiClient := forge.GetAI(app)
// Serve model via REST API
app.POST("/models/:model_id/predict", func(c forge.Context) error {
modelID := c.Param("model_id")
var input map[string]interface{}
if err := c.Bind(&input); err != nil {
return c.JSON(400, map[string]string{"error": "Invalid input"})
}
result, err := aiClient.Predict(c.Context(), ai.PredictRequest{
ModelID: modelID,
Input: input,
})
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, result)
})
// Model health check
app.GET("/models/:model_id/health", func(c forge.Context) error {
modelID := c.Param("model_id")
health, err := aiClient.ModelHealth(c.Context(), modelID)
if err != nil {
return c.JSON(500, map[string]string{"error": err.Error()})
}
return c.JSON(200, health)
})
}
Advanced Features
Smart Caching
The AI extension includes intelligent caching to reduce costs and improve performance:
// Get the AI client
aiClient := forge.GetAI(c)
// Cached chat completion
response, err := aiClient.ChatCached(c.Context(), ai.ChatRequest{
Messages: []ai.Message{
{Role: "user", Content: "What is the capital of France?"},
},
CacheOptions: ai.CacheOptions{
TTL: time.Hour,
Key: "geography:france:capital",
},
})
Load Balancing
Distribute requests across multiple providers:
// Configure load balancing
aiExt := ai.New(ai.Config{
LLM: ai.LLMConfig{
LoadBalancing: ai.LoadBalancingConfig{
Strategy: "round_robin", // or "weighted", "least_latency"
Providers: []string{"openai", "anthropic", "azure"},
Weights: map[string]int{
"openai": 50,
"anthropic": 30,
"azure": 20,
},
},
},
})
Monitoring and Observability
// Get AI metrics
app.GET("/ai/metrics", func(c forge.Context) error {
aiClient := forge.GetAI(c)
metrics := aiClient.GetMetrics()
return c.JSON(200, metrics)
})
// AI health check
app.GET("/ai/health", func(c forge.Context) error {
aiClient := forge.GetAI(c)
health := aiClient.Health(c.Context())
return c.JSON(200, health)
})
Best Practices
Performance Optimization
- Use caching for repeated queries
- Implement request batching for inference
- Choose appropriate models for your use case
- Monitor token usage and costs
Security
- Secure API keys with environment variables
- Implement rate limiting for AI endpoints
- Validate and sanitize user inputs (see the sketch after this list)
- Monitor for prompt injection attempts
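The validation and injection-monitoring points above can live in a small helper package in front of your chat handlers. The sketch below is illustrative and not part of the AI extension: the package name, phrase list, and limits are assumptions, and it uses golang.org/x/time/rate for per-user token buckets.
package aisecurity

import (
	"errors"
	"strings"
	"sync"
	"unicode/utf8"

	"golang.org/x/time/rate"
)

const maxPromptLength = 4000 // characters; tune to your token budget

// Phrases that commonly appear in prompt-injection attempts; extend for your domain.
var suspiciousPhrases = []string{
	"ignore previous instructions",
	"disregard your system prompt",
	"reveal your instructions",
}

// ValidatePrompt rejects malformed, oversized, or suspicious user input
// before it is forwarded to an LLM provider.
func ValidatePrompt(prompt string) error {
	if !utf8.ValidString(prompt) {
		return errors.New("prompt is not valid UTF-8")
	}
	if utf8.RuneCountInString(prompt) > maxPromptLength {
		return errors.New("prompt exceeds maximum length")
	}
	lowered := strings.ToLower(prompt)
	for _, phrase := range suspiciousPhrases {
		if strings.Contains(lowered, phrase) {
			return errors.New("prompt flagged as a possible injection attempt")
		}
	}
	return nil
}

// RateLimiter keeps one token bucket per user for AI endpoints.
type RateLimiter struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
	rps      rate.Limit
	burst    int
}

func NewRateLimiter(rps float64, burst int) *RateLimiter {
	return &RateLimiter{
		limiters: make(map[string]*rate.Limiter),
		rps:      rate.Limit(rps),
		burst:    burst,
	}
}

// Allow reports whether userID may issue another AI request right now.
func (r *RateLimiter) Allow(userID string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	l, ok := r.limiters[userID]
	if !ok {
		l = rate.NewLimiter(r.rps, r.burst)
		r.limiters[userID] = l
	}
	return l.Allow()
}
Call ValidatePrompt and Allow in your chat handlers before invoking the LLM, and return 400/429 responses when they fail.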
Cost Management
- Use caching to reduce API calls
- Choose cost-effective models when possible
- Implement usage quotas and limits (see the sketch after this list)
- Monitor spending with built-in metrics
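One way to enforce the quota point above is a per-user daily token budget charged after each request. This is a minimal in-memory sketch under the assumption that a token-usage figure is available from the provider response; the type and field names are illustrative, and a production system would persist the counters in Redis or a database.
package aibudget

import (
	"errors"
	"sync"
	"time"
)

// Budget enforces a per-user daily token quota.
type Budget struct {
	mu      sync.Mutex
	limit   int
	used    map[string]int
	resetAt time.Time
}

func NewBudget(dailyTokens int) *Budget {
	return &Budget{
		limit:   dailyTokens,
		used:    make(map[string]int),
		resetAt: nextMidnightUTC(),
	}
}

// Charge records tokensUsed for userID and fails once the daily quota is
// exhausted. Call it with the usage figure reported by the provider response.
func (b *Budget) Charge(userID string, tokensUsed int) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if time.Now().UTC().After(b.resetAt) {
		b.used = make(map[string]int)
		b.resetAt = nextMidnightUTC()
	}
	if b.used[userID]+tokensUsed > b.limit {
		return errors.New("daily token quota exceeded")
	}
	b.used[userID] += tokensUsed
	return nil
}

func nextMidnightUTC() time.Time {
	now := time.Now().UTC()
	return time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, time.UTC).Add(24 * time.Hour)
}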
Error Handling
- Implement fallback providers (see the sketch after this list)
- Handle rate limits gracefully
- Provide meaningful error messages
- Log errors for debugging
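A hedged sketch of the fallback pattern referenced above. It does not assume any specific per-request provider option in the extension's API; instead each provider is wrapped in a closure (for example around aiClient.Chat configured for OpenAI or Anthropic) and tried in priority order, with a short backoff between attempts so transient rate limits have a chance to clear.
package aifallback

import (
	"context"
	"errors"
	"time"
)

// ChatFunc is any function that performs a chat completion, e.g. a closure
// around aiClient.Chat configured for a single provider.
type ChatFunc func(ctx context.Context) (string, error)

// ChatWithFallback tries each provider in order, pausing between attempts.
func ChatWithFallback(ctx context.Context, backoff time.Duration, providers ...ChatFunc) (string, error) {
	var lastErr error
	for i, call := range providers {
		content, err := call(ctx)
		if err == nil {
			return content, nil
		}
		lastErr = err
		if i == len(providers)-1 {
			break // no more providers to try
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(backoff):
		}
	}
	return "", errors.Join(errors.New("all providers failed"), lastErr)
}
Log lastErr on every failed attempt so provider-specific issues (quota, auth, outages) remain visible even when the fallback succeeds.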
Troubleshooting
Common Issues
API Key Issues
// Check provider configuration
health := aiClient.ProviderHealth(c.Context(), "openai")
if !health.Healthy {
log.Printf("Provider issue: %s", health.Message)
}
Model Loading Failures
// Verify model path and format
models := aiClient.ListModels(c.Context())
for _, model := range models {
log.Printf("Model: %s, Status: %s", model.Name, model.Status)
}
Memory Issues
// Monitor memory usage
stats := aiClient.GetStats()
log.Printf("Memory usage: %d MB", stats.MemoryUsage/1024/1024)
The AI extension requires significant computational resources. Ensure adequate memory and CPU/GPU resources for optimal performance.
Next Steps
1. Setup: Configure your AI providers and test basic functionality
2. Integration: Integrate AI features into your application
3. Optimization: Implement caching and load balancing
4. Monitoring: Set up comprehensive monitoring and alerting
5. Scaling: Scale your AI infrastructure for production