Cost Management

Usage tracking, budget enforcement, and optimization recommendations

The CostTracker monitors token usage and costs across all LLM operations, enforces budgets with alerts, and provides optimization recommendations.

Creating a Cost Tracker

import (
    "time"

    sdk "github.com/xraph/ai-sdk"
)

// logger and metrics are your application's logger and metrics sink.
tracker := sdk.NewCostTracker(logger, metrics, &sdk.CostTrackerOptions{
    RetentionPeriod: 30 * 24 * time.Hour, // keep 30 days of usage records
    EnableCache:     true,
})

Recording Usage

tracker.RecordUsage(ctx, sdk.UsageRecord{
    Timestamp:    time.Now(),
    Provider:     "openai",
    Model:        "gpt-4",
    Operation:    "chat",
    InputTokens:  150,
    OutputTokens: 300,
    TotalTokens:  450,
    Cost:         0.0225, // 150/1K * $0.03 + 300/1K * $0.06 (see Model Pricing below)
    CacheHit:     false,
    Metadata:     map[string]any{"agent": "research-assistant"},
})
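
If a response is served from a cache, the CacheHit flag feeds the cache hit rate reported under Cost Insights below. A minimal sketch, assuming a cached response is recorded with zero marginal cost (how cache hits should be priced is not specified here):

// Record a cached response so it counts toward the cache hit rate.
tracker.RecordUsage(ctx, sdk.UsageRecord{
    Timestamp: time.Now(),
    Provider:  "openai",
    Model:     "gpt-4",
    Operation: "chat",
    CacheHit:  true, // counted toward CacheHitRate in GetInsights
    Cost:      0,    // assumption: no provider charge for a cache hit
})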

Budget Management

Set spending limits with alert thresholds:

tracker.SetBudget("daily", 10.0, 24*time.Hour, 0.8)
// Name: "daily"
// Limit: $10.00 per period
// Period: 24 hours
// AlertAt: 80% ($8.00)

tracker.SetBudget("monthly", 300.0, 30*24*time.Hour, 0.9)

Cost Insights

Get analytics on spending:

insights := tracker.GetInsights()

fmt.Printf("Total cost: $%.2f\n", insights.TotalCost)
fmt.Printf("Total tokens: %d\n", insights.TotalTokens)
fmt.Printf("Average cost per request: $%.4f\n", insights.AverageCostPerRequest)
fmt.Printf("Most expensive model: %s\n", insights.MostExpensiveModel)
fmt.Printf("Cache hit rate: %.1f%%\n", insights.CacheHitRate*100)

Optimization Recommendations

recommendations := tracker.GetOptimizationRecommendations()

for _, rec := range recommendations {
    fmt.Printf("[%s] %s\n", rec.Priority, rec.Description)
    fmt.Printf("  Potential savings: $%.2f/month\n", rec.EstimatedSavings)
}

Recommendations may include:

  • Switching to a cheaper model for simple tasks
  • Enabling caching for repeated queries
  • Reducing token usage with shorter prompts
  • Using batch operations
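
Building on the loop above, the per-item estimates can be rolled up into a single figure for reporting. A small sketch using only the fields already shown:

// Sum the estimated monthly savings across all recommendations.
var totalSavings float64
for _, rec := range recommendations {
    totalSavings += rec.EstimatedSavings // USD per month, as printed above
}
fmt.Printf("Total potential savings: $%.2f/month\n", totalSavings)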

Model Pricing

The SDK includes default pricing for common models:

var DefaultModelPricing = map[string]ModelPricing{
    "openai/gpt-4":            {InputPer1KTokens: 0.03, OutputPer1KTokens: 0.06},
    "openai/gpt-4-turbo":      {InputPer1KTokens: 0.01, OutputPer1KTokens: 0.03},
    "openai/gpt-3.5-turbo":    {InputPer1KTokens: 0.0005, OutputPer1KTokens: 0.0015},
    "anthropic/claude-3-opus": {InputPer1KTokens: 0.015, OutputPer1KTokens: 0.075},
    // ...
}
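
If you compute the Cost field yourself before calling RecordUsage, the table above supports a simple per-1K-token calculation. A sketch, assuming ModelPricing and DefaultModelPricing are exported by the sdk package and keyed as "provider/model" as shown above:

// estimateCost returns the USD cost for a single request, or 0 if the
// model has no entry in the default pricing table.
func estimateCost(provider, model string, inputTokens, outputTokens int) float64 {
    pricing, ok := sdk.DefaultModelPricing[provider+"/"+model]
    if !ok {
        return 0
    }
    return float64(inputTokens)/1000*pricing.InputPer1KTokens +
        float64(outputTokens)/1000*pricing.OutputPer1KTokens
}

// Example: estimateCost("openai", "gpt-4", 150, 300) => 0.0045 + 0.0180 = $0.0225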

UsageRecord Fields

Field          Type             Description
Timestamp      time.Time        When the usage occurred
Provider       string           LLM provider name
Model          string           Model used
Operation      string           Operation type (chat, completion, embedding)
InputTokens    int              Tokens in the prompt
OutputTokens   int              Tokens in the response
TotalTokens    int              Total tokens consumed
Cost           float64          Computed cost in USD
CacheHit       bool             Whether the result was cached
Metadata       map[string]any   Custom metadata
