Session 8: Agno Production-Ready Agents¶
Learning Navigation Hub¶
Total Time Investment: 85 minutes (Core) + 30-230 minutes (Optional) Your Learning Path: Choose your engagement level
Quick Start Guide¶
- Observer (50 min): Read production concepts + examine architecture patterns
- Participant (85 min): Follow deployment exercises + implement monitoring
- Implementer (150 min): Build production systems + explore enterprise patterns
Session Overview Dashboard¶
Core Learning Track (85 minutes) - REQUIRED¶
Section | Concept Load | Time | Skills |
---|---|---|---|
Production Architecture | 3 concepts | 25 min | Understanding |
Essential Production Patterns | 4 concepts | 30 min | Implementation |
Deployment Fundamentals | 4 concepts | 25 min | Application |
✅ Production Validation | 3 concepts | 5 min | Verification |
Optional Deep Dive Modules (Choose Your Adventure)¶
- 🔬 Module A: Advanced Monitoring & Observability (65 min) - Comprehensive monitoring systems
- 🏭 Module B: Enterprise Scaling & Architecture (70 min) - Kubernetes & auto-scaling
- 🔧 Module C: Performance Optimization (50 min) - Cost management & caching
- 🛡️ Module D: Security & Compliance (45 min) - Enterprise security patterns
Code Files: All examples use files in src/session8/
Quick Start: Run cd src/session8 && python agno_foundation.py
to see Agno in action
Core Section (Required - 85 minutes)¶
Part 1: Production Architecture & Mindset (25 minutes)¶
Cognitive Load: 3 new concepts Learning Mode: Conceptual Understanding
Production vs Development Mindset (10 minutes)¶
Agno emphasizes production-first thinking from the start. These imports provide built-in monitoring, persistent storage, and enterprise-grade tools that distinguish Agno from development-focused frameworks:
File: src/session8/agno_foundation.py
- Core Agno implementation and setup
# Essential Agno imports for production
from agno import Agent, Workflow
from agno.storage import PostgresStorage
from agno.monitoring import PrometheusExporter
from agno.tools import DuckDuckGo, FileTools
Key Mindset Shift: | Development Focus | Production Focus | |-------------------|------------------|
| Speed of iteration | Reliability & stability | | Feature completeness | Performance & efficiency | | Local testing | Distributed monitoring | | Manual intervention OK | Automated recovery required |
Agno Framework Overview (8 minutes)¶
Agno's production-ready architecture:
This diagram showcases Agno's production-first architecture, highlighting built-in monitoring, persistent storage, and reliability features. The architecture emphasizes operational excellence with integrated telemetry, structured logging, and automated recovery mechanisms.
This basic Agno agent setup demonstrates how production features like persistent storage and monitoring are built into the framework from the ground up:
File: src/session8/agno_foundation.py
- Basic production agent setup
# Basic Agno agent with production features
from agno import Agent
from agno.storage import PostgresStorage
# Agent with persistent storage
storage = PostgresStorage(
host="localhost",
db="production_agents",
table="agent_sessions"
)
production_agent = Agent(
name="ProductionAssistant",
model="gpt-4",
storage=storage,
monitoring=True, # Built-in metrics
debug_mode=False # Production optimized
)
Agno's Production Advantages: 1. Built-in Storage: PostgreSQL integration out of the box 2. Monitoring Ready: Prometheus metrics included 3. Multi-Provider: 23+ LLM providers supported 4. Docker Ready: Production deployment containers
Enterprise Agent Architecture (7 minutes)¶
Building scalable agent systems:
# Enterprise agent configuration
class ProductionConfig:
# Model configuration
PRIMARY_MODEL = "gpt-4"
FALLBACK_MODEL = "gpt-3.5-turbo"
# Performance settings
MAX_RETRIES = 3
TIMEOUT_SECONDS = 30
CONCURRENT_REQUESTS = 10
# Storage configuration
DATABASE_URL = "postgresql://user:pass@localhost:5432/agents"
# Monitoring
ENABLE_METRICS = True
METRICS_PORT = 8080
# Create production-ready agent
def create_enterprise_agent():
return Agent(
name="EnterpriseAgent",
model=ProductionConfig.PRIMARY_MODEL,
storage=PostgresStorage(ProductionConfig.DATABASE_URL),
tools=[DuckDuckGo(), FileTools()],
show_tool_calls=False, # Clean logs for production
monitoring=ProductionConfig.ENABLE_METRICS
)
Part 2: Essential Production Patterns (30 minutes)¶
Cognitive Load: 4 new concepts Learning Mode: Implementation & Practice
Monitoring & Observability (10 minutes)¶
Built-in monitoring for production systems:
This visualization demonstrates Agno's comprehensive telemetry and debugging capabilities, showing real-time metrics collection, distributed tracing, error tracking, and performance monitoring dashboards that enable effective production operations and troubleshooting.
Setting up comprehensive monitoring involves configuring structured logging, Prometheus metrics, and agent-level monitoring. This configuration ensures you can track performance, errors, and usage patterns in production:
File: src/session8/structured_outputs.py
- Monitoring and telemetry setup
from agno.monitoring import PrometheusExporter
import logging
# Set up production logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# Agent with comprehensive monitoring
monitored_agent = Agent(
name="MonitoredAgent",
model="gpt-4",
monitoring=True,
tools=[DuckDuckGo()]
)
# Custom monitoring setup
prometheus_exporter = PrometheusExporter(
agent=monitored_agent,
port=8080,
metrics=[
"request_count",
"response_time",
"error_rate",
"token_usage"
]
)
Error Handling & Recovery (8 minutes)¶
Production-grade error handling requires implementing retry mechanisms with exponential backoff, proper exception handling, and graceful degradation. This pattern ensures your agents can handle transient failures and maintain service availability:
File: src/session8/performance_resilience.py
- Error handling and recovery patterns
import asyncio
from typing import Optional
class RobustAgentWrapper:
def __init__(self, agent: Agent, max_retries: int = 3):
self.agent = agent
self.max_retries = max_retries
async def run_with_retry(self, message: str) -> Optional[str]:
"""Execute agent with exponential backoff retry."""
for attempt in range(self.max_retries):
try:
response = await self.agent.arun(message)
return response.content
except Exception as e:
wait_time = 2 ** attempt
logging.warning(f"Attempt {attempt + 1} failed: {e}")
if attempt == self.max_retries - 1:
logging.error(f"All retries failed for: {message}")
return None
await asyncio.sleep(wait_time)
return None
# Usage
robust_agent = RobustAgentWrapper(monitored_agent)
result = await robust_agent.run_with_retry("Analyze market trends")
Resource Management (7 minutes)¶
Managing resources in production involves controlling memory usage, managing database connections, and implementing session limits. This resource manager pattern helps prevent resource exhaustion and ensures consistent performance:
File: src/session8/performance_resilience.py
- Resource management patterns
from agno.storage import PostgresStorage
from contextlib import asynccontextmanager
class ResourceManager:
def __init__(self, max_sessions: int = 100):
self.max_sessions = max_sessions
self.active_sessions = {}
self.storage = PostgresStorage()
@asynccontextmanager
async def get_agent_session(self, session_id: str):
"""Context manager for agent sessions."""
if len(self.active_sessions) >= self.max_sessions:
raise Exception("Maximum sessions reached")
try:
# Create agent for session
agent = Agent(
name=f"Agent_{session_id}",
model="gpt-4",
storage=self.storage,
session_id=session_id
)
self.active_sessions[session_id] = agent
yield agent
finally:
# Cleanup session
if session_id in self.active_sessions:
del self.active_sessions[session_id]
# Save session data
await self.storage.save_session(session_id)
# Usage
resource_manager = ResourceManager()
async with resource_manager.get_agent_session("user_123") as agent:
response = await agent.arun("Help with task")
print(response.content)
Performance Optimization (5 minutes)¶
Optimizing for production performance involves implementing intelligent caching, connection pooling, and response optimization. This performance-optimized agent demonstrates how to reduce latency and improve throughput:
File: src/session8/performance_resilience.py
- Performance optimization patterns
from agno.cache import RedisCache
# Production performance configuration
class PerformanceOptimizedAgent:
def __init__(self):
self.cache = RedisCache(
host="localhost",
port=6379,
ttl=3600 # 1 hour cache
)
self.agent = Agent(
name="OptimizedAgent",
model="gpt-4",
# Performance settings
temperature=0.7,
max_tokens=1000,
# Caching
cache=self.cache,
# Connection pooling
max_connections=20
)
async def run_cached(self, message: str) -> str:
"""Run with intelligent caching."""
cache_key = f"agent_response_{hash(message)}"
# Check cache first
cached_response = await self.cache.get(cache_key)
if cached_response:
return cached_response
# Generate and cache response
response = await self.agent.arun(message)
await self.cache.set(cache_key, response.content)
return response.content
# Usage
optimized_agent = PerformanceOptimizedAgent()
result = await optimized_agent.run_cached("Common query")
Part 3: Deployment Fundamentals (25 minutes)¶
Cognitive Load: 4 new concepts Learning Mode: Application & Deployment
Docker Deployment (8 minutes)¶
Containerizing Agno applications requires careful configuration of the container environment, dependencies, and health checks. This Dockerfile demonstrates production-ready containerization with proper security and monitoring:
File: src/session8/Dockerfile
- Production container configuration
# Dockerfile for Agno production deployment
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Environment variables
ENV PYTHONPATH=/app
ENV AGNO_ENV=production
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD python -c "import requests; requests.get('http://localhost:8000/health')"
# Run application
EXPOSE 8000
CMD ["python", "main.py"]
This production server implementation shows how to expose Agno agents through a REST API with proper error handling, health checks, and request validation:
File: src/session8/agno_foundation.py
- Production API server
# main.py - Production server
from fastapi import FastAPI, HTTPException
from agno import Agent
from pydantic import BaseModel
import uvicorn
app = FastAPI(title="Agno Production API")
# Initialize agent
production_agent = Agent(
name="ProductionAPI",
model="gpt-4",
storage=PostgresStorage(),
monitoring=True
)
class QueryRequest(BaseModel):
message: str
session_id: str
@app.post("/query")
async def query_agent(request: QueryRequest):
try:
response = await production_agent.arun(
request.message,
session_id=request.session_id
)
return {"response": response.content}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
@app.get("/health")
async def health_check():
return {"status": "healthy", "service": "agno-api"}
if __name__ == "__main__":
uvicorn.run(app, host="0.0.0.0", port=8000)
Basic Scaling (7 minutes)¶
Simple scaling strategies involve load balancing, service replication, and shared resources like databases and caches. This Docker Compose configuration demonstrates horizontal scaling with multiple service instances:
File: src/session8/docker-compose.yml
- Scaling configuration
# docker-compose.yml for basic scaling
version: '3.8'
services:
agno-api:
build: .
ports:
- "8000-8002:8000"
environment:
- DATABASE_URL=postgresql://user:pass@postgres:5432/agno
- REDIS_URL=redis://redis:6379
depends_on:
- postgres
- redis
deploy:
replicas: 3
postgres:
image: postgres:13
environment:
- POSTGRES_DB=agno
- POSTGRES_USER=user
- POSTGRES_PASSWORD=pass
volumes:
- postgres_data:/var/lib/postgresql/data
redis:
image: redis:alpine
nginx:
image: nginx:alpine
ports:
- "80:80"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf
depends_on:
- agno-api
volumes:
postgres_data:
Health Monitoring (5 minutes)¶
Production health checks must verify all system dependencies including database connectivity, cache availability, and agent responsiveness. This comprehensive health checker provides detailed system status information:
File: src/session8/team_coordination.py
- Health monitoring and system checks
from agno.monitoring import HealthChecker
import asyncio
class ProductionHealthChecker:
def __init__(self, agent: Agent):
self.agent = agent
self.health_checker = HealthChecker(agent)
async def comprehensive_health_check(self):
"""Comprehensive system health verification."""
checks = {
"agent_responsive": False,
"database_connected": False,
"cache_available": False,
"memory_usage": "unknown"
}
try:
# Test agent responsiveness
response = await self.agent.arun("health check", timeout=5)
checks["agent_responsive"] = bool(response.content)
# Test database connection
if hasattr(self.agent.storage, 'test_connection'):
checks["database_connected"] = await self.agent.storage.test_connection()
# Test cache availability
if hasattr(self.agent, 'cache'):
checks["cache_available"] = await self.agent.cache.ping()
# Check memory usage
import psutil
memory = psutil.virtual_memory()
checks["memory_usage"] = f"{memory.percent}%"
except Exception as e:
checks["error"] = str(e)
return checks
# Usage in FastAPI
@app.get("/health/detailed")
async def detailed_health():
health_checker = ProductionHealthChecker(production_agent)
return await health_checker.comprehensive_health_check()
Security Essentials (5 minutes)¶
Basic production security:
from fastapi import FastAPI, HTTPException, Depends
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
import jwt
app = FastAPI()
security = HTTPBearer()
def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
"""Verify JWT token for API access."""
try:
payload = jwt.decode(
credentials.credentials,
"your-secret-key",
algorithms=["HS256"]
)
return payload
except jwt.ExpiredSignatureError:
raise HTTPException(status_code=401, detail="Token expired")
except jwt.InvalidTokenError:
raise HTTPException(status_code=401, detail="Invalid token")
@app.post("/secure-query")
async def secure_query(
request: QueryRequest,
user_info = Depends(verify_token)
):
# Log user access
logging.info(f"User {user_info.get('user_id')} made query")
# Rate limiting (simplified)
# In production, use Redis-based rate limiting
response = await production_agent.arun(request.message)
return {"response": response.content, "user": user_info["user_id"]}
Part 4: Production Validation (5 minutes)¶
Cognitive Load: 3 new concepts Learning Mode: Verification & Checklist
Production Readiness Checklist (5 minutes)¶
Essential verification steps:
class ProductionReadinessChecker:
def __init__(self, agent: Agent):
self.agent = agent
async def validate_production_readiness(self):
"""Comprehensive production readiness assessment."""
checklist = {
"✅ Configuration": {
"environment_variables": self._check_env_vars(),
"database_configured": self._check_database(),
"monitoring_enabled": self._check_monitoring()
},
"✅ Performance": {
"response_time": await self._check_response_time(),
"concurrent_handling": await self._check_concurrency(),
"resource_limits": self._check_resource_limits()
},
"✅ Reliability": {
"error_handling": self._check_error_handling(),
"retry_logic": self._check_retry_logic(),
"graceful_degradation": self._check_degradation()
},
"✅ Security": {
"authentication": self._check_auth(),
"input_validation": self._check_validation(),
"rate_limiting": self._check_rate_limits()
}
}
return checklist
def _check_env_vars(self) -> bool:
"""Check required environment variables."""
required_vars = ["DATABASE_URL", "API_KEY", "SECRET_KEY"]
import os
return all(os.getenv(var) for var in required_vars)
async def _check_response_time(self) -> str:
"""Measure average response time."""
import time
start = time.time()
await self.agent.arun("test query")
duration = time.time() - start
return f"{duration:.2f}s"
# Quick validation
checker = ProductionReadinessChecker(production_agent)
readiness = await checker.validate_production_readiness()
Core Section Validation (5 minutes)¶
Quick Implementation Exercise¶
Deploy a basic production Agno agent:
# Challenge: Create a production-ready agent service
def create_production_service():
# 1. Set up agent with monitoring
# 2. Add error handling
# 3. Configure health checks
# 4. Add basic security
pass
# Test your implementation
service = create_production_service()
health = await service.check_health()
Self-Assessment Checklist¶
- I understand production vs development mindset
- I can set up monitoring and error handling
- I can deploy agents with Docker and scaling
- I understand production security basics
- I'm ready to explore optional modules or move to next session
Next Session Prerequisites: ✅ Core Section Complete Recommended: Explore advanced modules for enterprise deployment
Choose Your Next Path:¶
- Module A: Advanced Monitoring & Observability → (65 min) - Comprehensive monitoring systems
- Module B: Enterprise Scaling & Architecture → (70 min) - Kubernetes & auto-scaling
- 🔧 Module C: Performance Optimization → (50 min) - Cost management & caching
- 🛡️ Module D: Security & Compliance → (45 min) - Enterprise security patterns
Complete Learning Path Options¶
Production Focus: Core → Module A → Module B
Performance Focus: Core → Module C → Module A
Security Focus: Core → Module D → Module A
Enterprise Complete: Core → All Modules
LEARNING REINFORCEMENT¶
Spaced Repetition Schedule¶
- Day 1: Complete core production concepts ✅
- Day 3: Review monitoring patterns (10 min)
- Week 1: Deploy a test production system (45 min)
- Week 2: Implement full monitoring stack (60 min)
Cross-Session Integration¶
Connections to Other Sessions: - Session 6: Build on atomic agents for production - Session 9: Multi-agent patterns in production - Session 10: Enterprise integration patterns
Knowledge Application Projects¶
- Simple: Deploy a monitored agent with Docker
- Intermediate: Create auto-scaling agent service
- Advanced: Build enterprise multi-agent platform
Progress Tracking¶
Completion Status¶
- Core Section (85 min) - Essential for next session
- Module A: Monitoring & Observability (65 min)
- Module B: Enterprise Scaling (70 min)
- Module C: Performance Optimization (50 min)
- Module D: Security & Compliance (45 min)
Time Investment Tracking¶
- Minimum Path: 85 minutes (Core only)
- Production Path: 150 minutes (Core + Module A)
- Enterprise Path: 155 minutes (Core + Module B)
- Comprehensive Path: 315 minutes (Core + all modules)
Next Steps¶
- To Session 9: Complete Core Section
- For Production Deployment: Add Module A
- For Enterprise Scale: Add Module B
- For Security Focus: Add Module D
Multiple Choice Test - Session 8 (15 minutes)¶
Test your understanding of Agno production-ready agent systems.
Question 1¶
What is Agno's primary advantage over other agent frameworks?
A) Simplest learning curve
B) Performance optimization and production focus
C) Best documentation
D) Largest community
Question 2¶
How does Agno achieve better performance than traditional frameworks?
A) Better algorithms
B) Optimized memory usage and faster agent instantiation
C) More CPU cores
D) Cloud-only deployment
Question 3¶
What is the purpose of circuit breaker patterns in production agents?
A) Improve performance
B) Prevent cascading failures in distributed systems
C) Reduce costs
D) Simplify deployment
Question 4¶
How should you handle API rate limits in production agent systems?
A) Ignore them
B) Exponential backoff and jitter strategies
C) Faster requests
D) Multiple API keys
Question 5¶
What makes a health check endpoint effective?
A) Fast response time only
B) Comprehensive dependency and resource checks
C) Simple HTTP 200 response
D) Authentication requirements
Question 6¶
Which monitoring approach is most suitable for production agents?
A) Log files only
B) Comprehensive metrics with alerting and observability
C) Manual monitoring
D) Error counts only
Question 7¶
How should production agent configurations be managed?
A) Hard-coded values
B) Environment variables and external config management
C) Database storage
D) Code comments
Question 8¶
What is the most important aspect of production error handling?
A) Hiding errors from users
B) Graceful degradation with meaningful error responses
C) Immediate system restart
D) Detailed error messages to all users
Question 9¶
How should you approach scaling production agent systems?
A) Vertical scaling only
B) Horizontal scaling with load balancing and auto-scaling
C) Manual scaling
D) Single instance deployment
Question 10¶
What security measures are essential for production agents?
A) Password protection only
B) Authentication, authorization, encryption, and audit logging
C) Network isolation only
D) No security needed
Navigation¶
Previous: Session 7 - First ADK Agent
Optional Deep Dive Modules:
- 🔬 Module A: Advanced Monitoring & Observability - Production monitoring
- 🏭 Module B: Enterprise Scaling Architecture - Scaling patterns
- ⚡ Module C: Performance Optimization - Optimization techniques
- 🔒 Module D: Security & Compliance - Security patterns
Next: Session 9 - Multi-Agent Patterns →