
Session 9 - Module B: Test Solutions

Production Multi-Agent Systems Deployment and Monitoring


Question 1: Metrics Storage Component

Correct Answer: B) metrics_buffer

Explanation: The metrics_buffer is a deque (double-ended queue) with a maximum length of 10,000 entries that stores recently collected metrics for quick access. This circular buffer allows the system to maintain a sliding window of recent metrics while automatically discarding older entries when the buffer reaches capacity. The other components serve different purposes: agent_registry tracks registered agents, alert_handlers manage alert notifications, and collection_tasks store asyncio tasks for metric collection.
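The sliding-window behavior can be seen in a few lines. This is a minimal sketch, assuming the buffer is a plain `collections.deque` created with `maxlen=10_000`; the shape of each metric entry (`seq`, `cpu_percent`) is hypothetical and chosen only for illustration.

```python
from collections import deque

# Assumption: mirrors the buffer described above, capped at 10,000 entries.
metrics_buffer = deque(maxlen=10_000)

# Append five more samples than the buffer can hold.
for i in range(10_005):
    metrics_buffer.append({"seq": i, "cpu_percent": 50.0})

# The deque silently dropped the five oldest samples from the left side.
oldest = metrics_buffer[0]["seq"]
newest = metrics_buffer[-1]["seq"]
print(len(metrics_buffer), oldest, newest)
```

Because eviction happens inside `append`, the collector never needs explicit trimming logic.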


Question 2: HorizontalPodAutoscaler Purpose

Correct Answer: B) To automatically scale agent replicas based on resource utilization

Explanation: The HorizontalPodAutoscaler (HPA) monitors resource metrics like CPU and memory utilization and automatically adjusts the number of pod replicas to maintain target utilization levels. In the implementation, it scales between minReplicas and maxReplicas based on configured thresholds (e.g., 70% CPU, 80% memory). This ensures optimal resource usage and system responsiveness under varying loads. Network routing is handled by Services and Ingress, service discovery by DNS, and authentication by separate security mechanisms.
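The scaling decision follows the rule documented for Kubernetes: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds, with the largest proposal winning when several metrics are configured. The sketch below illustrates that arithmetic only; the function names are hypothetical, and it deliberately omits the tolerance band and stabilization windows a real controller applies.

```python
import math

def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas, max_replicas):
    """Replica count proposed by a single metric, clamped to the HPA bounds."""
    raw = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))

def desired_for_all(current_replicas, metrics, min_replicas, max_replicas):
    """With several metrics, the largest proposed replica count is used."""
    return max(
        desired_replicas(current_replicas, cur, tgt, min_replicas, max_replicas)
        for cur, tgt in metrics
    )

# 4 replicas at 90% CPU (target 70%) and 40% memory (target 80%).
proposal = desired_for_all(4, [(90, 70), (40, 80)], min_replicas=2, max_replicas=10)
print(proposal)
```

Here CPU drives the decision: memory alone would allow scaling down, but the higher CPU proposal takes precedence.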


Question 3: Istio PeerAuthentication Purpose

Correct Answer: C) To enforce encrypted communication between all agent services

Explanation: PeerAuthentication with STRICT mTLS mode enforces mutual Transport Layer Security (mTLS) for service-to-service communication within the multi-agent system. In STRICT mode a workload accepts only mTLS connections and rejects plaintext traffic, so every connection between agent services is encrypted and both endpoints must present valid certificates. This supports a zero-trust posture in which no communication is allowed without proper authentication and encryption, preventing eavesdropping and man-in-the-middle attacks between agents.
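For reference, the policy itself is small. The sketch below builds the manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The namespace name `agents` is a hypothetical placeholder, and the `v1beta1` API version is an assumption that may differ in the course's Istio release.

```python
import json

# Assumption: "agents" is a placeholder namespace, not taken from the course code.
peer_authentication = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "PeerAuthentication",
    "metadata": {"name": "default", "namespace": "agents"},
    # STRICT: workloads in this namespace accept only mutual-TLS connections.
    "spec": {"mtls": {"mode": "STRICT"}},
}

manifest_json = json.dumps(peer_authentication, indent=2)
print(manifest_json)
```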


Question 4: Distributed Tracing Span Definition

Correct Answer: B) A single operation or task performed by an individual agent

Explanation: In distributed tracing, a span represents a single unit of work or operation performed by an agent, such as processing a request, calling another service, or executing a specific function. Each span has timing information (start/end times), metadata, and status. Multiple spans form a trace that tracks a request's journey across multiple agents. The span includes the agent_id, operation name, duration, and any relevant labels or logs that help with debugging and performance analysis.
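A span can be modeled as a small record. The following is a hedged sketch rather than the module's actual tracing class: the `Span` type, its field names beyond `agent_id`, and the agent names in the usage lines are all hypothetical.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Span:
    """One unit of work performed by one agent (illustrative only)."""
    trace_id: str                      # shared by every span in the same trace
    agent_id: str
    operation: str
    parent_span_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start_time: float = field(default_factory=time.monotonic)
    end_time: Optional[float] = None
    status: str = "in_progress"
    labels: Dict[str, str] = field(default_factory=dict)

    def finish(self, status: str = "ok") -> None:
        """Record the end time and final status."""
        self.end_time = time.monotonic()
        self.status = status

    @property
    def duration(self) -> Optional[float]:
        """Elapsed seconds, or None while the span is still open."""
        if self.end_time is None:
            return None
        return self.end_time - self.start_time

# Two spans from different agents form one trace via the shared trace_id.
trace_id = uuid.uuid4().hex
root = Span(trace_id, "planner-agent", "handle_request")
child = Span(trace_id, "search-agent", "lookup", parent_span_id=root.span_id)
child.finish()
root.finish()
```

The `parent_span_id` link is what lets a tracing backend rebuild the request's path across agents.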


Question 5: Kubernetes Service Exposure

Correct Answer: C) Service

Explanation: A Kubernetes Service is the resource that exposes agent pods to other services within the cluster by providing a stable network endpoint and load balancing. The Service uses selectors to identify target pods and creates a virtual IP (ClusterIP) that routes traffic to healthy pod instances. Deployments manage pod lifecycle, ConfigMaps store configuration data, and Secrets store sensitive information, but none of these directly expose network access to pods.
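Selector matching is the core of how a Service finds its pods: a pod is a candidate when its labels contain every key/value pair in the selector, and by default only ready pods receive traffic. The sketch below imitates that rule in plain Python; the pod records and the `select_endpoints` helper are hypothetical and do not call any Kubernetes API.

```python
def select_endpoints(selector, pods):
    """Return IPs of ready pods whose labels include every selector pair."""
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]

# Hypothetical pods for illustration.
pods = [
    {"ip": "10.0.0.1", "ready": True,  "labels": {"app": "agent", "tier": "worker"}},
    {"ip": "10.0.0.2", "ready": False, "labels": {"app": "agent", "tier": "worker"}},
    {"ip": "10.0.0.3", "ready": True,  "labels": {"app": "dashboard"}},
]

endpoints = select_endpoints({"app": "agent"}, pods)
print(endpoints)
```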


Question 6: DestinationRule Outlier Detection

Correct Answer: B) To automatically remove unhealthy agent instances from load balancing

Explanation: Outlier detection in Istio's DestinationRule automatically identifies unhealthy service instances and temporarily ejects them from the load balancing pool. The configuration specifies thresholds such as consecutiveErrors: 5 (eject after 5 consecutive failures) and baseEjectionTime: 30s (the minimum ejection period, which grows each time the same instance is ejected again). Newer Istio releases deprecate consecutiveErrors in favor of consecutive5xxErrors and consecutiveGatewayErrors. This circuit breaker pattern stops requests from being sent to failing instances, so traffic reaches only healthy agents and overall reliability improves.
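The ejection logic can be illustrated with a toy tracker. This is a simplified sketch of the idea, not Istio's or Envoy's implementation: the `OutlierTracker` class is hypothetical, it uses a fixed ejection period, and it ignores settings such as the maximum ejection percentage.

```python
class OutlierTracker:
    """Ejects an instance after N consecutive failures for a fixed period."""

    def __init__(self, consecutive_errors=5, base_ejection_time=30.0):
        self.consecutive_errors = consecutive_errors
        self.base_ejection_time = base_ejection_time
        self._failures = {}        # instance -> current failure streak
        self._ejected_until = {}   # instance -> time when it may return

    def record(self, instance, success, now):
        """Update the failure streak and eject when the threshold is hit."""
        if success:
            self._failures[instance] = 0
            return
        self._failures[instance] = self._failures.get(instance, 0) + 1
        if self._failures[instance] >= self.consecutive_errors:
            self._ejected_until[instance] = now + self.base_ejection_time
            self._failures[instance] = 0

    def healthy(self, instances, now):
        """Instances currently eligible for load balancing."""
        return [i for i in instances if self._ejected_until.get(i, 0.0) <= now]

tracker = OutlierTracker()
instances = ["agent-a", "agent-b"]
for _ in range(5):
    tracker.record("agent-b", success=False, now=0.0)

print(tracker.healthy(instances, now=1.0))   # agent-b is ejected
print(tracker.healthy(instances, now=31.0))  # agent-b is back in the pool
```

A single success resets the streak, which is why the threshold counts consecutive failures and not total ones.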


Question 7: Agent Health Score Memory Impact

Correct Answer: B) The health score is reduced by 25 points

Explanation: According to the _calculate_agent_health_score method, when an agent's average memory usage exceeds 85%, the health score is reduced by 25 points. The algorithm uses tiered penalties: memory usage > 70% reduces score by 10 points, while > 85% reduces it by 25 points (critical memory usage). This graduated penalty system allows for early warning (70-85% range) and critical alerting (>85% range) while maintaining a quantitative health assessment that operations teams can use for decision-making.


Implementation Code References:

# Health scoring memory logic
if memory_metrics:
    avg_memory = sum(memory_metrics) / len(memory_metrics)
    if avg_memory > 85:
        health_score -= 25  # Critical memory usage
    elif avg_memory > 70:
        health_score -= 10  # High memory usage
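
The fragment above depends on variables defined elsewhere in `_calculate_agent_health_score`. To try the tiers in isolation, it can be wrapped in a small function. This is a sketch: the `apply_memory_penalty` name and the starting score of 100 used in the examples are assumptions, not part of the original method.

```python
def apply_memory_penalty(health_score, memory_metrics):
    """Apply the tiered memory penalty to a starting health score."""
    if memory_metrics:
        avg_memory = sum(memory_metrics) / len(memory_metrics)
        if avg_memory > 85:
            health_score -= 25  # Critical memory usage
        elif avg_memory > 70:
            health_score -= 10  # High memory usage
    return health_score

# Assumption: scores start at 100 before penalties are applied.
critical = apply_memory_penalty(100, [90, 88])   # average 89
warning = apply_memory_penalty(100, [75, 80])    # average 77.5
normal = apply_memory_penalty(100, [50])
print(critical, warning, normal)
```

Both comparisons are strict, so an average of exactly 85% falls in the 10-point tier and exactly 70% incurs no penalty.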
