Health and readiness
/health - Liveness check for the process.
/readyz - Dependency readiness for auth, public URL, signing, storage, providers, WebRTC, SIP, and FreeSWITCH mode.
/setup/requirements - Non-secret provisioning checklist for the dashboard and operator setup flows.
Metrics and SLOs
Metrics should answer whether the runtime is meeting its voice and webhook objectives before a customer hears the failure.
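As a rough illustration of checking objectives before customers notice, the sketch below compares a snapshot against target values. The field names (`voice_pipeline_p95_ms`, `webhook_success_rate`, `api_p95_ms`) and the thresholds are assumptions made for this example; the actual /slo response shape is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objectives:
    # Hypothetical targets; choose values that match your own SLOs.
    voice_pipeline_p95_ms: float = 1500.0
    webhook_success_rate: float = 0.99
    api_p95_ms: float = 500.0


def slo_breaches(snapshot: dict, objectives: Objectives = Objectives()) -> list[str]:
    """Return the names of the objectives that the snapshot violates."""
    breaches = []
    # Latency objectives are breached when the observed p95 is above target.
    if snapshot["voice_pipeline_p95_ms"] > objectives.voice_pipeline_p95_ms:
        breaches.append("voice_pipeline_p95_ms")
    # The success-rate objective is breached when the observed rate is below target.
    if snapshot["webhook_success_rate"] < objectives.webhook_success_rate:
        breaches.append("webhook_success_rate")
    if snapshot["api_p95_ms"] > objectives.api_p95_ms:
        breaches.append("api_p95_ms")
    return breaches
```

A check like this could run on a schedule and alert when the returned list is non-empty.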
/metrics - Prometheus text metrics.
/metrics.json - Dashboard-ready runtime counters.
/analytics/summary - ClickHouse-backed summary when analytics are enabled.
/slo - Voice pipeline p95, webhook success rate, API p95, and audio warning snapshot.
/runbooks - Operator runbook definitions exposed by the runtime.
Call investigation
Start every call investigation with the timeline. It combines logs, messages, webhook attempts, transcript state, artifacts, warning counts, and error counts in a single view.
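To make the idea of a combined timeline concrete, here is a minimal sketch that merges events from several sources into one chronological list and tallies warnings and errors. The event shape (`ts`, `source`, `level`) is an assumption for illustration and is not the runtime's actual timeline schema or implementation.

```python
from collections import Counter


def build_timeline(*sources):
    """Merge event lists into one timeline ordered by timestamp.

    Each event is assumed to be a dict with a numeric "ts" and an
    optional "level" ("info", "warning", or "error").
    """
    events = [event for source in sources for event in source]
    # list.sort is stable, so events with equal timestamps keep input order.
    events.sort(key=lambda event: event["ts"])
    levels = Counter(event.get("level", "info") for event in events)
    return {
        "events": events,
        "warning_count": levels["warning"],
        "error_count": levels["error"],
    }
```

Reading the merged list top to bottom shows, for example, whether a webhook failure came before or after a transcript stall.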
/call/:id/timeline - Support timeline for a single call.
/call/:id/logs - Call-specific runtime logs.
/call/:id/voice-pipeline - Voice pipeline state and observed media/STT/TTS details.
/call/:id/costs - Provider cost breakdown for the ended call.
Support bundle
The operator CLI should package environment, readiness, recent logs, license state, and deployment metadata while redacting secrets and sensitive payloads.
aywa support-bundle --redact
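The redaction step can be pictured with a small sketch like the one below, which replaces values under sensitive-looking keys before anything is written to a bundle. The key pattern and the `[REDACTED]` placeholder are assumptions for this example; they do not describe how the `aywa` CLI itself decides what to redact.

```python
import re

# Hypothetical pattern for keys whose values should never leave the host.
SENSITIVE_KEY = re.compile(
    r"(secret|token|password|api[_-]?key|authorization)", re.IGNORECASE
)


def redact(value, key=""):
    """Return a copy of value with sensitive entries replaced."""
    # A sensitive key hides its whole value, including nested structures.
    if key and SENSITIVE_KEY.search(str(key)):
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Matching on key names is a coarse filter: secrets embedded in free-text log lines would need a separate pass.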
Common incident checks
High latency
Check the spans for first STT partial, final user turn, first LLM token, first TTS audio, and tool roundtrip to see which stage adds the delay.
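One way to read these spans is to compute the gap between consecutive milestones and look at the largest one. The sketch below assumes each milestone is a millisecond offset from the start of the turn, stored under hypothetical key names; tool roundtrips are left out because they do not occur on every turn.

```python
# Milestones in the order they are expected to occur within one turn.
STAGES = ["first_stt_partial", "final_user_turn", "first_llm_token", "first_tts_audio"]


def stage_gaps(marks: dict) -> dict:
    """Return the elapsed milliseconds between consecutive milestones."""
    gaps = {}
    for previous, current in zip(STAGES, STAGES[1:]):
        gaps[f"{previous}->{current}"] = marks[current] - marks[previous]
    return gaps


def slowest_stage(marks: dict) -> str:
    """Return the label of the transition with the largest gap."""
    gaps = stage_gaps(marks)
    return max(gaps, key=gaps.get)
```

Note that the gap between the first STT partial and the final user turn includes the time the caller spent speaking, so a large value there is not necessarily a pipeline problem.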
Webhook failures
Open /logs/webhooks, inspect retry metadata, then replay or requeue from the stored attempt.
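When deciding what to replay, it helps to look only at the latest attempt of each delivery. The following sketch assumes stored attempts are dicts with hypothetical `delivery_id`, `attempt`, and `status` fields; the real retry metadata may be shaped differently.

```python
def replay_candidates(attempts):
    """Return delivery IDs whose most recent attempt did not get a 2xx response."""
    latest = {}
    for attempt in attempts:
        current = latest.get(attempt["delivery_id"])
        # Keep only the highest-numbered attempt for each delivery.
        if current is None or attempt["attempt"] > current["attempt"]:
            latest[attempt["delivery_id"]] = attempt
    return sorted(
        delivery_id
        for delivery_id, attempt in latest.items()
        if not 200 <= attempt["status"] < 300
    )
```

Deliveries that failed at first but later succeeded on retry are excluded, which avoids sending duplicate events to the receiver.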
SIP audio issues
Inspect NAT diagnostics, SDP codec compatibility, RTP packet flow, and PCAP availability.
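For the SDP codec compatibility check, a small sketch like this one can list the audio codecs two SDP bodies have in common. It is a simplification that assumes a single audio media section per SDP and only knows a few static RTP payload types (0, 8, 9, and 18 from the RTP/AVP profile); it is not a full SDP parser.

```python
# Static payload types that may appear without an a=rtpmap line.
STATIC_PAYLOADS = {0: "PCMU/8000", 8: "PCMA/8000", 9: "G722/8000", 18: "G729/8000"}


def audio_codecs(sdp: str) -> set:
    """Return the codec names offered on the audio m= line of an SDP body."""
    payloads = []
    rtpmap = {}
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("m=audio"):
            # Format: m=audio <port> <proto> <payload type> ...
            payloads = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[int(pt)] = encoding.strip()
    codecs = set()
    for pt in payloads:
        name = rtpmap.get(pt) or STATIC_PAYLOADS.get(pt)
        if name:
            # Encoding names are case-insensitive, so normalize before comparing.
            codecs.add(name.upper())
    return codecs


def common_codecs(offer: str, answer: str) -> set:
    """Return the codecs present in both SDP bodies."""
    return audio_codecs(offer) & audio_codecs(answer)
```

An empty result suggests the two sides share no codec, which commonly shows up as a rejected call or as silence after the call connects.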
Provider outage
Confirm provider readiness in /readyz, then check fallback behavior and the failure logs.
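If the readiness response is JSON, a helper along these lines can pull out the dependencies that are not ready. The payload shape used here, a `checks` object mapping each dependency name to `{"ok": bool}`, is purely an assumption for the example; adjust it to whatever /readyz actually returns.

```python
def failing_dependencies(readyz: dict) -> list:
    """Return the names of dependencies that did not report ok."""
    checks = readyz.get("checks", {})
    # A check with no "ok" field is treated as failing, to err on the safe side.
    return sorted(
        name for name, result in checks.items() if not result.get("ok", False)
    )
```

During a suspected provider outage, a result such as `["providers"]` narrows the investigation to fallback behavior and the provider failure logs.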