Aria Operations Custom Dashboards - Multi-Cluster Performance Correlation


Hybrid operations span on-premises infrastructure and cloud endpoints, requiring unified monitoring and management capabilities. This KB article addresses operational challenges in maintaining visibility and control across distributed infrastructure while managing the complexity of multiple management planes and monitoring signal sources.
Source KB: https://knowledge.broadcom.com/external/article/aria-ops-dashboards
KB Number: aria-ops-dash
Orchestrator Integration: Automation Workflow
Goal: Automate the configuration and validation of Aria Operations custom dashboards for multi-cluster performance correlation, reducing manual effort and keeping dashboards consistent across environments.
Workflow steps (VMware Aria Orchestrator)
• Create a workflow: 'Aria Operations Multi-Cluster Performance Dashboard Automation'
* Inputs: vcfDomains (array), dashboardName (string), metricSelections (array), refreshInterval (integer, in seconds; e.g. 300)
* Step 1: Authenticate to Aria Operations REST API - retrieve auth token for dashboard management operations
* Step 2: Query infrastructure inventory across all specified VCF domains:
- Enumerate vCenter Servers, clusters, hosts, datastores, VMs
- Collect NSX managers, edges, segments
- Identify vSAN clusters and storage policies
- Map relationships between compute and storage resources
* Step 3: Create custom dashboard object via API POST /suite-api/api/dashboards with:
- Dashboard name, description, and ownership
- Layout grid configuration (widget positioning)
- Refresh interval and time range settings
* Step 4: Build multi-cluster performance correlation widgets:
- CPU utilization heatmap across all clusters (identify hotspots)
- Memory contention analysis widget (ballooning, swapping, and compression)
- Storage latency trending (vSAN vs. traditional datastores)
- Network throughput correlation (underlay and overlay)
- VM performance distribution (identify outliers across clusters)
* Step 5: Create capacity planning widgets:
- Time remaining analysis (when clusters reach 80% capacity)
- Growth trend projections (linear and exponential models)
- What-if scenario modeling (add X workloads, impact on capacity)
- Cost projection based on current consumption rates
* Step 6: Add operational health widgets:
- Active alerts by severity across all domains
- Compliance status (configuration drift, security policies)
- Backup status (protected VMs vs. unprotected)
- Patch compliance (ESXi, vCenter, NSX versions)
* Step 7: Implement cross-cluster correlation logic:
- Identify performance patterns spanning multiple clusters (shared storage bottleneck)
- Detect cascade failure risks (one cluster impacts others via shared infrastructure)
- Highlight resource imbalances (cluster A over-utilized while cluster B idle)
* Step 8: Configure automated alerting thresholds:
- CPU sustained above 85% for 15 minutes → P2 alert
- Memory contention detected → P3 alert
- Storage latency above 20ms → P2 alert
- Capacity under 60 days remaining → P3 alert
* Step 9: Create executive summary view:
- Infrastructure health score (composite metric)
- Cost per VM trending
- Efficiency metrics (utilization vs. capacity)
- Top 10 resource consumers
* Step 10: Schedule automated dashboard distribution:
- PDF export every Monday 8 AM → email to operations team
- Real-time dashboard URL shared with stakeholders
- Integration with Slack for critical alert notifications
* Step 11: Implement dashboard version control:
- Export dashboard JSON configuration to Git repository
- Track changes over time
- Enable rollback to previous dashboard versions
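The alerting rules in Step 8 can be sketched as a small evaluation function. This is a minimal illustration, not the Aria Operations alert engine: the ClusterSnapshot type, its field names, and the evaluate_alerts helper are all hypothetical, and the sketch assumes each CPU sample is the average over one collection interval and that memory contention has already been determined upstream.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Thresholds copied from Step 8. Every name in this sketch is hypothetical.
CPU_THRESHOLD_PCT = 85.0
CPU_SUSTAIN_SECONDS = 15 * 60
STORAGE_LATENCY_THRESHOLD_MS = 20.0
CAPACITY_THRESHOLD_DAYS = 60.0


@dataclass
class ClusterSnapshot:
    name: str
    cpu_samples_pct: List[float]     # oldest first; each value averages one interval
    sample_interval_s: int           # e.g. 300 to match refreshInterval
    memory_contention: bool          # assumed to be computed upstream
    storage_latency_ms: float
    days_remaining: Optional[float]  # None when no capacity projection exists


def cpu_sustained(samples: List[float], interval_s: int) -> bool:
    """True if the most recent 15 minutes of samples all exceed the CPU threshold."""
    needed = -(-CPU_SUSTAIN_SECONDS // interval_s)  # ceiling division
    if len(samples) < needed:
        return False  # not enough history to call the load "sustained"
    return all(s > CPU_THRESHOLD_PCT for s in samples[-needed:])


def evaluate_alerts(snap: ClusterSnapshot) -> List[Tuple[str, str]]:
    """Return (priority, message) pairs for every Step 8 rule the snapshot violates."""
    alerts: List[Tuple[str, str]] = []
    if cpu_sustained(snap.cpu_samples_pct, snap.sample_interval_s):
        alerts.append(("P2", f"{snap.name}: CPU above 85% for 15 minutes"))
    if snap.memory_contention:
        alerts.append(("P3", f"{snap.name}: memory contention detected"))
    if snap.storage_latency_ms > STORAGE_LATENCY_THRESHOLD_MS:
        alerts.append(("P2", f"{snap.name}: storage latency above 20 ms"))
    if snap.days_remaining is not None and snap.days_remaining < CAPACITY_THRESHOLD_DAYS:
        alerts.append(("P3", f"{snap.name}: capacity under 60 days remaining"))
    return alerts
```

Calling evaluate_alerts once per cluster after each collection cycle would yield a flat list that a later step could route by priority. Values exactly at a threshold (20 ms, 60 days) do not trigger an alert, matching the "above" and "under" wording of the rules.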
Expected outcome
The unified multi-cluster performance dashboard provides single-pane-of-glass visibility and eliminates manual metric collection across 5-10 vCenter instances. Capacity planning projections prevent surprise resource exhaustion, and cross-cluster correlation identifies shared infrastructure bottlenecks 60% faster than manual analysis.
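The time-remaining analysis from Step 5 (when a cluster reaches 80% capacity) can be approximated with a linear trend. The sketch below is an assumption-laden illustration, not the capacity engine built into Aria Operations: the function name and its inputs are hypothetical, and it assumes one utilization sample per day, fitted with ordinary least squares.

```python
from typing import Optional, Sequence


def days_until_threshold(
    daily_utilization_pct: Sequence[float],
    threshold_pct: float = 80.0,
) -> Optional[float]:
    """Estimate days until utilization crosses threshold_pct using a linear fit.

    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if there is too little data or the trend is flat or declining.
    """
    ys = list(daily_utilization_pct)
    n = len(ys)
    if n < 2:
        return None
    if ys[-1] >= threshold_pct:
        return 0.0

    # Ordinary least squares over day indices 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth, so the threshold is never reached on this trend

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold_pct - intercept) / slope
    # Measure from the most recent sample (day n-1), never returning a negative value.
    return max(0.0, crossing_day - (n - 1))
```

A result below 60 would correspond to the "capacity under 60 days remaining" rule in Step 8. An exponential model, which Step 5 also mentions, could be obtained by fitting the same line to the logarithm of the samples.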



