Multi-Vendor GPU Monitoring
Always-on monitoring for NVIDIA CUDA and AMD ROCm with minimal overhead. Lock-free ring buffers and background collection keep your hot path fast.
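As a rough illustration of how a lock-free ring buffer keeps event recording off the hot path, here is a minimal single-producer/single-consumer sketch. The names `KernelEvent` and `SpscRing` and the drop-when-full policy are assumptions made for this example and are not taken from the project's sources.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical event record; the fields are illustrative only.
struct KernelEvent {
    std::uint64_t start_ns;
    std::uint64_t end_ns;
    std::uint32_t kernel_id;
};

// Single-producer/single-consumer ring buffer (a sketch, not the project's code).
// Exactly one thread may call try_push and exactly one thread may call try_pop.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Hot-path side: never blocks. Returns false (event dropped) when full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) {
            return false;
        }
        slots_[head & (Capacity - 1)] = value;
        // Release publishes the slot write before the new head becomes visible.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Collector side: returns false when there is nothing to read.
    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head) {
            return false;
        }
        out = slots_[tail & (Capacity - 1)];
        // Release hands the slot back to the producer only after it was read.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next index to write
    std::atomic<std::size_t> tail_{0};  // next index to read
};

// Single-threaded walk-through of the push, drop, and pop behavior.
inline void example_usage() {
    SpscRing<KernelEvent, 2> ring;
    const bool first = ring.try_push(KernelEvent{0, 10, 1});
    const bool second = ring.try_push(KernelEvent{10, 25, 2});
    const bool third = ring.try_push(KernelEvent{25, 30, 3});  // ring is full
    assert(first && second && !third);

    KernelEvent ev{};
    const bool popped = ring.try_pop(ev);  // oldest event comes out first
    assert(popped && ev.kernel_id == 1);
}
```

In a setup like this, the application thread would call `try_push` after each kernel and a background thread would repeatedly call `try_pop`, so the only cost on the hot path is a couple of atomic loads and one store. Dropping events when the buffer is full is one possible policy; it trades completeness for a bounded, non-blocking write.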
Scoping & Reporting
Group kernels into application phases with GFL_SCOPE. Generate performance reports with kernel hotspots, occupancy analysis, and system metrics.
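The idea of grouping kernels into phases can be pictured with an RAII guard that marks a phase as active for the lifetime of a C++ block. The sketch below is not the `GFL_SCOPE` macro or its implementation; `PhaseRecorder`, `PhaseScope`, and `record_kernel` are hypothetical names used only to show the pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical recorder that attributes kernel time to the active phase.
class PhaseRecorder {
public:
    void push(const std::string& name) { stack_.push_back(name); }

    void pop() {
        assert(!stack_.empty());
        stack_.pop_back();
    }

    // Adds a kernel's duration to the innermost active phase, or to
    // "<unscoped>" when no phase is active.
    void record_kernel(std::uint64_t duration_ns) {
        const std::string phase = stack_.empty() ? "<unscoped>" : stack_.back();
        totals_ns_[phase] += duration_ns;
    }

    std::uint64_t total_ns(const std::string& phase) const {
        const auto it = totals_ns_.find(phase);
        return it == totals_ns_.end() ? 0 : it->second;
    }

private:
    std::vector<std::string> stack_;
    std::map<std::string, std::uint64_t> totals_ns_;
};

// RAII guard: the phase starts at construction and ends at scope exit.
class PhaseScope {
public:
    PhaseScope(PhaseRecorder& recorder, const std::string& name)
        : recorder_(recorder) {
        recorder_.push(name);
    }
    ~PhaseScope() { recorder_.pop(); }

    PhaseScope(const PhaseScope&) = delete;
    PhaseScope& operator=(const PhaseScope&) = delete;

private:
    PhaseRecorder& recorder_;
};

// Example: two phases plus one kernel launched outside any phase.
// The durations are made-up numbers standing in for measured kernel times.
inline PhaseRecorder example_phases() {
    PhaseRecorder recorder;
    {
        PhaseScope forward(recorder, "forward");
        recorder.record_kernel(120);
        recorder.record_kernel(80);
    }
    {
        PhaseScope backward(recorder, "backward");
        recorder.record_kernel(300);
    }
    recorder.record_kernel(5);
    return recorder;
}
```

A per-phase total like this is the kind of input a report could use to rank phases before drilling down into individual kernels. Tying the phase to a scope means it ends correctly on early returns and exceptions.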
Profiling & ISA Analysis
PC sampling, SASS/ISA instruction-level metrics, and automatic disassembly of GPU kernels. Python tools handle analysis, dashboards, and timeline visualization.
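To make the PC sampling idea concrete: a sampler periodically records the program counter of running GPU code, and the addresses that appear most often point at the hottest instructions. The sketch below only shows that counting step on a plain list of sampled addresses. `Hotspot` and `top_hotspots` are hypothetical names, and nothing here reflects the project's actual data format or vendor sampling APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical result row: an instruction address and its sample count.
struct Hotspot {
    std::uint64_t pc;
    std::uint64_t samples;
};

// Counts how often each program counter was sampled and returns up to
// `limit` entries, most-sampled first (ties broken by lower address).
inline std::vector<Hotspot> top_hotspots(
    const std::vector<std::uint64_t>& sampled_pcs, std::size_t limit) {
    std::unordered_map<std::uint64_t, std::uint64_t> counts;
    for (const std::uint64_t pc : sampled_pcs) {
        ++counts[pc];
    }

    std::vector<Hotspot> result;
    result.reserve(counts.size());
    for (const auto& entry : counts) {
        result.push_back(Hotspot{entry.first, entry.second});
    }

    std::sort(result.begin(), result.end(),
              [](const Hotspot& a, const Hotspot& b) {
                  if (a.samples != b.samples) {
                      return a.samples > b.samples;
                  }
                  return a.pc < b.pc;
              });

    if (result.size() > limit) {
        result.resize(limit);
    }
    return result;
}

// Example with made-up addresses: 0x10 is sampled three times, 0x20 twice.
inline void example_hotspots() {
    const std::vector<std::uint64_t> pcs = {0x10, 0x20, 0x10, 0x30, 0x10, 0x20};
    const std::vector<Hotspot> top = top_hotspots(pcs, 2);
    assert(top.size() == 2);
    assert(top[0].pc == 0x10 && top[0].samples == 3);
    assert(top[1].pc == 0x20 && top[1].samples == 2);
}
```

Mapping each hot address back to a disassembled instruction is what turns a table like this into an instruction-level view of where a kernel spends its time.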