BotBeat

ZML
PRODUCT LAUNCH
2026-04-02

ZML Releases Universal Diagnostic Tool for GPUs, TPUs, and NPUs Across All Major Platforms

Key Takeaways

  • zml-smi provides unified monitoring across NVIDIA, AMD, Google TPU, and AWS Trainium devices with a single interface
  • The tool offers comprehensive metrics including GPU utilization, temperature, power draw, memory usage, and process-level resource consumption
  • zml-smi uses creative sandboxing techniques to support the latest AMD GPU models without requiring system-level installations or library patches
Source: Hacker News, https://zml.ai/posts/zml-smi/

Summary

ZML has launched zml-smi, a universal diagnostic and monitoring tool designed to provide real-time performance insights across multiple AI hardware platforms including NVIDIA GPUs, AMD GPUs, Google TPUs, and AWS Trainium devices. The tool combines functionality similar to nvidia-smi and nvtop, offering comprehensive hardware monitoring capabilities without requiring additional software beyond device drivers and GLIBC.
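
One way to picture a single interface over several vendors is a common sample record plus one backend per platform. The sketch below is illustrative only and is not zml-smi's code: `DeviceSample`, `Backend`, `FakeBackend`, `collect`, and `format_row` are hypothetical names, and the fake backend returns canned numbers so the example runs without any accelerator attached.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DeviceSample:
    """One reading for one accelerator (field names are illustrative)."""
    vendor: str
    index: int
    utilization_pct: float
    temperature_c: float
    power_w: float
    memory_used_mib: int
    memory_total_mib: int


class Backend(Protocol):
    """Hypothetical per-vendor backend; a real one would wrap a vendor library."""

    def sample(self) -> list[DeviceSample]: ...


class FakeBackend:
    """Stand-in that returns canned readings so the sketch runs without hardware."""

    def __init__(self, samples: list[DeviceSample]) -> None:
        self._samples = samples

    def sample(self) -> list[DeviceSample]:
        return list(self._samples)


def collect(backends: list[Backend]) -> list[DeviceSample]:
    """Query every backend and return one flat list sorted by vendor, then index."""
    rows = [s for b in backends for s in b.sample()]
    return sorted(rows, key=lambda s: (s.vendor, s.index))


def format_row(s: DeviceSample) -> str:
    """Render one sample as a fixed-width text row."""
    return (
        f"{s.vendor:<8} {s.index:>2} {s.utilization_pct:>5.1f}% "
        f"{s.temperature_c:>5.1f}C {s.power_w:>6.1f}W "
        f"{s.memory_used_mib}/{s.memory_total_mib} MiB"
    )
```

The point of the design is that the display code only ever sees `DeviceSample` rows, so adding a vendor means adding a backend rather than touching the output path.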

zml-smi reports GPU utilization, temperature, power draw, memory usage, and process-level resource consumption. It gathers this data through platform-specific interfaces: NVML for NVIDIA, AMD SMI for AMD, gRPC for Google TPU, and private APIs for AWS Trainium. It also recognizes the latest AMD GPU models by merging GPU identification files from both Mesa and ROCm at build time, which extends support to recent hardware such as the Ryzen AI Max+ 395.
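
The idea of merging two GPU identification files can be sketched as a union of two lookup tables. This is a minimal sketch, not ZML's build step. It assumes both files are plain text with one `device_id, revision_id, product_name` entry per line and `#` comments, and that the first file wins when both define the same ID. The source states neither the file formats nor the precedence, and `parse_ids` and `merge_ids` are hypothetical names.

```python
from __future__ import annotations


def parse_ids(text: str) -> dict[tuple[str, str], str]:
    """Parse 'device_id, revision_id, product_name' lines into a lookup table.

    Assumed format: comma-separated, '#' starts a comment line. Lines that do
    not have three fields (for example a bare version number) are skipped.
    """
    table: dict[tuple[str, str], str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) != 3:
            continue
        device_id, revision_id, name = parts
        table[(device_id.upper(), revision_id.upper())] = name
    return table


def merge_ids(primary: str, secondary: str) -> dict[tuple[str, str], str]:
    """Union of both tables; on a conflicting key the primary file's name wins."""
    merged = parse_ids(secondary)
    merged.update(parse_ids(primary))
    return merged
```

With a merged table, a device that appears in only one of the two files still resolves to a product name, which is the effect the article describes.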

zml-smi is available for download as a self-contained binary that works across different hardware configurations. It also reports host-level metrics such as CPU model, memory usage, and process details on every supported platform, a step toward unified hardware monitoring in an increasingly diverse AI accelerator landscape.

  • Designed as a lightweight, self-contained binary that requires minimal dependencies beyond device drivers and GLIBC

Editorial Opinion

The release of zml-smi addresses a growing pain point in the AI hardware ecosystem: the fragmentation of monitoring tools across different accelerator vendors. As organizations increasingly adopt diverse hardware accelerators, having a unified diagnostic tool that works across NVIDIA, AMD, Google, and AWS platforms significantly improves operational efficiency. The technical implementation, particularly the clever sandboxing approach for AMD GPU support, demonstrates thoughtful engineering that balances compatibility with maintainability.

MLOps & Infrastructure, AI Hardware, Open Source
