| Capability | Community | Enterprise |
|:---|:---:|:---:|
| Pre-training; full fine-tuning; LoRA / QLoRA | ✔ | ✔ |
| BF16, FP8, NVFP4; mixed-precision training | ✔ | ✔ |
| Multi-GPU; multi-node (Ray-based) | ✔ | ✔ |
| Smart CPU offloading | ✔ | ✔ |
| Native C++/CUDA engine; kernel fusions | ✔ | ✔ |
| Deterministic configs + predefined recipes | ✔ | ✔ |
| Dense + MoE model support | ✔ | ✔ |
| Broad NVIDIA SM coverage (sm80–sm121) | ✔ | ✔ |
| Autonomous agent runtime (skills, tools, sub-agents) | ✔ | ✔ |
| MCP server integration | ✔ | ✔ |
| Agent deployment on Kubernetes | ✔ | ✔ |
| Execution traces: basic viewer + export | ✔ | ✔ |
| Skill development + test suites | ✔ | ✔ |
| Data Hub with Git-style versioning | ✔ | ✔ |
| GPU-accelerated serving (vLLM) | ✔ | ✔ |
| Session replay, anomaly alerts, advanced dashboards | ✖ | ✔ |
| Auto CI benchmarking; leaderboard; regression guards | ✖ | ✔ |
| Skill improvement: GEPA optimization, failure analysis, A/B testing | ✖ | ✔ |
| Continuous agent improvement (scheduler + auto-promotion) | ✖ | ✔ |
| Full RLHF UI; preference datasets; feedback routing | ✖ | ✔ |
| Reinforcement learning: DPO / PPO / GRPO | ✖ | ✔ |
| Model distillation from agent trajectories (SLMs) | ✖ | ✔ |
| Synthetic data generation; reward-function tooling | ✖ | ✔ |
| Fine-grained guardrails; content filters; compliance audits | ✖ | ✔ |
| Advanced RBAC; SSO; audit logs | ✖ | ✔ |
| Budget caps; per-team allocation; billing | ✖ | ✔ |
| Live training monitoring; GPU & node monitoring | ✖ | ✔ |
| Evaluation suite + red-teaming (bias/toxicity/PII/jailbreak) | ✖ | ✔ |
| Workload/container isolation | ✖ | ✔ |
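
To make the LoRA / QLoRA row concrete, here is a minimal sketch of what a LoRA fine-tune looks like. It uses the generic Hugging Face `peft` library rather than this product's own API, and the base model (`gpt2`), target modules, and hyperparameters are placeholder assumptions for illustration only:

```python
# Minimal LoRA sketch using the generic Hugging Face `peft` library.
# NOTE: illustrative only, not this product's API. The base model ("gpt2"),
# target modules, and hyperparameters are placeholder assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_cfg = LoraConfig(
    r=16,                       # rank of the low-rank update matrices
    lora_alpha=32,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # adapters only: a small fraction of params
```

Because only the adapter weights train while the frozen base stays in low precision, this is the same mechanism QLoRA builds on (a quantized base plus trainable adapters).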
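Similarly, the GPU-accelerated serving row refers to vLLM; a minimal sketch of the generic upstream vLLM API (not this product's wrapper) might look like the following, where the model name and sampling settings are placeholder assumptions:

```python
# Minimal vLLM serving sketch (generic vLLM API, not this product's wrapper).
# Model name and sampling settings are placeholder assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model purely for illustration
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["Summarize LoRA in one sentence:"], params)
for out in outputs:
    print(out.outputs[0].text)
```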