Some photos from a bit of 4am dev box maintenance earlier this morning, focused on hardware reallocation and consolidation to maximize PCIe lane utilization.
This box will host five VMs: one using full PCIe passthrough on the GPUs and 50% NPAR for the NIC ports, plus four smaller VMs. Storage for the VMs will be host-level ZFS on the two P3608s, split into several zvols allocated to the smaller VMs for a RoCE v2 storage micro-cluster, with replication to off-host network-based storage also using RDMA.
The GPU compute VM will run our agentic-consensus LLMs, taking development into deeper waters before we move to the bigger hosts in colocation.
- 4x Ampere A4000 GPUs
- 2x 4TB Intel DC P3608 NVMe (2x 2TB per card)
- 4x 128GB (512GB) Optane NVDIMMs + 128GB of volatile RDIMMs
- 64-thread Xeon 8370C, 2.8GHz (Hyperscaler SKU)
- 4-port 25GbE QLogic NIC, QL41234, using RoCE v2 RDMA
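
For anyone curious how the zvol split for the four smaller VMs might be planned, here is a minimal Python sketch. It is not the actual tooling on this box: the pool name `tank`, the VM names, the 20% headroom, and the assumption that all four 2TB devices are striped into one pool are all hypothetical placeholders. It only prints `zfs create` commands (sparse volumes via `-s`, sized via `-V`) and does not run them.

```python
# Hypothetical sketch: pool name, VM names, and reserve are assumptions,
# not values taken from the real host.
TB = 1000**4
GIB = 1 << 30


def plan_zvols(pool, usable_bytes, vm_names, reserve_percent=20):
    """Split usable pool space evenly into one zvol per VM, keeping headroom."""
    if not vm_names:
        raise ValueError("need at least one VM")
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in [0, 100)")
    # Integer arithmetic keeps the result exact for large byte counts.
    budget = usable_bytes * (100 - reserve_percent) // 100
    # Round each zvol down to a whole number of GiB.
    per_vm = (budget // len(vm_names)) // GIB * GIB
    return [(f"{pool}/{name}", per_vm) for name in vm_names]


def zfs_create_commands(plan):
    """Render one sparse `zfs create -s -V <size> <dataset>` command per zvol."""
    return [f"zfs create -s -V {size // GIB}G {dataset}" for dataset, size in plan]


if __name__ == "__main__":
    # Assumption: 2 cards x 2 devices x 2TB striped, so about 8TB usable.
    plan = plan_zvols("tank", 8 * TB, ["vm1", "vm2", "vm3", "vm4"])
    for cmd in zfs_create_commands(plan):
        print(cmd)
```

A mirrored or RAID-Z layout would change the usable figure, so `usable_bytes` is left as a parameter rather than derived from the raw capacity.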
#nvidia #gpu #ai #engineering #llm #development