prose

reflections on OSS, HPC, and AI/ML engineering, with occasional considerations on Cognitive Neuroscience

4am Dev Box - GPU & RDMA Hardware Shuffle

May 13, 2025

Photography

Some photos from a bit of 4am dev box maintenance earlier this morning, focused on hardware reallocation and consolidation to maximize PCIe lane utilization.

This box will host one large VM, using full PCIe passthrough for the GPUs and 50% NPAR on the NIC ports, plus four smaller VMs. Storage for the VMs will be host-level ZFS on the two P3608 cards, split into several zvols allocated to the smaller VMs for a RoCE v2 storage micro-cluster, with replication to off-host network-based storage also running over RDMA.
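As a rough illustration of the zvol split described above, here is a minimal Python sketch that plans equal-sized zvols for the smaller VMs and prints the matching `zfs create` commands without running them. The pool name `tank`, the VM names, the 4000 GiB usable figure, and the 20% reserve are all hypothetical placeholders, not the actual layout on this box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zvol:
    name: str      # dataset path, e.g. "tank/vm1-disk0" (hypothetical naming)
    size_gib: int  # volume size in GiB


def plan_zvols(pool: str, vms: list[str], usable_gib: int,
               reserve_percent: int = 20) -> list[Zvol]:
    """Split usable pool space evenly across VMs, holding back a reserve.

    The reserve leaves room for snapshots and keeps the pool from filling;
    20% is an assumed default, not a figure from the post.
    """
    if not vms:
        raise ValueError("need at least one VM")
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in [0, 100)")
    # Integer arithmetic avoids floating-point rounding surprises.
    allocatable = usable_gib * (100 - reserve_percent) // 100
    per_vm = allocatable // len(vms)
    return [Zvol(f"{pool}/{vm}-disk0", per_vm) for vm in vms]


def zfs_create_commands(zvols: list[Zvol]) -> list[str]:
    """Render one sparse (-s) volume (-V) creation command per zvol."""
    return [f"zfs create -s -V {z.size_gib}G {z.name}" for z in zvols]


# Hypothetical inputs: four small VMs sharing 4000 GiB of usable pool space.
SMALL_VMS = ["vm1", "vm2", "vm3", "vm4"]
PLAN = plan_zvols("tank", SMALL_VMS, usable_gib=4000)
COMMANDS = zfs_create_commands(PLAN)

if __name__ == "__main__":
    for cmd in COMMANDS:
        print(cmd)
```

The commands are only printed so they can be reviewed first; the real sizes depend on how the four P3608 devices are arranged into vdevs, which the post does not specify.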

The GPU compute VM will run our agentic-consensus LLMs, continuing development into deeper waters before moving to the bigger hosts in colocation.

4x Ampere A4000 GPUs
2x 4TB Intel P3608 NVMe (2x 2TB per card)
4x 128GB (512GB) Optane NVDIMMs + 128GB of volatile RDIMMs
64-thread Xeon 8370C, 2.8GHz (Hyperscaler SKU)
4-port 25GbE QLogic NIC, QL41234, using RoCE v2 RDMA
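
To make the PCIe lane juggling concrete, here is a small Python sketch that tallies lane demand for the cards listed above against a CPU lane budget. The per-card link widths and the 64-lane budget are assumptions taken from public spec sheets (x16 for the A4000, x8 for the P3608, 64 lanes for a single Ice Lake socket) or guessed (x8 for the NIC); the post does not state them, nor whether this box is single- or dual-socket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    count: int
    lanes_each: int  # assumed electrical link width per card


def lane_budget(devices: list[Device], cpu_lanes: int) -> tuple[int, int]:
    """Return (total lanes requested, headroom); negative headroom
    means some cards would have to run at reduced width."""
    demand = sum(d.count * d.lanes_each for d in devices)
    return demand, cpu_lanes - demand


# Assumed link widths; the NIC width in particular is a guess.
DEVICES = [
    Device("A4000 GPU", count=4, lanes_each=16),
    Device("P3608 NVMe", count=2, lanes_each=8),
    Device("QL41234 NIC", count=1, lanes_each=8),
]

# Assumed budget: one CPU socket exposing 64 lanes.
DEMAND, HEADROOM = lane_budget(DEVICES, cpu_lanes=64)

if __name__ == "__main__":
    print(f"lanes requested: {DEMAND}, headroom: {HEADROOM}")
```

Under these assumptions the cards ask for more lanes than one socket provides, which is the kind of constraint that makes slot placement and link-width trade-offs worth a 4am shuffle.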

#nvidia #gpu #ai #engineering #llm #development
