About the Role

We're seeking a Senior Infrastructure Engineer to help build and scale Hyperbolic's GPU Cloud Marketplace by developing a multi-tenant provisioning and virtualization solution. You'll transform raw GPUs from diverse global suppliers into a programmable, orchestrated pool that serves thousands of AI developers and researchers.

Requirements

  • Experience with bare-metal provisioning and lifecycle management (e.g., IPMI/Redfish, BMC, PXE, OS deployment)
  • Experience with GPU scheduling and orchestration
  • Experience with infrastructure and DevOps tools (e.g., Terraform or Pulumi, CI/CD, secrets management, configuration management, observability tools)
  • Experience with storage and data infrastructure for AI/ML workloads (e.g., object storage, block storage, distributed file systems)
  • Experience with API design and cloud-init
  • Experience with GPU architecture, CUDA, and GPU compute
  • Experience working with hardware vendors or vendor engineering teams
  • Experience building and scaling cloud infrastructure or distributed systems in production environments

Bonus Skills

  • Familiarity with high-performance networking technologies such as InfiniBand and RoCE
  • Experience with distributed storage systems such as Ceph, Weka, or VAST Data