Join a team revolutionizing AI with data-center-scale solutions as a Senior Solutions Architect, AI Compute. Deploy, manage, and validate AI compute and HPC infrastructure for new and existing customers. Serve as a domain expert and provide feedback to internal and partner teams.

Requirements

  • 8+ years providing in-depth support and deployment services
  • Linux system administration, including process and package management, task scheduling, kernel management, boot procedures and troubleshooting, performance reporting/optimization/logging, and network routing/advanced networking
  • Cluster management and provisioning technologies for bare-metal servers
  • Scripting proficiency (Bash, Python, Ansible, etc.)
  • Experience with schedulers such as SLURM, LSF, or UGE
  • Ability to travel up to 30% of the time
  • Experience with benchmarking tools such as HPL, NCCL tests, and MLPerf, as well as with Kubernetes

Benefits

  • Equity
  • Benefits