Join a team revolutionizing AI with data-center-scale solutions as a Senior Solutions Architect, AI Compute. Deploy, manage, and validate AI Compute/HPC infrastructure for new and existing customers. Serve as a domain expert and provide feedback to internal and partner teams.
Requirements
- 8+ years providing in-depth support and deployment services
- Linux system administration: process and package management, task scheduling, kernel management, boot procedures and boot troubleshooting, performance reporting, optimization, and logging, network routing and advanced networking
- Cluster management and provisioning technologies for bare-metal servers
- Scripting proficiency (Bash, Python, Ansible, etc.)
- Experience with schedulers such as SLURM, LSF, or UGE
- Ability to travel up to 30% of the time
- Experience with benchmarking tools such as HPL, NCCL tests, and MLPerf, as well as with Kubernetes
Benefits
- Equity
- Benefits
