Bright Machines is looking for a Senior Platform/MLOps Engineer to build scalable systems that are foundational to the Bright Machines technology stack. The successful candidate will design, implement, and maintain reliable, scalable, and secure infrastructure, applications, and tooling, with a focus on ML/AI pipelines and workloads.

Requirements

  • At least 5 years of experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE)
  • B.S. or M.S. degree (or equivalent) in Computer Science, Engineering, or a related field
  • Proficiency in at least one modern programming language (Python, JavaScript, C#, Go, etc.)
  • Demonstrated use of industry best practices in MLOps
  • Proficiency with CI/CD tools and GitOps workflows
  • Familiarity with running GPU workloads in Kubernetes
  • Strong knowledge of Kubernetes (self-hosted and managed) and modern Kubernetes paradigms (e.g., CNCF)
  • Proficiency with Infrastructure as Code tools (Terraform, etc.) and configuration management tools (Ansible, etc.)
  • Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry)

Benefits

  • Generous salary range ($150,000 - $170,000 a year)
  • Opportunity to make lasting, impactful changes for the company and customers