This is a remote position.
Reporting to the Head of Engineering, this role owns the production support operating model, incident management, service levels, release-watch support, escalation governance, and overall live-service quality across pods. It ensures Production Support is proactive, disciplined, and tightly connected to Engineering, Product, QA, and business stakeholders.
Key Responsibilities
Service Reliability Leadership
·Define the production support operating model, incident lifecycle, severity framework, and support expectations across pods.
·Establish service levels, escalation paths, release-watch routines, and communication standards for production issues.
·Create visibility into live-service health, incident trends, and recurring support risks.
Team & Process Leadership
·Lead production support engineers, AI operations analysts, and the finance-domain support lead.
·Build repeatable support routines, runbook discipline, and issue triage practices that scale across products and workflows.
·Partner with Product, Engineering, and QA to improve supportability and reduce recurring production issues.
Incident & Stakeholder Management
·Own incident governance for material issues, including severity calls, stakeholder updates, escalation management, and stabilization plans.
·Ensure business and technical stakeholders have clear visibility into impact, next steps, and resolution progress.
·Drive post-incident review practices that improve resilience and reduce repeat failures.
·Ensure finance-sensitive workflows receive the right level of production support, issue classification, and escalation handling.
·Oversee support patterns for AI-enabled workflows, including degraded outputs, fallback scenarios, trust issues, and human-review triggers.
·Work with business and technical teams to distinguish software defects from data, process, training, or model-behavior issues.
Requirements
Required Qualifications
·8+ years of production support, application support, service reliability, or engineering operations experience, including team leadership.
·Strong knowledge of incident management, service operations, support processes, release support, and escalation discipline.
·Experience working closely with engineering and product teams in modern software delivery environments.
·Ability to communicate clearly with both technical and business stakeholders during high-pressure situations.
·Comfort operating in finance-sensitive, workflow-heavy, or business-critical application environments.
·Bachelor's degree preferred.
You Are
·Structured, pragmatic, and highly credible.
·Calm under pressure and comfortable making judgment calls with incomplete information.
·A builder of reliable operating processes, not just a responder to tickets.
·Focused on trust, transparency, and service quality.
Benefits
Salary plus performance-based bonus.
Actual compensation packages are determined by evaluating a wide array of factors unique to each candidate, including but not limited to skill set, years and depth of experience, education, certifications, cost of labor, and internal equity.
