Overview
We are hiring a Lead AI Engineer to expand a GenAI Platform that lets product teams ship AI agents with consistent safety and quality standards.
You will drive agents and platform features from concept to production, strengthen secure SDLC and privacy-by-design practices, and raise automation and reliability across the stack. Apply now to help deliver dependable LLM-driven systems used across industries.
Responsibilities
- Ship production-ready applications, AI agents, and platform capabilities that accelerate GenAI agent development, validation, and release cycles
- Build developer utilities, streamline CI/CD, and improve observability with evaluation tooling, canary releases, rollback paths, plus cost and quality metric tracking
- Apply secure SDLC and privacy-first engineering by running threat assessments and enforcing least privilege controls
- Collaborate with product leads, UX professionals, and domain specialists to deliver solutions anchored in user needs and measurable impact
- Use advanced LLM patterns including retrieval-augmented generation, smart routing, tool integration, and evaluation to increase reliability, reduce latency, strengthen trust and safety, and lower operating costs
- Lead the team technically through architecture influence, mentoring, and oversight of high-impact projects
Requirements
- At least 5 years of software engineering experience in professional roles
- Proven ability to ship software products independently or in small agile teams
- Hands-on delivery of AI agents from initial concept through deployment, including safety checks, A/B experiments, and iterative performance tuning
- Familiarity with LangChain or LangGraph, MCP, vector databases, and OpenSearch
- Experience with foundational machine learning workflows covering training, deployment, and monitoring
- Understanding of compliance frameworks and regulations such as SOC 2 and HIPAA
- Strong ownership, problem-solving ability, and effective communication with partner teams
- Advanced full-stack engineering skills and exposure to AWS, Azure, or GCP
- Experience automating CI/CD pipelines and using Infrastructure as Code
- Knowledge of Site Reliability Engineering practices and operational readiness
- Background in quality assurance practices and test strategy design
- Understanding of secure development lifecycle approaches and privacy-by-design principles
- Strong TypeScript proficiency
- English proficiency at B2 level or higher, both written and spoken
Nice to have
- Knowledge of large language model architecture, common failure patterns, and approaches like fine-tuning and model adaptation
- Experience designing and delivering Retrieval-Augmented Generation (RAG) solutions
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn