Overview

We are seeking a Senior Backend Developer to join our AI Pods team and build robust AI-powered API services and agentic workflows. This role focuses on how language models interact with internal systems, vector databases and external tools through orchestration frameworks. You will treat LLMs as components in a secure, scalable distributed system and own API contracts and service reliability end-to-end. You will integrate LLMs using rigorous systems engineering practices, including evaluation, observability, fallback handling and cost awareness.

Responsibilities

  • Design and build backend API services and LLM orchestration layers
  • Implement and maintain complex RAG pipelines, including document ingestion, chunking, embedding and retrieval optimization
  • Develop and connect agent tools using LangChain, LangGraph and potentially MCP (Model Context Protocol)
  • Ensure security, privacy, enterprise-grade observability and test coverage for all backend workflows
  • Contribute to architecture decisions and engineering standards within the pod
  • Collaborate with frontend engineers, data engineers and infrastructure teams
  • Own API contracts and service reliability, ensuring AI edge cases fail gracefully
  • Build reliable, reusable orchestration frameworks and logic for downstream developer consumption

Requirements

  • 3+ years of backend engineering experience focused on microservices and distributed systems
  • 2+ years of Python experience building high-performance backend services and cloud-native APIs
  • Expertise in AWS, Docker and ECS/EKS
  • Background in designing and building RESTful APIs
  • Knowledge of secure coding practices and strong auth/authz fundamentals
  • Fluent English communication skills (B2+ level)

Nice to have

  • 1–2+ years of hands-on experience with AI SDKs such as OpenAI, Anthropic/Claude or AWS Bedrock in production
  • Familiarity with vector databases and search services (Amazon Kendra, OpenSearch), embedding strategies and retrieval systems
  • Hands-on experience with agentic frameworks such as LangChain, LlamaIndex or LangGraph
  • Experience with AI evaluation tooling or real-time APM (LangSmith, Langfuse, Arize)
  • 1–2+ years of React and TypeScript experience, plus experience deploying to EKS environments at scale
  • Understanding of agent interoperability patterns (MCP) and identity/security domains (IAM, CIAM), plus working knowledge of a secondary language such as Java, Node.js or Go

Benefits

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn