
Lead Kernel Engineer/Architect

Hybrid

Middle

Rijswijk; Amsterdam; Netherlands

Architect, Python, Solution Architecture, Machine Learning, APIs, C++, distributed training, inference, JAX, Kernel, Learning and Development, Mobile, nvidia triton, Optimization, performance optimization

Job description

Overview

We're looking for a Lead Kernel Engineer/Architect to join our team in the Netherlands in a hybrid working arrangement.

Are you passionate about pushing advanced hardware accelerators to their limits? Join us in shaping the future of AI performance and scalability.

As a Lead Kernel Engineer/Architect, you will drive the optimization of critical machine learning operations for large-scale training and inference, working with cutting-edge hardware like TPUs and GPUs, advanced ML models and performance toolchains. Your work will enable faster AI research and production deployments on cloud platforms and within open-source ecosystems.

In this role, you will collaborate with researchers, compiler engineers and framework developers to deliver optimized, high-performance solutions that set the standard for modern AI computation.

Responsibilities

Design and optimize high-performance kernels for TPU and GPU architectures using low-level programming frameworks such as Pallas, Triton or Mosaic
Build and maintain performance infrastructure, including benchmarking suites, autotuning systems, regression testing frameworks and tooling
Collaborate with ML framework developers (e.g., JAX, PyTorch) and compiler teams (XLA/MLIR) to integrate custom kernels and reduce performance bottlenecks
Track advancements in accelerator hardware, compiler technology and AI model design to identify opportunities for kernel-level optimization
Develop clear documentation, APIs and supporting OSS components that improve developer usability and adoption
Analyze and resolve complex performance issues impacting large-scale distributed training and inference systems

Requirements

Bachelor’s degree or equivalent practical experience
12+ years of industry experience in software engineering or systems programming
5+ years of experience in software development using C++ or Python
3+ years of experience in testing, maintaining or launching software products and at least 1 year in software design or architecture
Hands-on experience in performance optimization at the kernel level for accelerators or high-performance systems

Nice to have

Proficiency in low-level accelerator programming (CUDA, Triton, Pallas)
Familiarity with ML frameworks such as JAX or PyTorch and optimization techniques for attention layers, Mixture of Experts (MoE) and precision tuning
Strong understanding of modern hardware accelerators, including pipelining, data movement and heterogeneous compute
Knowledge of compiler principles and intermediate representations (e.g., MLIR, OpenXLA)
Experience building OSS developer infrastructure, APIs and performance-critical libraries
Excellent problem-solving skills and ability to collaborate in cross-functional engineering environments

Benefits (Netherlands)

26 paid holiday days
Pension plan scheme
Disability insurance (WGA Shortfall insurance)
Long-term disability insurance (WIA Top up insurance)
EPAM Employee Stock Purchase Plan (ESPP)
Commuting cost reimbursement
Laptop, corporate SIM card and corporate mobile device (subject to certain eligibility requirements)
Bike lease
Employee Assistance Program
Corporate Programs including Employee Referral Program with rewards
Learning and development opportunities including in-house training and coaching, professional certifications, and courses

*All benefits and perks are subject to certain eligibility requirements
