Overview

We are seeking a remote Senior Data Software Engineer to join our team.

This position requires a highly skilled professional with expertise in Apache Spark, Microsoft Azure, and Python. You will work on a high-impact project for a globally recognized brand, contributing to the development of innovative data solutions that support business growth and success.

Responsibilities

  • Collaborate with cross-functional teams to design, implement, and maintain data integration systems
  • Build and optimize data pipelines leveraging Apache Spark, Microsoft Azure, and Python
  • Ensure data pipelines are scalable, reliable, and maintainable
  • Convert data science models into production-ready applications
  • Develop and support forecasting models to inform strategic business decisions
  • Maintain data quality and consistency across all data sources
  • Monitor and enhance data pipelines to ensure efficient data processing
  • Work with data scientists to create and deploy machine learning models
  • Continuously refine and improve data integration solutions to meet changing business requirements
  • Stay informed about emerging trends and technologies in data software engineering

Requirements

  • A minimum of 3 years of experience in data software engineering or a related role
  • Advanced proficiency in Apache Spark, Microsoft Azure, and Python
  • Deep understanding of forecasting models, MLOps practices, and data science concepts
  • Hands-on experience with Databricks for building and optimizing data pipelines
  • Proven ability to deploy data science models in production environments
  • Proficiency in using Git for version control
  • Strong knowledge of Azure concepts, including cloud environments, regions, Azure Data Lake Storage (ADLS), and compute services
  • Exceptional analytical and problem-solving skills, with the ability to think critically and innovate solutions
  • Experience working in Agile development environments
  • Strong written and verbal English communication skills at a B2+ level

Nice to have

  • Experience with pandas for data manipulation and analysis
  • In-depth knowledge of SQL and relational database systems
  • Understanding of statistical models and experience developing them using Python or Spark
  • Familiarity with machine learning tools and technologies

Benefits

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn