Overview

We are looking for a detail-oriented Data Quality Engineer with strong experience in data validation, SQL-based testing and cloud data environments. The ideal candidate is passionate about data quality, automation and analytical problem-solving, and is comfortable working with complex datasets and large-scale data pipelines. This role involves validating data transformations, ensuring accuracy across multiple systems and supporting migration and deployment activities within the Data Exchange ecosystem. The candidate should demonstrate strong analytical thinking, attention to detail and proactive communication while collaborating with distributed teams.

Responsibilities

  • Perform QA validation for SDAP Bulk data products within the Data Exchange ecosystem
  • Support CMAS (Match and Append) validation and ensure correct deployment into the Data Exchange pipeline
  • Provide QA support during the Bobsled-to-Sledhouse migration, ensuring data integrity and functional correctness
  • Validate Sledhouse data products and data fulfillment processes to ensure accuracy and completeness
  • Execute data validation and comparison across multiple systems, including checks of source input files against BigQuery tables, BigQuery table-to-table comparisons and verification of data mappings and transformations against specifications
  • Use and extend the PySpark-based Core Quality Check framework for automated data validation
  • Conduct data analysis and defect investigation, identifying root causes of issues in data pipelines or transformation logic
  • Collaborate with engineering and data teams to triage defects, validate fixes and ensure production readiness
  • Contribute to automation efforts for data validation and testing to improve efficiency and coverage
  • Communicate findings, risks and testing results clearly to stakeholders
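The cross-system comparison work described above (source files vs. BigQuery tables, table-to-table checks) typically boils down to a keyed row-level diff. A minimal sketch of that idea in plain Python follows; this is illustrative only, not the team's actual Core Quality Check framework, and the function name `compare_tables` and the sample rows are invented for the example.

```python
def compare_tables(source_rows, target_rows, key="id"):
    """Diff two row sets keyed on a primary key.

    Returns keys missing on either side plus field-level mismatches
    for rows present in both.
    """
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}
    report = {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "missing_in_source": sorted(tgt.keys() - src.keys()),
        "field_mismatches": [],
    }
    # For keys on both sides, compare column by column.
    for k in src.keys() & tgt.keys():
        for col, src_val in src[k].items():
            tgt_val = tgt[k].get(col)
            if src_val != tgt_val:
                report["field_mismatches"].append((k, col, src_val, tgt_val))
    return report

# Hypothetical sample data standing in for a source file and a BigQuery table.
source = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]
target = [{"id": 1, "amount": 100}, {"id": 3, "amount": 75}]
result = compare_tables(source, target)
```

In a real pipeline the same logic would run over PySpark DataFrames rather than Python dicts, so that comparisons scale to large tables.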

Requirements

  • 2+ years of experience in Data Quality Engineering
  • Strong knowledge of SQL, including working with complex joins, multiple tables and large datasets
  • Experience working with BigQuery or similar analytical databases
  • Familiarity with Google Cloud Platform (GCP) services
  • Expertise in data validation and automated data comparison techniques
  • Understanding of data transformation, validation and mapping verification against specifications
  • Capability to understand and work with PySpark-based quality frameworks
  • Strong data analysis and debugging skills, with the ability to identify defects in data processing pipelines
  • Excellent communication skills, both written and verbal
  • Upper-Intermediate English language proficiency (B2)
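One common SQL pattern behind the table-to-table comparisons this role calls for is a symmetric EXCEPT: rows in one table but not the other, in both directions. A minimal sketch follows, using sqlite3 purely as a local stand-in for BigQuery; the table names `staging_orders` and `prod_orders` and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (id INTEGER, amount INTEGER);
    CREATE TABLE prod_orders (id INTEGER, amount INTEGER);
    INSERT INTO staging_orders VALUES (1, 100), (2, 250);
    INSERT INTO prod_orders VALUES (1, 100), (2, 999);
""")

# Rows present in staging but not in prod, and vice versa, labeled by side.
diff_sql = """
    SELECT 'staging_only' AS side, * FROM (
        SELECT id, amount FROM staging_orders
        EXCEPT
        SELECT id, amount FROM prod_orders)
    UNION ALL
    SELECT 'prod_only' AS side, * FROM (
        SELECT id, amount FROM prod_orders
        EXCEPT
        SELECT id, amount FROM staging_orders)
"""
diff_rows = conn.execute(diff_sql).fetchall()
```

On BigQuery the equivalent query would use `EXCEPT DISTINCT`; the two-sided structure is the same.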

Nice to have

  • Proficiency in Python or PySpark development
  • Familiarity with AWS cloud services
  • Background in data engineering or data pipeline testing environments

Benefits

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn