Overview

Our client is developing an AI-powered analytics layer on top of a network security data platform.

As a Lead Data Engineer on this engagement, you will design and deliver the core data infrastructure aimed at deriving security policies from operational technology network logs.

You will work with diverse, complex data exports from industrial security platforms, such as network monitoring systems, and transform them into a well-structured DynamoDB data model. Once the data foundation is in place, you will connect it to Amazon QuickSight to deliver interactive BI dashboards that give stakeholders immediate visibility into network behavior, security events, and policy gaps.

Responsibilities

  • Design and configure DynamoDB tables to serve as the central repository for network and security log data
  • Establish table layouts, partition keys, sort keys, and Global Secondary Indexes to support large-scale, time-series event data
  • Optimize database performance and control costs through DynamoDB features such as TTL and appropriate capacity modes, and by implementing data archiving solutions
  • Construct reliable ETL pipelines to import data from Excel files, CSVs, and API sources from network monitoring systems
  • Clean, validate, standardize, and enrich security-event data, addressing schema inconsistencies and missing fields
  • Set up automated orchestration, scheduling, and error handling to guarantee consistent pipeline operation and timely data availability
  • Integrate DynamoDB with Amazon QuickSight to build interactive dashboards and reports highlighting security events and network activity
  • Create and update QuickSight datasets, calculated fields, and visualizations, refining dashboard designs based on stakeholder input
  • Work alongside cybersecurity specialists, AI/Analytics Engineers, AI Architects, and other stakeholders to convert business needs into data architecture solutions
  • Prepare comprehensive technical documentation including schema layouts, data dictionaries, pipeline details, and operational guides

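The table-design work described above can be sketched in Python. This is a minimal illustration only: the table name, attribute names, and index layout ("SecurityEvents", "device_id", "event_ts", "event_type") are hypothetical assumptions, not the client's actual schema.

```python
# Illustrative DynamoDB table definition for time-series security events.
# All names here are example assumptions, not the real schema.
security_events_table = {
    "TableName": "SecurityEvents",
    "AttributeDefinitions": [
        {"AttributeName": "device_id", "AttributeType": "S"},   # partition key
        {"AttributeName": "event_ts", "AttributeType": "S"},    # sort key (ISO-8601 string)
        {"AttributeName": "event_type", "AttributeType": "S"},  # GSI partition key
    ],
    "KeySchema": [
        {"AttributeName": "device_id", "KeyType": "HASH"},
        {"AttributeName": "event_ts", "KeyType": "RANGE"},
    ],
    "GlobalSecondaryIndexes": [
        {
            # Supports "all events of type X in a time window" queries.
            "IndexName": "event_type-event_ts-index",
            "KeySchema": [
                {"AttributeName": "event_type", "KeyType": "HASH"},
                {"AttributeName": "event_ts", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity for spiky log volume
}

# With AWS credentials configured, this dict could be passed to boto3:
# import boto3
# boto3.client("dynamodb").create_table(**security_events_table)
```

Keying on device and timestamp spreads writes across partitions while keeping each device's events queryable in time order; the GSI covers the orthogonal access pattern by event type.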
Requirements

  • At least 5 years of experience in data engineering positions
  • Minimum one year of experience leading and managing development teams
  • In-depth expertise with DynamoDB, including table design, partition and sort key strategies, GSIs, capacity planning, Streams, and NoSQL modeling for high-volume workloads
  • Demonstrated ability to build robust ETL/ELT pipelines in Python using tools like boto3 and pandas, capable of processing diverse, multi-format source data and loading it into DynamoDB
  • Skilled in AWS services such as S3, Lambda, IAM, and CloudWatch, and their integration with DynamoDB and QuickSight
  • Experience using Amazon QuickSight to develop datasets, analyses, and interactive dashboards, connecting to AWS data sources and creating visualizations for a broad audience
  • Proficient in NoSQL data modeling, designing denormalized, query-driven structures for key-value and document-oriented data
  • Strong foundation in data quality engineering, including validation frameworks, data contracts, and automated pipeline testing
  • Advanced Python skills for writing maintainable ETL code, with familiarity in virtual environments, testing, and version control using Git
  • Ability to clearly document technical decisions and work collaboratively with non-technical stakeholders
  • Excellent written and spoken English communication skills (B2 level or higher)
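The ETL and data-quality expectations above can be illustrated with a small pandas sketch. The column names ("src_ip", "event_ts", "severity") and cleaning rules are hypothetical examples, not the client's real pipeline.

```python
# Sketch of a validation/normalization step for security-event rows before
# loading into DynamoDB. Column names and rules are illustrative assumptions.
import pandas as pd

def normalize_events(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize timestamps, fill missing severities, drop invalid rows."""
    out = df.copy()
    # Parse mixed timestamp formats to UTC; unparseable values become NaT.
    out["event_ts"] = pd.to_datetime(out["event_ts"], errors="coerce", utc=True)
    # Rows missing a source IP or timestamp cannot be keyed in DynamoDB.
    out = out.dropna(subset=["src_ip", "event_ts"])
    # Default a missing severity instead of failing the whole batch.
    out["severity"] = out["severity"].fillna("unknown").str.lower()
    # ISO-8601 strings sort lexicographically, matching a DynamoDB sort key.
    out["event_ts"] = out["event_ts"].dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    return out

raw = pd.DataFrame(
    {
        "src_ip": ["10.0.0.5", None, "10.0.0.7"],
        "event_ts": ["2024-01-02 03:04:05", "2024-01-02 03:05:00", "not-a-date"],
        "severity": ["HIGH", "low", None],
    }
)
clean = normalize_events(raw)
# Only the first row survives: row 2 lacks src_ip, row 3 has a bad timestamp.
```

A loader step would then write `clean` to DynamoDB in batches, e.g. via `boto3.resource("dynamodb").Table(...).batch_writer()`.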

Nice to have

  • Experience with AWS Glue and Athena for serverless ETL and querying S3-based data lakes
  • Knowledge of graph databases like Neo4j or Amazon Neptune and modeling data as nodes and relationships for network topology and policy representation
  • Familiarity with CI/CD tools such as GitHub Actions or GitLab CI for automated pipeline testing and deployment
  • Previous work with network-security logs, cybersecurity analytics, or security-focused environments
  • Understanding of process mining, including event log structures and their use in process-mining algorithms
  • Experience with AI-assisted development tools or methodologies

Benefits

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn