We are looking for a highly skilled Senior ML / Data Pipeline Engineer who can translate complex machine learning and multimodal concepts into scalable, production-ready data pipelines and workflows.

Requirements

  • 5+ years of experience in data engineering, ML pipelines, or distributed systems.
  • Strong experience building scalable data pipelines for large datasets (video/audio preferred).
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP).
  • Experience working with GPU-based environments and distributed computing.
  • Strong programming skills in Python, Scala, or similar languages.
  • Experience with data processing frameworks (Spark, Ray, Kafka, Airflow, or similar).
  • Understanding of ML workflows, training pipelines, and inference systems.
  • Experience designing fault-tolerant, high-availability systems.
  • Strong knowledge of data storage systems (data lakes, object storage, distributed file systems).
  • Proven ability to handle high-throughput, large-scale data ingestion and processing.