As an Analytics Engineer at Salmon, you will play a pivotal role in data modeling and transformation across the Databricks silver and gold layers. You will work closely with Data Scientists, Engineers, and Business System Analysts to ensure that datasets align with business needs.
Key responsibilities
Data Modeling & Transformation
Design, build, and maintain scalable data models in Databricks silver (curated data) and gold (business-ready data) layers.
Define clear data contracts between silver and gold to ensure consistency and reliability.
Apply best practices for dimensional modeling (star/snowflake schemas) to support analytics and reporting.
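To illustrate the kind of dimensional model this work involves, here is a minimal star-schema sketch: one fact table joined to two dimensions, with a typical gold-layer rollup. Table and column names (dim_customer, dim_date, fact_orders) are illustrative, not Salmon's actual model; sqlite3 stands in for Databricks SQL.

```python
import sqlite3

# Star schema: a central fact table keyed to surrogate keys in dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_name TEXT,
    segment TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- e.g. 20240115
    full_date TEXT,
    month TEXT
);
CREATE TABLE fact_orders (
    order_id INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_customer VALUES (1, 'Acme', 'SMB'), (2, 'Globex', 'Enterprise');
INSERT INTO dim_date VALUES
    (20240115, '2024-01-15', '2024-01'),
    (20240116, '2024-01-16', '2024-01');
INSERT INTO fact_orders VALUES
    (100, 1, 20240115, 250.0),
    (101, 2, 20240115, 900.0),
    (102, 1, 20240116, 150.0);
""")

# Typical gold-layer aggregate: revenue by segment and month.
rows = conn.execute("""
    SELECT c.segment, d.month, SUM(f.amount) AS revenue
    FROM fact_orders f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    JOIN dim_date d     ON d.date_key = f.date_key
    GROUP BY c.segment, d.month
    ORDER BY c.segment
""").fetchall()
print(rows)  # [('Enterprise', '2024-01', 900.0), ('SMB', '2024-01', 400.0)]
```

The same shape scales to a snowflake schema by normalizing the dimensions further.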
Collaboration & Best Practices
Partner with data scientists, platform engineers, and business analysts to ensure gold datasets meet business needs.
Follow software engineering practices — version control (Git), CI/CD for data pipelines, code reviews, and testing.
Contribute to the development of a shared analytics engineering framework (naming standards, reusable templates, testing frameworks).
ETL/ELT Development
Develop and optimize transformation pipelines (PySpark/SQL/Delta Live Tables/Databricks Workflows) to process data from bronze → silver → gold.
Implement incremental data processing strategies to minimize compute cost and improve pipeline performance.
Ensure data quality checks (validations, anomaly detection, deduplication, SCD handling, etc.) are built into transformations.
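As a sketch of the SCD handling and incremental logic mentioned above, here is a Type 2 merge in plain Python: unchanged rows are skipped, changed rows are closed out, and a new current version is appended. On Databricks this would typically be a Delta `MERGE INTO`; the column names and `is_current` flag here are illustrative assumptions.

```python
from datetime import date

def scd2_merge(dim_rows, updates, as_of):
    """SCD Type 2: close out changed rows and append new current versions."""
    current = {r["id"]: r for r in dim_rows if r["is_current"]}
    out = list(dim_rows)
    for upd in updates:
        prev = current.get(upd["id"])
        if prev is not None and prev["value"] == upd["value"]:
            continue  # unchanged: skip (the incremental-processing part)
        if prev is not None:
            prev["is_current"] = False
            prev["valid_to"] = as_of  # close the old version
        out.append({"id": upd["id"], "value": upd["value"],
                    "valid_from": as_of, "valid_to": None, "is_current": True})
    return out

dim = [{"id": 1, "value": "SMB", "valid_from": date(2024, 1, 1),
        "valid_to": None, "is_current": True}]
dim = scd2_merge(dim, [{"id": 1, "value": "Enterprise"}], date(2024, 6, 1))
print([(r["value"], r["is_current"]) for r in dim])
# [('SMB', False), ('Enterprise', True)]
```

History is preserved: the old row stays queryable via its validity window, while downstream joins filter on `is_current`.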
Data Quality & Governance
Establish and maintain data quality metrics (completeness, accuracy, timeliness) for silver and gold tables.
Apply data governance standards — consistent naming conventions, documentation, and tagging across datasets.
Collaborate with data platform engineers to enforce lineage and observability.
Business Enablement
Work closely with analysts and business stakeholders to understand requirements and translate them into gold-layer datasets.
Build reusable, business-friendly datasets that power dashboards, self-service BI tools, and advanced analytics.
Maintain documentation (data dictionaries, transformation logic, lineage diagrams).
Performance & Optimization
Optimize Databricks SQL queries and Delta Lake performance (Z-ordering, clustering, partitioning).
Monitor and tune workloads to control compute spend on silver and gold pipelines.
Implement best practices for caching, indexing, and incremental updates.
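For context, partitioning and Z-ordering on Delta Lake look roughly like the following. The table and column names are hypothetical; `PARTITIONED BY` and `OPTIMIZE ... ZORDER BY` are standard Databricks SQL.

```sql
-- Illustrative gold table: partition on a low-cardinality column,
-- Z-order on columns frequently used in filters and joins.
CREATE TABLE IF NOT EXISTS gold.fact_orders (
    order_id     BIGINT,
    customer_key BIGINT,
    order_date   DATE,
    amount       DECIMAL(18, 2)
)
USING DELTA
PARTITIONED BY (order_date);

-- Compact small files and co-locate related rows to cut scan cost.
OPTIMIZE gold.fact_orders ZORDER BY (customer_key);
```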
Requirements and expectations
Strong SQL expertise
Ability to write complex, performant queries (CTEs, window functions, joins)
Experience optimizing queries on large datasets
Strong understanding of analytical SQL patterns
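A concrete example of the analytical SQL patterns in scope: a CTE plus a window function to pick each customer's latest order. Table and column names are illustrative; sqlite3 is used here only so the query runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER,
                     ordered_at TEXT, amount REAL);
INSERT INTO orders VALUES
    (1, 10, '2024-01-01', 50.0),
    (2, 10, '2024-02-01', 75.0),
    (3, 20, '2024-01-15', 20.0);
""")

# CTE + ROW_NUMBER(): rank orders per customer, keep the most recent.
latest = conn.execute("""
    WITH ranked AS (
        SELECT
            customer_id,
            amount,
            ROW_NUMBER() OVER (
                PARTITION BY customer_id
                ORDER BY ordered_at DESC
            ) AS rn
        FROM orders
    )
    SELECT customer_id, amount FROM ranked WHERE rn = 1
    ORDER BY customer_id
""").fetchall()
print(latest)  # [(10, 75.0), (20, 20.0)]
```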
Hands-on experience with dbt
Building and maintaining dbt models (staging, intermediate, marts)
Writing reusable macros and Jinja templates
Implementing tests, documentation, and exposures
Working with dbt version control and CI workflows
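By way of illustration, a dbt mart model of the kind described might look like this: an incremental materialization with a Jinja guard so only new rows are processed on each run. Model and column names are hypothetical; `config`, `ref`, `is_incremental`, and `this` are standard dbt constructs.

```sql
-- models/marts/fct_orders.sql (illustrative)
{{ config(materialized='incremental', unique_key='order_id') }}

SELECT
    order_id,
    customer_id,
    order_total,
    updated_at
FROM {{ ref('stg_orders') }}
{% if is_incremental() %}
  -- on incremental runs, only pick up rows newer than what's already built
  WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```

Paired with a schema file declaring tests and documentation:

```yaml
# models/marts/schema.yml (illustrative)
models:
  - name: fct_orders
    description: "One row per order, business-ready."
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
```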
Data Modeling expertise
Strong understanding of dimensional modeling (facts, dimensions, star schemas)
Ability to translate business requirements into scalable data models
Designing metrics and semantic layers for analytics and BI
Experience maintaining a single source of truth for business metrics
Analytics Engineering mindset
Strong focus on data quality, reliability, and consistency
Experience working closely with analysts and business stakeholders
Ability to balance technical best practices with business needs
Production-ready analytics
Experience with data testing, monitoring, and debugging
Familiarity with ELT pipelines and modern data stack concepts
Comfortable working in Git-based workflows
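As a sketch of the data testing and monitoring this role involves, here is a lightweight quality check (uniqueness and completeness) that could run after a pipeline stage. Column names are illustrative; on Databricks this would more likely be expressed as dbt tests or Delta Live Tables expectations.

```python
def check_quality(rows, key="order_id", required=("order_id", "amount")):
    """Return a dict of failed checks for a batch of row dicts."""
    failures = {}
    keys = [r.get(key) for r in rows]
    if len(keys) != len(set(keys)):
        failures["duplicate_keys"] = True  # uniqueness check on the key
    for col in required:
        if any(r.get(col) is None for r in rows):
            failures.setdefault("null_columns", []).append(col)  # completeness
    return failures

good = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 5.0}]
bad = [{"order_id": 1, "amount": None}, {"order_id": 1, "amount": 5.0}]
print(check_quality(good))  # {}
print(check_quality(bad))   # {'duplicate_keys': True, 'null_columns': ['amount']}
```

A pipeline would typically fail or quarantine the batch when the returned dict is non-empty.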