Copperco is hiring a Principal Site Reliability Engineer.
Key Responsibilities:
- Shape SRE: define reliability and observability standards, including SLIs, SLOs and error budgets, and drive the adoption of SRE principles across the organization.
- Scale through automation: champion architectural improvements to enhance reliability and deployment velocity; build reusable platforms and frameworks; plan capacity and conduct production readiness reviews.
- Drive technical excellence: improve the lifecycle of microservices from inception through deployment, operation, observability and continuous refinement.
- Lead through influence: partner with engineering and product leadership to embed reliability, conduct blameless postmortems, drive systemic improvements in incident management and mentor engineers.
Essential skills & experience:
- Designing, analysing and troubleshooting distributed systems or microservices architectures
- Established expertise in observability and incident management
- Proven experience in driving organizational change
- Excellent communication skills and a systematic problem-solving approach
Desirable:
- Experience with production workloads in AWS
- Experience in financial services or regulated environments
- Interest in blockchain-based technologies and/or decentralised finance
- Master's degree in Computer Science or Engineering