Company: Bloxstaking
Position: Senior Site Reliability Engineer
Responsibilities:
- Design and implement infrastructure and tools that empower product teams to rapidly and securely iterate, emphasizing reliability and automation.
- Influence the strategic direction of infrastructure and operational practices, ensuring the organization is well-positioned to scale and support growth.
- Take a proactive role in the resolution of production issues, ensuring preparedness to handle incidents and enabling blameless learning from them.
- Work closely with product teams on production deployments, release management, and incident handling, aiming for seamless operations.
- Offer technical expertise and input to support continual adoption and modernization of the platform and infrastructure.
- Build and deploy AI-powered tooling (autonomous coding agents, LLM-assisted CI/CD, automated incident triage) that makes the engineering organization more productive, including sandboxed environments where agents can write, test, and verify code.
- Foster a culture of continuous learning and improvement, encouraging constructive review and adaptation processes.
Your Experience & Qualifications:
- Kubernetes expertise, with a strong understanding of its core concepts and the ability to manage and maintain clusters.
- Expertise with modern cloud-native tools, e.g. ArgoCD for GitOps, Terraform/Crossplane for IaC, and the Grafana LGTM stack (Loki, Grafana, Tempo, Mimir) for observability.
- 3-5 years of experience with Infrastructure as Code and cloud provisioning tools (must).
- 3-5 years of development and scripting experience in languages such as Go or Python (must).
- Proficiency in written and spoken English, with excellent communication skills.
- Expertise with Linux environments, containerization, and cloud technologies.
- Comprehensive knowledge of production management concepts for distributed systems.
- 3-5 years in operational roles, overseeing production environments.
- AI fluency, including daily use of AI coding tools and the ability to build and deploy LLM-powered developer tooling and autonomous agents.
- Networking knowledge; experience with service mesh, platform engineering, or cross-cloud networking is a bonus.
- Familiarity with the Ethereum ecosystem, staking, and blockchain technologies (advantage).