Grid Dynamics is looking for a Network Operations Center Engineer to join a team within a large-scale retail tech environment. The team is responsible for the monitoring, alerting, and operational health of the infrastructure and applications that support thousands of stores and millions of online transactions globally.
Responsibilities:
- Monitor infrastructure, applications and cloud services
- Triage and escalate incidents
- Contribute to infrastructure provisioning using Terraform
- Build and maintain Grafana dashboards and alerts
- Configure alerting in New Relic and PagerDuty
- Maintain and improve runbooks and operational procedures
Requirements (Must-Have):
- Hands-on experience with Terraform for infrastructure provisioning and IaC
- Practical experience with Grafana: dashboards, alerting, and monitoring
- Experience with New Relic for APM, observability, and incident investigation
- Experience with PagerDuty for on-call alerting and incident management
- Scripting experience in Bash for operational tasks and automation
Nice-to-Have:
- Programming experience in Python and/or Java
- Working knowledge of SQL for troubleshooting and data analysis
- Experience with Splunk for log aggregation and alerting
- Familiarity with Quantum Metric, GCP Cloud Monitoring, Atlassian Bamboo or Functionize
We offer:
- Opportunity to work on bleeding-edge projects
- Work with a highly motivated and dedicated team
- Competitive salary
- Flexible schedule
- Benefits package, including medical insurance and sports
- Corporate social events
- Professional development opportunities
- Well-equipped office