Grid Dynamics is looking for a Network Operations Center (NOC) Engineer to join a team in a large-scale retail technology environment. The team is responsible for the monitoring, alerting, and operational health of the infrastructure and applications that support thousands of stores and millions of online transactions globally.

Responsibilities:

  • Monitor infrastructure, applications and cloud services
  • Triage and escalate incidents
  • Contribute to infrastructure provisioning using Terraform
  • Build and maintain Grafana dashboards and alerts
  • Configure alerting in New Relic and PagerDuty
  • Maintain and improve runbooks and operational procedures

Requirements — Must-Have:

  • Hands-on experience with Terraform for infrastructure provisioning and IaC
  • Practical experience with Grafana — dashboards, alerting and monitoring
  • Experience with New Relic for APM, observability, and incident investigation
  • Experience with PagerDuty for on-call alerting and incident management
  • Scripting experience in Bash for operational tasks and automation

Nice-to-Have:

  • Programming or scripting experience in Python and/or Java
  • Working knowledge of SQL for troubleshooting and data analysis
  • Experience with Splunk for log aggregation and alerting
  • Familiarity with Quantum Metric, GCP Cloud Monitoring, Atlassian Bamboo or Functionize

We offer:

  • Opportunity to work on bleeding-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Benefits package — medical insurance, sports
  • Corporate social events
  • Professional development opportunities
  • Well-equipped office