1. SRE Lead Tools Engineer
Location: Remote
Experience: 7–10 Years
Role Overview
We are looking for an experienced SRE Lead Tools Engineer with strong expertise in observability, reliability engineering, and monitoring tools such as Splunk and Dynatrace.
Key Responsibilities
* Apply SRE principles including SLIs, SLOs, SLAs, error budgets, and incident management
* Work with SRE L2/L3 teams to improve reliability, reduce MTTR, and strengthen monitoring
* Act as the SME for Splunk and Dynatrace, recommending the right monitoring approach
* Design dashboards, alerts, log onboarding, and monitoring strategies
* Configure Dynatrace for APM, RUM, synthetic monitoring, and root cause analysis
* Implement observability solutions across Azure environments
* Work with Azure App Services, AKS, Functions, Log Analytics, and Application Insights
* Build automation scripts using Python, PowerShell, or Bash
* Integrate observability tools with platforms like ServiceNow and Jira
* Conduct workshops, training, and advisory sessions with engineering teams
Required Skills
* 5+ years in SRE, DevOps, or Observability Engineering
* Strong understanding of SRE principles and incident management
* Hands-on experience with Splunk (SPL, dashboards, alerts, log ingestion)
* Strong expertise in Dynatrace (APM, RUM, synthetic monitoring)
* Strong Microsoft Azure experience
* Scripting experience in Python / PowerShell / Bash
* Excellent analytical and stakeholder management skills
Preferred Skills
* Experience in retail / eCommerce environments
* Knowledge of microservices and distributed systems
* Experience with AKS, Docker, and containers
* Exposure to Prometheus, Grafana, or ELK
* Splunk / Dynatrace / Azure certifications