As a Senior Platform Engineer, you play a critical role in maintaining and evolving our enterprise integration platforms. This role blends engineering and operations, ensuring our platforms are not only reliable today but improved for tomorrow. You will focus heavily on proactive engineering work, such as automating platform operations, improving observability, designing for resilience, and optimizing performance, while also owning key operational responsibilities like lifecycle management, stability assurance, and deep-dive troubleshooting. Through both preventative work and targeted operational excellence, you will ensure our Kafka, Solace PubSub+, and API Management platforms remain trusted, resilient, and ready for future growth.
 
Key Responsibilities:
Platform Architecture & Infrastructure Leadership 
• Lead the implementation and optimization of complex platform infrastructure, ensuring high availability, fault tolerance, and compliance with architectural standards.
• Maintain scalable platform ecosystems using advanced cloud capabilities including container orchestration (e.g., Kubernetes), serverless computing, and distributed systems.
• Drive continuous evolution of integration platforms to meet emerging business and technology needs.
Integration Platform Operations & Reliability (Kafka, Solace PubSub+, API Management)
• Operate, maintain, and enhance enterprise-grade integration platforms, including Apache Kafka, Solace PubSub+, and API management platforms (e.g., Azure API Management).
• Lead platform lifecycle management for at least one of these technologies, covering upgrades, scaling, patching, performance optimization, HA configuration, and automation.
• Own production reliability for integration platforms, ensuring stability, observability, and proactive risk mitigation.
• Troubleshoot complex platform issues, oversee incident resolution, and drive long-term corrective actions.
Cloud Engineering & Automation
• Implement advanced cloud services and Infrastructure-as-Code (IaC) using technologies such as ARM, Bicep, Terraform, and CI/CD automation.
• Automate large-scale operational processes to reduce manual work, improve reliability, and enforce consistency.
• Identify and implement new cloud-based technologies, optimizing platform performance and resilience.
Platform Operations, Observability & Reliability Engineering
• Ensure the stability, health, and performance of enterprise integration platforms across multiple environments.
• Design end-to-end observability through dashboards, monitoring, alerts, and tracing frameworks (OpenTelemetry, Grafana, Splunk, Azure Monitor).
• Lead complex incident and problem management processes, including root cause analysis and long-term corrective action.
Security, Risk & Compliance
• Define and enforce platform security best practices across the full platform lifecycle.
• Implement vulnerability management programs, secure configuration baselines, and continuous compliance monitoring.
• Lead risk assessments and security reviews, and ensure compliance with organizational and regulatory standards.
Performance Optimization & High Availability
• Fine-tune platform performance including compute, networking, data flow, database interactions, and capacity planning.
• Design and maintain high-availability architectures and advanced failover strategies.
• Lead Disaster Recovery planning, execution, validation, and continuous improvement.
Technical Leadership & Collaboration
• Mentor junior and mid-level engineers, providing technical guidance.
• Act as a cross-team technical integrator, collaborating with platform developers, architects, security teams, and other stakeholders.
• Influence platform standards, engineering best practices, and long-term platform strategy.
Required Technical Skills:
• 7+ years in platform engineering, SRE, DevOps, or cloud infrastructure roles.
• Mandatory hands-on experience with at least one enterprise integration platform: Apache Kafka, Solace PubSub+, or API Management (Azure API Management preferred).
• Expert knowledge of cloud platform services (Azure preferred) including containers, serverless, identity, networking, and advanced workload design.
• Strong background in automation, scripting (Python, Bash, PowerShell, or similar), and IaC.
• In-depth knowledge of cloud security, compliance, and governance frameworks.
• Proven experience leading complex platform initiatives and mentoring engineering teams.
• Hands-on experience with observability, reliability engineering, and performance tuning.
Key Soft Skills:
• Communicates clearly and effectively across technical and non-technical audiences.
• Demonstrates leadership, ownership, adaptability, and strong analytical skills.
• Embraces innovation, continuous learning, and improvement.
• Acts collaboratively and fosters a strong engineering culture.