We are a leading trading platform that is ambitiously expanding to the four corners of the globe. Our top-rated products have won prestigious industry awards for their cutting-edge technology and seamless client experience. We deliver only the best, so we are always in search of the best people to join our ever-growing team.

The Head of SRE and Infrastructure will play a critical role in shaping the reliability, scalability, and resilience of our infrastructure as we continue to grow globally. This is a senior technology leadership role - you will be responsible for an organisation of approximately 40 individuals spread across SRE, DevOps, DBA, developer experience and technical support teams. You will own the development and execution of our SRE and infrastructure strategy, and the build-out of reliable high-load systems. This role combines strategic leadership with deep technical understanding of modern cloud infrastructure, DevOps practices, observability, and operational excellence.

What you will do:
• Leadership and Strategy: Develop and execute the SRE and infrastructure strategy to support the organisation’s technology roadmap, product growth, and global expansion. Lead the continued evolution of existing DevOps and infrastructure capabilities into a mature SRE framework with documented SLOs, error budgets, and operating standards adopted across every engineering tribe.
• Cloud Infrastructure and Automation: Oversee the design, automation, and optimisation of our cloud infrastructure (AWS, Kubernetes/EKS, Terraform, Helm, infrastructure-as-code). Drive the migration of remaining on-premise workloads into the cloud and the build-out of a multi-cloud disaster recovery footprint with backup on on-premise servers.
• GitOps and Continuous Delivery Platform: Own the GitOps platform end-to-end. Consolidate the existing FluxCD estate, evaluate and execute the move to ArgoCD with progressive / canary delivery, and ensure secrets, image signing, environment promotion, and policy enforcement are uniform across all tribes.
• Platform Reliability and Resilience: Build and maintain a reliable, scalable platform for regulated, multi-jurisdiction trading. Define and enforce reliability standards (SLIs, SLOs, SLAs, error budgets). Own the firm-wide disaster recovery strategy, including recovery-site selection, RTO/RPO targets per service tier, regular DR drills with business and risk stakeholders, and the playbooks that turn DR from theory into a tested capability.
• Monitoring and Observability: Define and operate a single observability standard (metrics, logs, traces) that every engineering team consumes - including SLO instrumentation, golden signals, alerting hygiene, and on-call ergonomics. Make observability a product, not a side-effect of deployment.
• Incident Management and Continuous Improvement: Work closely with incident, problem management, engineering, and operations teams to improve incident response, post-incident analysis, and long-term prevention with clear escalation criteria, P0/P1 acknowledgement SLAs, change-quality gates inside CI/CD pipelines, and DR readiness with clear DORA metrics. Drive a learning culture that turns recurring incident themes into systemic prevention.
• Team Leadership and Development: Lead, hire, and develop SRE, DevOps, DBA, developer experience and technical support teams. Foster a strong engineering culture based on accountability, ownership, technical excellence, and continuous improvement.
• Cross-functional Collaboration: Partner with development, security, compliance, risk, release, and business teams to ensure infrastructure and reliability priorities are aligned with product delivery, client experience, and regulatory obligations across all our operating jurisdictions.

What you’ll bring to the role:
• Demonstrated experience as a Head of SRE, SRE Director, Infrastructure Director, Engineering Director, or similar senior leadership role in a major technology, fintech, or financial services company, or equivalent experience with high-load platform environments (low latency, high throughput, in-memory systems).
• A strong background in SRE, DevOps, infrastructure engineering, cloud platforms, and operating complex, high-availability systems.
• Hands-on technical understanding of modern infrastructure technologies, including AWS, Kubernetes, Terraform, FluxCD/ArgoCD, CI/CD tools, monitoring and alerting systems, and infrastructure-as-code practices.
• Deep understanding of SRE principles, including SLOs, SLIs, SLAs, error budgets, incident management, observability, automation, and resilience engineering.
• Experience as a manager of managers, with the ability to inspire people and hold them to account in equal measure, and in hiring, mentoring, performance management, and building strong engineering culture.
• A proven ability to work collaboratively with various teams and to adeptly discuss technical details with engineering teams as well as translate these details into actionable language for non-technical stakeholders.
• Strong analytical skills and the ability to use metrics and analytics to guide technical decisions and improvements.
• A pragmatic approach and the ability to prioritise outcomes over process when necessary to drive effective and actionable results.

What you will get in return:
• Competitive Salary: We believe great work deserves great pay! Your skills and talents will be rewarded with a salary that makes you feel valued and motivated.
• Work-Life Harmony: Join a company that genuinely cares about you - because your life outside of work matters just as much as your time on the clock.
• Generous Time Off: Need a breather? Our annual leave policy lets you recharge and enjoy life outside of work without a worry.
• Employee Referral Program: Love working here? Share the love! Bring your talented friends on board and get rewarded for growing our awesome team.
• Comprehensive Health & Pension Benefits: From medical insurance to pension plans, we’ve got your back. Plus, location-specific benefits and perks!
• Workation Wonderland: Live your digital nomad dreams with 30 extra days to work remotely from anywhere in the world (some restrictions apply). Adventure awaits!
• Volunteer Days: Make a difference! Take two additional paid days each year to support causes you care about and give back to the community.

Be a key player at the forefront of the digital assets movement, propelling your career to new heights! Join a dynamic and rapidly expanding company that values and rewards talent, initiative, and creativity. Work alongside one of the most brilliant teams in the industry.

Our company has an Internal Reporting Procedure. It is available from the Human Resources Department upon request. You may report a violation referred to in the Procedure under the terms specified therein.

#LI-Hybrid

Head of SRE and Infrastructure at capital
