About Nscale
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you'll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you'll be contributing to building the technology that powers the future.
About the Role
We're hiring an Infrastructure Engineer (OpenStack Ironic Specialist) to design, operate, and continuously improve the bare metal provisioning platforms that underpin Nscale's infrastructure.
This role sits within the Infrastructure Engineering team, which is responsible for the design, implementation, operation, and ongoing improvement of the infrastructure stack supporting both internal and customer-facing services. You'll work closely with network, compute, data centre, support, and pre-sales teams, while also serving as a specialist escalation point for advanced provisioning and hardware issues.
This is a high-impact role focused on OpenStack Ironic, automated hardware lifecycle management, and the reliable operation of large-scale physical infrastructure. You'll also help connect Nscale to the broader upstream OpenStack community, ensuring our bare metal platforms evolve in line with real operational needs and industry direction.
What you'll be doing
Bare Metal Provisioning & Lifecycle Management
• Design scalable and resilient bare metal provisioning platforms with a strong focus on OpenStack Ironic.
• Own the full lifecycle of physical infrastructure, including discovery, enrolment, provisioning, cleaning, deprovisioning, and hardware state management.
• Build and maintain provisioning workflows for a wide range of hardware profiles, including GPU-enabled and high-performance server platforms.
• Support platform upgrades, lifecycle management, and operational improvements across Ironic and its dependencies.
Automation & Platform Integration
• Manage and improve integrations between Ironic and related OpenStack services such as Nova, Neutron, Glance, Keystone, and Placement.
• Drive automation for hardware onboarding, firmware and BIOS configuration, deployment workflows, validation, and recovery.
• Implement infrastructure automation using infrastructure-as-code and configuration management approaches.
• Ensure provisioning platforms and operational processes align with security, compliance, and operational standards.
Troubleshooting, Reliability & Operational Support
• Troubleshoot complex issues across provisioning pipelines, PXE/iPXE, BMC interfaces, out-of-band management, image deployment, network boot, and hardware compatibility.
• Act as a 3rd/4th line escalation point for advanced bare metal and provisioning incidents.
• Perform root cause analysis and implement long-term fixes to improve platform reliability and repeatability.
• Participate in on-call rotations and incident response activities for critical infrastructure services.
Cross-Functional Collaboration & Community Engagement
• Collaborate with network, compute, data centre, and support teams to deliver reliable physical infrastructure services.
• Contribute specialist input to infrastructure roadmap planning, capacity expansion, standard builds, and hardware platform qualification.
• Support pre-sales and solution design efforts with expert guidance on bare metal capabilities, operational models, and deployment constraints.
• Contribute to upstream OpenStack bare metal communities through bug reports, testing, reviews, design discussions, and code contributions where appropriate.
• Track upstream roadmaps and release changes to help shape Nscale's bare metal strategy, upgrade planning, and platform standards.
KPIs
• Automated provisioning and hardware onboarding reliability
• Bare metal incident resolution and root cause closure
• Platform upgrade and lifecycle delivery across Ironic dependencies
• Upstream OpenStack Ironic community contribution and adoption alignment
About You
• Strong experience operating Linux systems and troubleshooting production infrastructure
• Strong specialist knowledge of OpenStack Ironic and the surrounding provisioning ecosystem
• Strong understanding of bare metal provisioning concepts including PXE/iPXE, DHCP, TFTP/HTTP boot, BMC technologies, RAID configuration, firmware management, disk imaging, and node lifecycle states
• Strong experience with out-of-band management technologies such as Redfish, IPMI, or vendor management interfaces
• Strong experience designing and building automation for physical and virtual infrastructure using tools such as Ansible
• Strong scripting skills in Python and Bash
• Experience troubleshooting complex provisioning and hardware integration issues across server, network, and management layers
• Experience operating infrastructure at scale with a focus on reliability, repeatability, and operational safety
• Ability to collaborate across infrastructure, support, and architecture teams to solve complex technical problems
• Experience contributing to or working closely with upstream open-source communities, particularly OpenStack, Ironic, Metal3, or related infrastructure projects, is highly desirable
What we can offer you
At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.
Highly competitive US compensation package (base + bonus + equity), with performance reviews every 12 months.
Join one of the fastest-growing AI infrastructure companies: your chance to directly shape how global AI capacity is planned and deployed. ✨
Expect a dynamic progression plan tailored to your ambitions. Grow by leading critical cross-functional initiatives and shaping capital strategy β always with our full support.
Human-First Flexibility: We treat you as humans first. 🫶🏽 Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
Equal Opportunities Statement
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there's anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice: Here.
Salary Range
The range below reflects the base salary for the position. Actual compensation may vary based on job-related factors such as skill set, experience, education, and location. In addition to base salary, this role may be eligible for bonus, equity, and/or commission programs. Nscale may offer a competitive benefits package including medical, dental, vision, flexible paid time off, parental leave, and retirement plan participation.