Why Cast AI?
Cast AI is an automation platform that operates cloud-native and AI infrastructure at scale. By embedding autonomous decision-making directly into Kubernetes and cloud environments, Cast AI continuously optimizes performance, reliability, and efficiency in production.
The old way doesn't work. As Kubernetes and AI environments grow, manual decisions don't. Cast AI replaces tickets, alerts, and manual tuning with continuous automation that adapts infrastructure as conditions change. Efficiency and cost savings follow naturally from that automation.
Over 2,100 companies already rely on Cast AI, including Akamai, BMW, Cisco, FICO, HuggingFace, NielsenIQ, Swisscom, and TGS.
Global team, diverse perspectives
We're headquartered in Miami, but our impact is international. We take a global and intentional approach to diversity. Today, Cast AI operates across 34 countries spanning Europe, North America, Latin America, and APAC, bringing a wide range of perspectives into how we build and lead.
Unicorn momentum
In January 2026, we achieved unicorn status with a strategic investment from Pacific Alliance Ventures, the corporate venture arm of Shinsegae Group (a $50+ billion Korean conglomerate). Our valuation now exceeds $1 billion, and we're just getting started.
Join us as we build the future of autonomous infrastructure.
This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.
We are hiring across multiple teams!
As a Senior Software Engineer, you will have the opportunity to work on different key features of our product. We are currently hiring Senior Software Engineers for the following teams:
- Workload Optimization - Automates workload resource management by dynamically adjusting resource allocations, helping developers significantly reduce costs and improve application reliability.
- Karpenter - The Karpenter team powers the integration between Karpenter and Cast AI, bringing enterprise capabilities to the most popular open source Kubernetes autoscaler. We enhance Karpenter with advanced features that improve application reliability and performance while optimizing costs. By joining the team, you'll bridge open source innovation with enterprise requirements, directly impacting how organizations run Karpenter at scale.
- Reporting - Builds a scalable reporting system that ingests millions of rows per second into our time-series databases, providing insights into cost savings, workload efficiencies, and Cast AI automation impact.
- Pricing - Drives the synchronization of public and customer cloud resources, availability, and dynamic pricing across all major cloud providers. Empowers autoscaling by leveraging discounts, commitments, and cross-cluster tracking to maximize savings. Provides a reliable source of truth for node pricing, resources, components, discounts, and commitments.
- Autoscaler - Automates Kubernetes node autoscaling to optimize clusters, balance workloads, remove underutilized nodes, and dynamically allocate capacity in real time, thereby reducing cluster costs by half.
- Identity - Builds and maintains the trust and access foundation for the entire platform, ensuring every user, service, and workload authenticates and interacts securely and seamlessly at scale.
- Billy - Powers the critical day-2 operations layer of the platform - from billing and audit trails to notifications and feature flags - ensuring the platform runs reliably, transparently, and at scale for every customer, every day.
Here are some of the tools we use daily:
• Programming Languages: Go
• Cloud & Orchestration: Kubernetes, AWS, GCP, Azure
• Databases & Storage: PostgreSQL, Cloud Object Storage
• Messaging & APIs: GCP Pub/Sub, gRPC for internal communication, REST for public APIs
• Observability: Prometheus, Grafana, Loki, Tempo
• CI/CD & GitOps: GitLab CI with ArgoCD
Requirements:
• Production experience with Go is strongly preferred; candidates without Go should demonstrate strong systems programming skills in a comparable language.
• Deep hands-on experience with cloud platforms (AWS, GCP, or Azure), including a real understanding of how compute, networking, and storage work under the hood.
• Understanding of Kubernetes internals, particularly autoscaling and networking.
• You've personally driven a complex project end-to-end.
• Strong debugging, optimization, and performance-tuning skills, including query profiling, index design, and database performance tuning beyond ORM usage.
• You've run observability tooling (Prometheus, Grafana, OpenTelemetry) in production.
• Experience with CI/CD and DevOps practices.
• Strong English skills, both verbal and written.
• Startup mindset: adaptable, proactive, and comfortable with ambiguity.
Responsibilities:
• Design and build distributed systems that operate Kubernetes infrastructure autonomously at scale.
• Write production Go services that interact with AWS, GCP, and Azure APIs for real-time cloud resource management.
• Own features end-to-end: from design through implementation, testing, and production rollout (most projects ship in 1-4 weeks).
• Debug complex production issues across cloud providers, Kubernetes clusters, and distributed services.
• Collaborate with product and other engineering teams to solve problems that don't have textbook solutions.
• Work with time-series data, cloud provider APIs, and Kubernetes control plane internals.
What's in it for you?
• Competitive salary (€6,500 - €9,000 gross, depending on the level of experience)
• Enjoy a flexible, remote-first global environment.
• Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
• Equity options.
• Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
• Spend 10% of your work time on personal projects or self-improvement.
• Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
• Annual hackathon to spark new ideas and strengthen team bonds.
• Team-building budget and company events to connect with your colleagues.
• Equipment budget to ensure you have everything you need.
• Extra days off to help maintain a healthy work-life balance.
Hiring process
• Screening call with Recruiter
• Hiring Manager interview
• Technical interview (system design)
• Live coding
• Culture Check interview with an executive
*As part of our standard hiring process, a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.