Our Purpose
At Fiddler, we understand the implications of AI and the impact that it has on human lives. Our company was born with the mission of building trust into AI. The rise of Generative AI and Agents has unlocked generalized intelligence but also widened the risk aperture and made it harder to ensure that AI applications are working well. Fiddler enables organizations to get ahead of these issues by helping them deploy trustworthy and transparent AI solutions.
Fiddler partners with AI-first organizations to help build a long-term framework for responsible AI practices, which, in turn, builds trust with their user base. AI engineering, data science, and business teams use Fiddler AI to monitor, evaluate, secure, analyze, and improve their AI solutions to drive better outcomes. Our platform enables engineering teams and business stakeholders alike to understand the "what", "why", and "how" behind AI outcomes.
Our Founders
Fiddler AI was founded by Krishna Gade (engineering leader at Facebook, Pinterest, Twitter, and Microsoft) and Amit Paka (product leader at Microsoft, Samsung, and PayPal, and two-time founder). We are backed by Insight Partners, Lightspeed Venture Partners, and Lux Capital.
Why Join Us
Our team is motivated to help build trust into AI to enable society to harness the power of AI. Joining us means you get to make an impact by ensuring that AI applications at production scale across industries have operational transparency and security. We are an early-stage startup and have a rapidly growing team of intelligent and empathetic doers, thinkers, creators, builders, and everyone in between. The AI and ML industry has a rapid pace of innovation, and the learning opportunities here are monumental. This is your chance to be a trailblazer.
Fiddler is recognized as a pioneer in the field of AI Observability and has received numerous accolades, including: 2022 a16z Data50 list, 2021 CB Insights AI 100 most promising startups, 2020 WEF Technology Pioneer, 2020 Forbes AI 50 most promising startups, and 2019 Gartner Cool Vendor in Enterprise AI Governance and Ethical Response. By joining our brilliant (at least we think so) team, you will help pave the way in the AI Observability space.
ABOUT THE TEAM
Our Platform Engineering team is a group of builders who make hard things look easy. They design and maintain the systems that keep Fiddler running smoothly, quietly holding the universe together with code, caffeine, and collaboration.
Spread across time zones but united in purpose, this crew thrives on solving real problems, sharing what they learn, and jumping in to help wherever they're needed. They bring brains, grit, and a healthy sense of humor to everything they do. They're the kind of teammates who fix the issue, explain how they did it, and still show up to celebrate the win with everyone else. Platform Engineering doesn't just keep the lights on; we're busy designing better ones.
If you like working on problems that haven't been solved before, you'll thrive here. Fiddler's backend is where GenAI observability becomes real, from designing authorization layers for enterprise-scale AI systems to pioneering MCP authorization patterns. The systems you build will directly determine how companies trust and monitor their AI in production.
The Platform Engineering team owns the foundational data infrastructure that every Fiddler service depends on: event ingestion, streaming pipelines, time-series storage, aggregation, and the query layers that power monitoring and analytics. We operate at the intersection of streaming data, OLAP analytics, and multi-tenant SaaS infrastructure.
The stack: Python, Kafka, ClickHouse, PostgreSQL, Redis, Celery, Kubernetes (AWS/GCP), with Ray for ML inference workloads.
WHAT YOU'LL DO
- Set technical direction for the data platform: Own the architecture roadmap for Fiddler's ingestion, storage, and query layers. Drive multi-quarter initiatives from problem framing through design, implementation, and rollout.
- Design systems for 10x scale: Lead the evolution of our ClickHouse-backed analytics layer and Kafka-based ingestion pipeline to handle order-of-magnitude growth in event volume, query complexity, and tenant count.
- Define the event model for next-generation AI workloads: Architect the data model and storage strategy for agentic application traces, LLM evaluation pipelines, and enrichment workflows, balancing flexibility, query performance, and schema evolution.
- Drive cross-team technical decisions: Partner with the Backend, Monitoring, and Enrichment teams to ensure platform abstractions serve their needs. Represent the Platform perspective in company-wide architecture reviews.
- Own platform reliability and cost efficiency: Establish SLOs, capacity planning processes, and cost optimization strategies for data infrastructure. Make build-vs-buy decisions for infrastructure components.
- Raise the engineering bar: Mentor senior engineers. Establish patterns and guardrails (data modeling conventions, query optimization practices, testing strategies) that compound across the team. Lead by example in code review, design docs, and incident response.
- Influence product direction: Work with Product and Customer Engineering to translate customer data challenges into platform capabilities. Help define what's feasible, what's risky, and what to build next.
WHAT WE'RE LOOKING FOR
- 10+ years of experience building and operating production data platforms or large-scale distributed systems, with demonstrated technical leadership
- Track record of leading multi-quarter technical initiatives end-to-end, from problem definition through architecture, execution, and measurable outcome
- Deep expertise in at least two of: streaming systems (Kafka, Flink, Pulsar), OLAP databases (ClickHouse, Druid, Pinot), relational databases (PostgreSQL), distributed task systems (Celery, Temporal)
- Strong proficiency in Python; ability to make language and framework tradeoff decisions
- Experience designing multi-tenant data architectures with strong isolation, security, and performance guarantees
- Production operations maturity: incident leadership, SLO definition, capacity planning, cost optimization
- Ability to communicate architectural decisions clearly to both engineering peers and non-technical stakeholders
- BS/MS/PhD in Computer Science or equivalent depth of experience
NICE TO HAVE
- Experience building observability, monitoring, or analytics platforms (the domain we operate in)
- Familiarity with ML infrastructure: model serving pipelines, feature stores, inference engines (Ray, Triton)
- Experience with data lake / lakehouse architectures (Iceberg, Delta, Parquet)
- History of establishing engineering standards, review processes, or developer tooling for a platform team
- Open-source maintainership or significant contributions to data infrastructure projects
BENEFITS & PERKS
- Competitive pay + equity
- Premium health, dental & vision
- Retirement plan
- Generous PTO
- Annual health check
- Paid parental leave
- Team and company events and offsites (this year's was at Coorg Wilderness Resort)
- Lunch provided for in-office days
The posted range represents the expected salary range for this job requisition and does not include any other potential components of the compensation package and perks previously outlined. Ultimately, in determining pay, we'll consider your experience, leveling, location, and other job-related factors.
Fiddler is proud to be an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. If you require special accommodations in order to complete the interviews or perform job duties, please inform the recruiter at the beginning of the process.
Beware of job scams. Our recruiters use @fiddler.ai email addresses exclusively. In the US, we do not conduct interviews via text or instant message, or ask for sensitive personal information such as bank account or social security numbers.