Join us in building the future of finance.

Our mission is to democratize finance for all. An estimated $124 trillion of assets will be inherited by younger generations in the next two decades, the largest transfer of wealth in human history. If you’re ready to be at the epicenter of this historic cultural and financial shift, keep reading.

About the team + role

We are building an elite team, applying frontier technologies to the world’s biggest financial problems. We’re looking for bold thinkers. Sharp problem-solvers. Builders who are wired to make an impact. Robinhood isn’t a place for complacency; it’s where ambitious people do the best work of their careers. We’re a high-performing, fast-moving team with ethics at the center of everything we do. Expectations are high, and so are the rewards.

The Agentic AI team builds agentic AI systems that power intelligent, reliable customer experiences across Robinhood products. The team focuses on reducing the time to ship agents with fine-tuned models and, in doing so, enables other teams to build, evaluate, and improve their own agents. You will contribute to a culture grounded in first-principles thinking, high performance, and a strong focus on customer outcomes!

As a Staff Machine Learning Engineer (IC6), you will define and uphold the quality bar for agentic systems across the organization. You will design evaluation frameworks, guide model selection, and partner with product, data science, and engineering teams to ensure systems meet clear standards for correctness, safety, latency, and user satisfaction. Your work will shape how agentic systems are built, evaluated, and improved across Robinhood!

This role is based in our Bellevue, WA or Menlo Park, CA office, with in-person attendance expected at least 3 days per week. At Robinhood, we believe in the power of in-person work to accelerate progress, spark innovation, and strengthen community.
Our office experience is intentional, energizing, and designed to fully support high-performing teams.

What you’ll do

● Define and implement evaluation frameworks that measure agent performance, including task success, correctness, tool usage reliability, latency, safety, and user satisfaction
● Evaluate frontier and fine-tuned models across quality, latency, cost, and edge cases to determine appropriate use cases
● Partner with product managers, data scientists, and engineers to translate evaluation results into clear launch criteria for agentic systems
● Analyze production issues, identify root causes, and prioritize improvements to increase system reliability and performance
● Build visibility into agent performance through metrics, monitoring, and reporting that inform roadmap decisions

What you bring

● You have deep experience defining and measuring quality for agentic or machine learning systems using evaluation frameworks, datasets, and scorecards
● You have experience evaluating large language models or similar systems, including understanding tradeoffs in performance, cost, and latency
● You have demonstrated ability to analyze production issues and lead initiatives that improve system quality across multiple teams
● You are comfortable working with engineers, data scientists, and product partners to deliver measurable improvements in system performance
● You have experience building or operating systems in regulated environments or working with AI evaluation and observability tools (nice to have)

What we offer

● Challenging, high-impact work to grow your career
● Performance-driven compensation with multipliers for outsized impact, bonus programs, equity ownership, and 401(k) matching
● Best-in-class benefits to fuel your work, including 100% paid health insurance for employees with 90% coverage for dependents
● Lifestyle wallet: a highly flexible benefits spending account for wellness, learning, and more
● Employer-paid life & disability insurance, fertility benefits, and mental health benefits
● Time off to recharge including company holidays, paid time off, sick time, parental leave, and more!
● Exceptional office experience with catered meals, events, and comfortable workspaces

In addition to the base pay range listed below, this role is also eligible for bonus opportunities + equity + benefits. Base pay for the successful applicant will depend on a variety of job-related factors, which may include education, training, experience, location, business needs, or market demands. The expected base pay range for this role is based on the location where the work will be performed and is aligned to one of 3 compensation zones. For other locations not listed, compensation can be discussed with your recruiter during the interview process.

Base Pay Range:

● Zone 1 (Menlo Park, CA; New York, NY; Bellevue, WA; Washington, DC): $255,000 to $300,000 USD
● Zone 2 (Denver, CO; Westlake, TX; Chicago, IL): $225,000 to $264,000 USD
● Zone 3 (Lake Mary, FL; Clearwater, FL; Gainesville, FL): $199,000 to $234,000 USD

Learn more about our Total Rewards, which vary by region and entity.

If our mission energizes you and you’re ready to build the future of finance, we look forward to seeing your application.

Robinhood provides equal opportunity for all applicants, offers reasonable accommodations upon request, and complies with applicable equal employment and privacy laws. Inclusion is built into how we hire and work, welcoming different backgrounds, perspectives, and experiences so everyone can do their best. Please review the Privacy Policy for your country of application.

Staff Machine Learning Engineer, Agentic at Robinhood
