THE ROLE
We are looking for a Senior AI Engineer to design, build, and ship AI-powered software across the full stack: from the agentic infrastructure that powers our robot operations, to the backend services that expose it, to the interfaces that operators and engineers use every day.
You will own AI features end-to-end: from system design through implementation, deployment, and production monitoring. You understand the failure modes of LLM-based systems (non-determinism, prompt injection, runaway tool-calling, token cost spirals) and you build guardrails that prevent them. You are equally comfortable writing agent orchestration logic, designing REST APIs, and shipping a React dashboard.
This is not a research role. We want engineers who have closed the loop from prototype to production with AI systems that real users depend on.
RESPONSIBILITIES
AI Agent Development
- Design, implement, and deploy production-grade AI agents: multi-step reasoning pipelines, tool-calling workflows, multi-agent coordination, and human-in-the-loop handoffs
- Design and build agent harnesses: the runtime infrastructure (context management, tool definitions, memory, feedback loops, observability, and lifecycle control) that makes agents reliable in production; the model is a component, the harness is the product
- Engineer context pipelines: dynamic retrieval, re-ranking, semantic search, and GraphRAG as tools within an agentic reasoning loop, not static RAG pipelines; understand when to retrieve, when to use long context, and when to use agent memory
- Implement production-grade reliability: retry logic with backoff, cost controls, structured output validation, sandboxed tool execution, and checkpoint-resume for long-running agent workflows
- Develop systematic evaluation frameworks (evals, golden datasets, regression suites, observability traces) that measure agent quality and catch regressions before production
Backend & Infrastructure
- Architect and implement scalable backend services and APIs (REST/GraphQL) in Go, Rust, or TypeScript/Node.js
- Build and maintain integrations with external systems (databases, internal APIs, robot data streams), enabling agents to take real actions with appropriate access controls
- Own deployment, monitoring, and observability: Docker, Kubernetes, CI/CD pipelines, and LLM-specific tracing and cost tracking
Frontend & Product
- Build clean, functional web interfaces in React/Next.js: operator dashboards for robot fleet management, engineering tooling for the AI team, and customer-facing applications
- Own features end-to-end: product requirements, implementation, testing, rollout, and ongoing maintenance
- Treat prompt engineering as a first-class engineering discipline: write, test, and version prompts with the same rigor as application code
MINIMUM QUALIFICATIONS
Software Engineering Foundation
- 5+ years of professional software engineering experience with a full-stack production track record; this is a software engineering role first, and strong fundamentals in system design, data structures, algorithms, and code quality are required
- Strong command of Python and/or TypeScript at a production level: clean abstractions, testable code, performance awareness, and maintainability, not just scripting
- Backend engineering depth: Go, Rust, or TypeScript/Node.js for production services, covering RESTful and GraphQL API design, relational database modeling (PostgreSQL), async programming, caching, and system integration via APIs and webhooks; Python for AI/ML integration and scripting
- Frontend engineering proficiency: React, Next.js, TypeScript; able to architect and ship functional, production-grade UIs, not just wire up component libraries
- Software delivery practices: automated testing (unit, integration, end-to-end), CI/CD pipelines, code review, and observability (logging, metrics, alerting)
- Containerization and deployment: Docker, Kubernetes; able to own a service from code to production without a DevOps handoff
AI & Agent Engineering
- Proven, hands-on experience building and deploying LLM-powered systems or AI agents in production, beyond prototypes; you understand the real failure modes (non-determinism, prompt injection, tool-calling loops, cost spirals)
- Experience with at least one LLM API (Anthropic Claude, OpenAI, or equivalent) and agentic frameworks (LangChain, LangGraph, PydanticAI, or similar)
- Ability to design agent architectures with appropriate guardrails: structured output validation, retry logic, fallback handling, and human-in-the-loop patterns
PREFERRED QUALIFICATIONS
- Familiarity with harness engineering patterns: AGENTS.md structured repositories, architectural constraint enforcement via linters, observability-driven agent iteration, and agent-first documentation as living systems, not static docs
- Understanding of context engineering beyond naive RAG: agentic retrieval, GraphRAG, hybrid search, semantic layers, and when long context windows are a better fit than retrieval
- Experience with durable execution patterns (Temporal, or similar) for long-running or stateful agent workflows with checkpoint-resume
- Vector database and embedding experience (Pinecone, Weaviate, pgvector, Voyage AI, etc.), but as one tool in a broader context engineering stack, not the whole solution
- Background in robotics, industrial automation, or IoT: experience building software that connects to physical hardware or real-time data streams
- Experience designing multi-tenant platforms or internal developer platforms (SDKs, golden-path tooling, shared infrastructure)
- Familiarity with prompt injection risks, sandboxed code execution, and AI security considerations for agents that take real-world actions
- Active use of AI coding agents (Claude Code, Codex, Gemini, or equivalent) as a core part of your development workflow; you know how to get 10x leverage from them without shipping broken code