Filevine is a Legal AI company delivering Legal Operating Intelligence for the future of legal work. Grounded in a singular system of truth, Filevine brings together data, documents, workflows, and teams into one unified platform, where modern legal work happens with clarity and consistency.
Powered by LOIS, the Legal Operating Intelligence System, Filevine connects context across every matter to transform legal operations from reactive to proactive. LOIS reads, understands, and reasons across your data to surface insight, automate complexity, and give professionals the clarity and confidence to see more, know more, and do more. Fueled by a team of exceptional collaborators and innovators, Filevine's rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.
Primary Duties and Responsibilities
• Strategy & Team Leadership: Directly manage and align the prioritization of DevOps, SRE, and DBRE infrastructure teams under a unified reliability strategy. Set team objectives, drive execution, and ensure resources are focused on the highest-impact business and reliability investments.
• Platform Reliability & Incident Prevention: Conduct ongoing risk assessments of Filevine's platform to identify and prioritize areas of greatest fragility and business focus. Use data from incident history, usage analytics, monitoring systems, and customer feedback to drive proactive hardening efforts and reduce unplanned downtime.
• Reliability Metrics & Reporting: Define and track key reliability indicators (uptime/availability, mean time to detect, mean time to resolve, incident frequency). Own the reporting apparatus that makes platform health visible and actionable for leadership and product teams.
• Status Page & Incident Communication: Manage the process for updating the status page (status.filevine.com) during reliability events. Define clear criteria for posting incidents according to established communication protocols, and ensure customers and internal stakeholders receive timely, accurate updates.
• Cross-Functional Alignment: Serve as the bridge between SRE, Product, Engineering, and customer-facing teams (Support, Sales, Partners) to ensure reliability priorities reflect real customer and business impact. Translate reliability trends and infrastructure health into actionable insights for non-technical stakeholders.
• Infrastructure & Tooling: Evaluate, implement, and manage the reliability and observability tech stack. Drive decisions on monitoring, alerting, test environments, and infrastructure tooling to ensure the platform scales reliably.
• Team Enablement & Culture: Establish reliability standards, runbooks, and operational patterns that empower engineering teams to contribute to platform resilience. Build documentation and training to make reliability ownership a shared responsibility across the organization.
Knowledge and Skills
• 5+ years of experience in SRE, DevOps, platform engineering, or reliability-focused product/program management in SaaS.
• Software Engineering Background: Prior hands-on experience as a software engineer or in a deeply technical role. Comfortable reading code, reviewing architecture decisions, and engaging in technical design discussions with engineering teams.
• SRE & Infrastructure Expertise: Strong understanding of site reliability principles, cloud infrastructure, database reliability, container orchestration, and modern DevOps practices. Experience managing or closely partnering with SRE and DevOps teams.
• Risk Assessment & Data Proficiency: Strong analytical skills with the ability to use data sources (monitoring platforms, Pendo, Domo, Salesforce, incident logs) to prioritize reliability efforts by business impact.
• Communication Mastery: Ability to translate complex reliability and infrastructure data into clear narratives for leadership, product managers, and customer-facing teams. Experience leading incident reviews and high-visibility operational meetings is essential.
• SDLC & Release Lifecycle Knowledge: Deep understanding of software development lifecycles, release protocols, and incident response processes.
• Problem Solving: Ability to identify the highest-leverage reliability investments and implement processes that improve platform stability without slowing engineering velocity.
Education
• B.S. or M.S. in computer science, software engineering, or a related technical field, or comparable certifications.
• Or equivalent direct work experience, with a demonstrated track record in software engineering and/or site reliability engineering.
Cool Company Benefits:
- A dynamic, rapidly growing company, focused on helping organizations thrive
- Medical, Dental, & Vision Insurance (for full-time employees)
- Competitive & Fair Pay
- Maternity & paternity leave (for full-time employees)
- Short & long-term disability
- Opportunity to learn from a dedicated leadership team
- Top-of-the-line company swag
Privacy Policy Notice
Filevine will handle your personal information according to what's outlined in our Privacy Policy.
Communication about this opportunity, or any open role at Filevine, will only come from representatives with email addresses using "filevine.com". Messages from any other address are not affiliated with Filevine; please do not respond to them.