WHO WE ARE

At TwelveLabs, we are pioneering the development of cutting-edge multimodal foundation models that comprehend videos the way humans do. Our models have redefined the standards in video-language modeling, empowering us with more intuitive and far-reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.

With $110+ million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and by prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang, and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation. Our partnership with NVIDIA and AWS gives us access to the most advanced chips, including B300s, enabling us to push the boundaries of what's possible in video AI.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

ABOUT THE TEAM

The Pegasus team sits at the core of TwelveLabs' video understanding capabilities and is responsible for driving Pegasus, our Video Analysis product. We develop multimodal video analysis systems designed for strong instruction following and for producing highly complex, hierarchically structured outputs. We focus on shipping products with real-world value rather than doing research in isolation, and we work as a goal-oriented, cross-functional team of ML researchers and engineers.
Our work covers a broad range of challenges: large-scale distributed training of multimodal LLMs, from pre-training through RL; accurate temporal segmentation and structured metadata extraction for real-world use cases; extending temporal context length to multiple hours; and data curation processes that enable well-aligned evaluation and performance improvements through training data enhancements. Our team has access to the most advanced chips in the world, including NVIDIA B300s, to push the boundaries of video analysis systems and to shorten our research-to-production cycle as much as possible.

IN THIS ROLE, YOU WILL

- Drive technical direction for training infrastructure and training operations within Pegasus while remaining deeply hands-on in critical system design and implementation.
- Own the design and evolution of scalable end-to-end training pipelines, with a focus on reliability, reproducibility, efficiency, and fast iteration in large-scale distributed environments.
- Lead technical decision-making across data curation workflows, training systems, evaluation pipelines, and ML infrastructure for multimodal model development.
- Improve and automate the end-to-end training lifecycle so research ideas can be translated into robust systems and integrated into production model development quickly and reliably.
- Mentor engineers and raise the team’s execution bar through strong technical judgment, design reviews, and hands-on collaboration.
- Explore and adopt AI-assisted development tools such as Claude, Gemini, and GPT to improve productivity across coding, experimentation, debugging, and documentation.

YOU MAY BE A GOOD FIT IF YOU HAVE

- Significant experience building and productionizing large-scale ML systems as a hands-on individual contributor.
- Experience driving technical direction across complex ML infrastructure or training systems projects and making architectural decisions in demanding engineering environments.
- Strong experience with large-scale distributed training systems, training infrastructure, or large-scale data processing pipelines.
- Strong foundations in machine learning and experience with multimodal systems such as vision, language, or video-based models.
- Strong technical judgment across system design, performance, reliability, reproducibility, and long-term maintainability.
- A track record of mentoring engineers and creating technical leverage beyond your own individual contributions.

PREFERRED QUALIFICATIONS

- Experience building infrastructure for large-scale data curation, evaluation, or training workflows.
- Experience optimizing distributed training systems in high-performance GPU environments.
- Experience working with cutting-edge accelerator hardware and large-scale multimodal model training.
- Master’s or PhD in Machine Learning, Computer Science, or a related technical field.

HIRING PROCESS

Application Review → Recruiter Interview (remote, 30 min) → Coding Test → Hiring Manager Interview (remote, 30 min) → Live Coding Test Interview (in person, 135 min) → System Design (remote, 105 min) → Final Round Interview (remote, 30 min) → Reference Check → Offer

BENEFITS AND PERKS

- A global team that grows alongside global B2B customers
- Hybrid work that combines autonomy and collaboration
- A MacBook and home-office equipment worth 700,000 KRW for every employee, with equipment replaced by the latest models every three years
- A corporate card with a monthly limit of 600,000 KRW, usable freely for meals, transportation, and similar expenses
- In-office snack bar (snacks, coffee, fresh food)
- Two-week winter break at year-end
- Annual health checkup support
- English education program support

Staff Machine Learning Engineer - TrainingOps at Twelve Labs
