ABOUT VOODOO
Founded in 2013, Voodoo is a tech company that creates mobile games and apps with a mission to entertain the world. With 800 employees, 7 billion downloads, and over 200 million active users, Voodoo is the #3 mobile publisher worldwide by downloads, after Google and Meta. Our portfolio includes chart-topping games like Mob Control and Block Jam, alongside popular apps such as BeReal and Wizz.
TEAM
The Engineering & Data team builds innovative tech products and platforms to support the impressive growth of our gaming and consumer apps, allowing Voodoo to stay at the forefront of the mobile gaming industry.
The Voodoo Ad-Network is an autonomous product group of around 60 highly driven professionals with an ambitious mission: building top-tier ad network services. Our primary goal is to leverage Voodoo's massive first-party data ecosystem to optimize and scale monetization. We are in a rapid growth phase, expanding into new ventures such as opening to external inventory, penetrating the external advertiser market, and driving social network monetization following our recent acquisition of BeReal. To support this incredible trajectory and promising early results, we are scaling our team.
The Feature Platform Team is the foundational infrastructure engine empowering our ML Ads Recommendation capabilities. Drawing inspiration from industry-leading feature stores, we build and maintain the unified data layer for our machine learning features. Our mission is to accelerate the ML lifecycle by providing a unified, scalable, and highly available architecture for computing, storing, and serving batch, real-time, and on-demand features.
Beyond just building the infrastructure, we are a highly proactive team that continuously explores new data signals and feature engineering opportunities to push the boundaries of our targeting performance.
This is a hybrid position based in Helsinki, Paris, or Strasbourg.
ROLE
We're looking for a Senior Data Engineer to join our Feature Platform Team. You will be joining a dedicated squad of Data and ML Engineers focused on guaranteeing consistency across offline training and online inference, eliminating training-serving skew, and enabling our Data Scientists to seamlessly and rapidly deploy the next generation of high-performing models.
As a Senior Data Engineer, your scope extends far beyond classical data engineering. You will be responsible for managing both the offline and online components of our machine learning architecture. This means bridging massive-scale data processing (handling both heavy batch jobs and subsecond real-time feature updates) with high-load online services that must process and return features for inference under strict low-latency requirements.
- Architectural Ownership: Take end-to-end ownership of highly visible projects from initial ideation to production release. This includes feature scoping, timeline estimation, architecture design, and benchmarking next-generation technologies.
- Proactive Data Innovation: Go beyond passive implementation by actively partnering across the entire data lifecycle. You will deeply understand the downstream Data Science domain to discover high-impact feature opportunities, while also collaborating closely with upstream data engineering teams to understand ingestion mechanisms (right up to the SDK and Bidding platforms) to unlock and integrate new data signals.
- ML Infrastructure & Feature Platform: Collaborate closely with Data Scientists and ML Engineers to design, scale, and optimize core components spanning both offline training and online inference, including our Feature Store (supporting batch, streaming, and low-latency on-demand computation) and engines for on-demand training dataset generation.
- Pipeline Engineering: Build, maintain, and optimize mission-critical data pipelines spanning both extensive batch processing and continuous real-time streams (ensuring subsecond feature updates) to adapt to ever-evolving business and machine learning needs.
- High-Performance Online Services: Actively build and maintain the high-load backend applications that power our ML model serving, ensuring they can process and return features with low latency and high availability under heavy traffic.
- Scalability & Performance: Work hand-in-hand with our infrastructure teams to guarantee the reliability, security, and immense scalability required for an ad-network ecosystem.
- Agile Collaboration: Thrive in a fast-paced agile environment with rapid decision-making processes. You will collaborate daily with back-end developers, data scientists and product managers.
- Mentorship & Team Culture: You will actively contribute to our engineering culture, share knowledge, and ensure every team member feels comfortable, supported, and empowered to grow in their role.
PROFILE
We are looking for a Senior Data Engineer who deeply understands both the data lifecycle and the specific challenges of putting machine learning models into production at scale.
- 6+ years of proven experience as a Data Engineer, ML Engineer, Backend Engineer, or a closely related role in a high-scale environment.
- Big Data & Streaming Mastery: Extensive hands-on experience working with Flink or Spark at scale. Deep expertise in Flink (or a similar stateful streaming platform), as you will be a key contributor to scaling our real-time streaming architecture.
- Coding Proficiency: Advanced expertise in Python for robust ETL pipelines and custom feature-definition SDKs/DSLs.
- Experience with, or a strong willingness to work with, Golang for building high-performance, low-latency backend applications. Familiarity with Java is highly valued for our Flink streaming workloads.
- Data Architecture: Deep understanding of modern Data Lakehouse design principles, open table formats (like Iceberg), optimization techniques, and data modeling.
- Cloud & DevOps: Strong hands-on experience with AWS.
- Familiarity with DBT for data pipeline transformation is nice to have.
- ML Production Awareness: You have a solid grasp of the unique challenges involved in running ML models in production, including working with Feature Stores, mitigating training-serving skew, and model monitoring.
- System Design: You are highly familiar with topics surrounding system scalability, high availability/reliability, low-latency API design, and security best practices.
OUR STACK
- Languages: Python (ETL & SDKs), Golang (High-Performance Online Services), Java (Flink)
- Processing & Orchestration: Spark, Flink (Real-time Streaming), Airflow, DBT
- Storage & Infrastructure: Apache Iceberg, Amazon Web Services (AWS), Kubernetes, Terraform
BENEFITS
- Competitive salary based on experience
- Swile lunch vouchers
- Gymlib (100% covered by Voodoo)
- Premium healthcare coverage with SideCare, 100% covered for you and your family
- Wellness activities in our Paris office
- Remote Fridays