We are looking for a Senior Data Engineer to build and operate scalable data ingestion and CDC capabilities on our Azure-based Lakehouse platform. Beyond developing pipelines in Azure Data Factory and Databricks, you will help us mature our engineering approach: we increasingly deliver ingestion and CDC preparation through Python projects and reusable frameworks, and we expect this role to apply professional software engineering practices (clean architecture, testing, code reviews, packaging, CI/CD, and operational excellence). 
Our platform is batch-first: streaming sources are landed raw and processed in batch, and we selectively evolve toward true streaming where use cases require it.
You will work within the Common Data Intelligence Hub, collaborating with data architects, analytics engineers, and solution designers to enable robust data products and governed data flows across the enterprise. 
• Your team owns ingestion & CDC engineering end-to-end (design, build, operate, observability, reliability, reusable components).
• You contribute to platform standards (contracts, layer semantics, readiness criteria) and reference implementations.
• You do not primarily own cloud infrastructure provisioning (e.g., enterprise networking, core IaC foundations), but you collaborate with the platform team by defining requirements, reviewing changes, and maintaining deployable code for pipelines and jobs.

Platform data engineering & delivery
• Design and develop ingestion pipelines using Azure and Databricks services (ADF pipelines, Databricks notebooks/jobs/workflows).
• Implement and operate CDC patterns (inserts, updates, deletes), including handling of late-arriving data and reprocessing strategies.
• Structure and maintain bronze and silver Delta Lake datasets (schema enforcement, de-duplication, performance tuning).
• Build “transformation-ready” datasets and interfaces (stable schemas, contracts, metadata expectations) for analytics engineers and downstream modeling.
• Ingest data in a batch-first approach (raw landing, replayability, idempotent batch processing), and help evolve patterns toward true streaming where future use cases require it.

Software engineering for data frameworks
• Develop and maintain Python-based ingestion/CDC components as production-grade software (modules/packages, versioning, releases).
• Apply engineering best practices: code reviews, unit/integration tests, static analysis, formatting/linting, type hints, and clear documentation.
• Establish and improve CI/CD pipelines for data engineering code and pipeline assets (build, test, security checks, deploy, rollback patterns).
• Drive reuse via shared libraries, templates, and reference implementations; reduce “one-off notebook” solutions.

Operations, reliability & observability
• Implement logging, metrics, tracing, and data pipeline observability (run-time KPIs, SLAs, alerting, incident readiness).
• Troubleshoot distributed processing and production issues end-to-end.
• Work with solution designers on event-based triggers and orchestration workflows; contribute to operational standards.
• Implement operational and security hygiene: secure secret handling, least-privilege access patterns, and support for auditability (e.g., logs/metadata/lineage expectations).

Collaboration & leadership
• Mentor other engineers and promote consistent engineering practices across teams.
• Contribute to the Data Engineering Community of Practice and help define standards, patterns, and guardrails.
• Contribute to architectural discussions (layer semantics, readiness criteria, contracts, and governance).
• Work with architects and governance stakeholders to ensure datasets meet governance requirements (cataloging, ownership, documentation, access patterns, compliance constraints) before promotion to higher layers.