#hiring Kubernetes Big Data Engineer
Job Description

Description:
Hybrid: 3 days onsite / 2 days remote in Rockville, MD.

Our client seeks a Big Data Engineer to design and optimize large-scale data processing on AWS with Spark and Kubernetes. The role will implement containerized workloads on EMR on EKS, build scalable data pipelines, and improve performance, reliability, and observability. The engineer will collaborate with cross-functional teams, apply Spark tuning expertise, and manage Kubernetes-based infrastructure to support data-driven outcomes. Financial industry experience is beneficial.

We can facilitate W2 and corp-to-corp consultants. For our W2 consultants, we offer a great benefits package that includes medical, dental, and vision benefits, 401(k) with company matching, and life insurance.

Rate: $54.00 to $64.00/hr. W2
JN -58

Responsibilities:
- Design, develop, and maintain large-scale data processing pipelines using Hadoop, Spark, Python, and Scala.
- Architect and deploy containerized big data workloads on Amazon EMR on EKS.
- Design and implement Kubernetes-based infrastructure for running Spark applications at scale.
- Implement scalable ingestion, storage, transformation, and analysis solutions.
- Stay current with industry trends and emerging big data technologies to improve architecture.
- Collaborate with cross-functional teams to translate business requirements into technical solutions.
- Optimize and enhance existing data pipelines for performance, scalability, and reliability.
- Develop automated testing frameworks and implement continuous testing for data quality.
- Conduct unit, integration, and system testing for data pipeline robustness and accuracy.
- Support data scientists and analysts with reliable datasets and tooling.
- Write and maintain automated unit, integration, and end-to-end tests.
- Monitor and troubleshoot production data pipelines and resolve issues.
- Manage Kubernetes clusters, pods, services, and deployments for big data workloads.

Experience Requirements:
- Hands-on experience with AI development tools such as GitHub Copilot, Q Developer, ChatGPT, or Claude.
- Proficiency with Hadoop, Spark, Hive, and Trino.
- Experience addressing data skew, petabyte-scale processing, and remediation of resource, data quality, and scalability issues.
- Strong Kubernetes experience, including pods, services, deployments, namespaces, ConfigMaps, and Secrets.
- Hands-on EMR on EKS experience for Spark workloads.
- Kubernetes resource management, scheduling, and auto-scaling expertise.
- Knowledge of Helm charts, Kubernetes networking, PVs/PVCs, security best practices, kubectl, and YAML manifests.
- Ability to troubleshoot cluster issues, pod failures, resource constraints, and Spark-on-Kubernetes integration with dynamic allocation.
- Prompt engineering, AI workflow design, AI-driven analysis, and change management for AI adoption.
- Deep understanding of Spark internals, including executors, tasks
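For the "Spark-on-Kubernetes integration with dynamic allocation" requirement, a minimal configuration sketch of a cluster-mode `spark-submit` against a Kubernetes API server (the bracketed placeholders, namespace, executor counts, and application path are all assumptions, not values from the posting; `shuffleTracking` is how Spark 3.x supports dynamic allocation on Kubernetes without an external shuffle service):

```
spark-submit \
  --master k8s://https://<cluster-endpoint>:443 \
  --deploy-mode cluster \
  --name example-job \
  --conf spark.kubernetes.namespace=spark-jobs \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=20 \
  local:///opt/spark/app/job.py
```

On EMR on EKS specifically, these same `spark.dynamicAllocation.*` properties would typically be passed through the job submission's Spark configuration rather than a raw `spark-submit`.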
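As an illustration of the "addressing data skew" requirement above, a common remediation is key salting: splitting a heavily skewed key into several synthetic sub-keys so partition-level aggregation spreads the load, then stripping the salt before the final combine. A minimal plain-Python sketch of the idea (in Spark this would be applied to an RDD or DataFrame key column; all function names here are illustrative):

```python
import random
from collections import Counter

def salt_keys(records, hot_keys, num_salts=4):
    """Spread records whose key is in hot_keys across num_salts
    synthetic sub-keys (key#0 .. key#N-1) so that downstream
    partitioning distributes the skewed load more evenly."""
    salted = []
    for key, value in records:
        if key in hot_keys:
            # Append a random salt suffix to split the hot key
            salted.append((f"{key}#{random.randrange(num_salts)}", value))
        else:
            salted.append((key, value))
    return salted

def unsalt_key(key):
    """Strip the salt suffix to recover the original key after the
    first-stage (per-sub-key) aggregation."""
    return key.split("#", 1)[0]

# Example: a skewed dataset where "hot" dominates
records = [("hot", 1)] * 8 + [("cold", 1)] * 2
salted = salt_keys(records, hot_keys={"hot"}, num_salts=4)

# Second-stage aggregation on the unsalted keys recovers exact totals
totals = Counter()
for key, value in salted:
    totals[unsalt_key(key)] += value
```

The two-stage aggregate (first per salted sub-key, then per original key) trades one extra shuffle for partitions of roughly even size; the same totals come out either way.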