Hi friends
Urgent hiring. Please share.
Job title: GKE Platform Engineering
Location: NC
Job Description: OCP Gen AI
Role Summary
The role focuses on GKE platform engineering, infrastructure automation, security, and reliability, with working knowledge of GenAI services and guardrails to support GenAI workloads hosted on the platform. This is not a GenAI model-building role; instead, the engineer ensures that GenAI and non-GenAI workloads run securely, reliably, and compliantly on GKE.
Key Responsibilities
GKE Platform Engineering
Design, deploy, and manage Google Kubernetes Engine (GKE) clusters for enterprise workloads.
Build and maintain shared Kubernetes platforms supporting multiple application teams.
Implement cluster-level capabilities such as:
Networking and ingress
Autoscaling and capacity planning
High availability and disaster recovery
Standardize GKE configurations following enterprise and security best practices.
Infrastructure as Code (IaC)
Provision and manage GCP infrastructure using Terraform.
Automate creation of:
GKE clusters
Networking, IAM, and service accounts
Supporting platform services
Cloud & Platform Operations
Operate and support production-grade GCP environments.
Implement monitoring, logging, and alerting for GKE clusters and workloads.
Troubleshoot cluster, networking, and workload-level issues.
Optimize platform reliability, performance, and cost.
Security & Guardrails (GenAI-Aware Platform)
Implement and enforce GCP security guardrails, including:
Model Armor
Sensitive Data Protection (SDP)
Ensure platform compliance with:
Enterprise security standards
Data privacy and access controls
Support secure hosting of GenAI workloads on GKE, without owning model development.
GenAI Platform Enablement (Awareness-Level)
Maintain working knowledge of GCP GenAI services (e.g., Vertex AI) from a platform perspective.
Enable teams to deploy GenAI-enabled applications on GKE securely.
Understand GenAI concepts such as:
Inference workflows
Data sensitivity risks
Responsible AI constraints
Partner with application and AI teams to ensure GenAI workloads meet platform, security, and compliance requirements.
Automation & Scripting
Use Python for:
Platform automation
Operational tooling
Integration scripts
Support CI/CD pipelines for platform and application deployments.
Required Skills & Experience
Core Platform Skills
Strong hands-on experience with GCP, Azure, or OCP (OpenShift) platforms.
Deep experience with Google Kubernetes Engine (GKE).
Solid working knowledge of:
Kubernetes concepts (pods, services, ingress, autoscaling)
Cluster operations and troubleshooting
Experience supporting large-scale, multi-team Kubernetes environments.
Infrastructure & Automation
Proven experience using Terraform for IaC on GCP, Azure, or OCP (OpenShift) platforms.
Proficiency in Python for automation and scripting.
Experience with CI/CD and Git-based workflows.
Security & Governance