Hi Everyone,
Role: Data Engineer
Location: FULLY ONSITE - Las Vegas, NV - NO EXCEPTIONS
Responsibilities:
Develop and maintain data pipelines, ELT processes, and workflow orchestration using Apache Airflow, Python, and PySpark to ensure the efficient and reliable delivery of data.
Design and implement custom connectors to facilitate the ingestion of diverse data sources into our platform, including structured and unstructured data from various document formats.
Collaborate closely with cross-functional teams to gather requirements, understand data needs, and translate them into technical solutions.
Implement DataOps principles and best practices to ensure robust data operations and efficient data delivery.
Design and implement data CI/CD pipelines to enable automated and efficient data integration, transformation, and deployment processes.
Monitor and troubleshoot data pipelines, proactively identifying and resolving issues related to data ingestion, transformation, and loading.
Conduct data validation and testing to ensure the accuracy, consistency, and compliance of data.
Stay up-to-date with emerging technologies and best practices in data engineering.
Document data workflows, processes, and technical specifications to facilitate knowledge sharing and ensure data governance.
Experience:
Bachelor's degree in Computer Science, Engineering, or a related field.
8+ years of experience in data engineering, ELT development, and data modeling.
Proficiency in using Apache Airflow and Spark for data transformation, data integration, and data management.
Experience implementing workflow orchestration using tools such as Apache Airflow, SSIS, or similar platforms.
Demonstrated experience in developing custom connectors for data ingestion from various sources.
Strong understanding of SQL and database concepts, with the ability to write efficient queries and optimize performance.
Experience implementing DataOps principles and practices, including data CI/CD pipelines.
Excellent problem-solving and troubleshooting skills, with a strong attention to detail.
Understanding of distributed systems and working with large-scale datasets.
Familiarity with data governance frameworks and practices.
Knowledge of data streaming and real-time data processing technologies (e.g., Apache Kafka).
Required Skills:
Python, PySpark, SQL, Airflow, Trino, Hive, Snowflake, Agile Scrum
Good to have: Linux, OpenShift, Kubernetes, AI, Superset
Thanks and regards,
Vamsi Binginapalli
US IT Recruiter
Livemindz/She-Jobs.
1505 LBJ Freeway, Suite #245, Farmers Branch, TX 75234