Responsibilities

  • Lead end-to-end data migration projects from on-premises environments to Databricks with minimal downtime.
  • Work with architects and lead solution design to meet functional and non-functional requirements.
  • Design and implement Databricks solutions on AWS, drawing on hands-on experience with the platform.
  • Configure Databricks clusters, write PySpark code, and build CI/CD pipelines for deployments.
  • Apply Delta Lake optimization techniques such as Z-ordering, auto compaction, and vacuuming (see the illustrative sketch after this list).
  • Process near-real-time data through Auto Loader and Delta Live Tables (DLT) pipelines.
  • Apply a strong Python background; identify, communicate, and mitigate risks and issues.
  • Identify and resolve data-related issues and provide support to ensure data availability and integrity.
  • Optimize AWS and Databricks resource usage to control costs while meeting performance and scalability requirements.
  • Stay up to date with AWS and Databricks services and data engineering best practices to recommend and implement new technologies and techniques.
  • Proactively implement engineering methodologies, standards, and leading practices.
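
As a rough illustration of the Databricks work referenced above (Delta Lake optimization and near-real-time ingestion with Auto Loader), a minimal PySpark sketch follows. It is a sketch only: table names, storage paths, and retention values are assumptions for illustration, not part of this role description.

```python
# Illustrative only: table names, paths, and retention values are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is already defined

# Delta Lake maintenance: co-locate commonly filtered columns, then purge stale files.
spark.sql("OPTIMIZE sales.events ZORDER BY (event_date, customer_id)")
spark.sql("VACUUM sales.events RETAIN 168 HOURS")  # 7-day retention window

# Auto Compaction is enabled per table via a table property.
spark.sql(
    "ALTER TABLE sales.events "
    "SET TBLPROPERTIES ('delta.autoOptimize.autoCompact' = 'true')"
)

# Auto Loader: incrementally ingest newly arrived files into a bronze table.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/events")
    .load("s3://example-bucket/raw/events/")
)

(
    raw.writeStream
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/events")
    .trigger(availableNow=True)
    .toTable("bronze.events")
)
```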

Requirements

  • Bachelor’s or master’s degree in computer science, data engineering, or a related field.
  • Minimum 5 years of experience in data engineering, with expertise in AWS or Azure services, Databricks, and/or Informatica IDMC.
  • Proficiency in programming languages such as Python, Java, or Scala for building data pipelines.
  • Ability to evaluate potential technical solutions and recommend ways to resolve data issues, particularly performance assessment of complex data transformations and long-running data processes.
  • Strong knowledge of SQL and NoSQL databases.
  • Familiarity with data modelling and schema design.
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration skills.
  • Databricks and Informatica certifications are a plus.

Preferred Skills

  • Experience with big data technologies like Apache Spark and Hadoop on Databricks.
  • Experience with AWS services, with a focus on data and architecture.
  • Knowledge of containerization and orchestration tools like Docker and Kubernetes.
  • Familiarity with data visualization tools like Tableau or Power BI.
  • Understanding of DevOps principles for managing and deploying data pipelines.
  • Experience with version control systems (e.g., Git) and CI/CD pipelines.
  • Knowledge of data governance and data cataloguing tools, especially Informatica IDMC.


Shortlisted candidates will be offered a 6-month or 1-year agency contract.