Our team is looking to hire a talented Senior/Lead Data Quality Engineer. In this role, you will be instrumental in safeguarding the integrity, reliability, and quality of our data engineering platforms and pipelines.
Responsibilities
Design and build a robust automated testing strategy and framework for data engineering pipelines, with an emphasis on integration with Databricks, PySpark, Scala, Spark SQL, and other relevant technologies
Implement automated tests that verify data quality, integrity, and reliability across every data pipeline and platform
Embed automated testing into Azure DevOps CI/CD pipelines to enable smooth testing, build, and deployment workflows
Carry out performance testing to evaluate the scalability and efficiency of data processing systems
Collaborate closely with data engineers and other stakeholders to gather requirements and align testing strategies with business goals
Set up monitoring for automated tests and deliver routine reports covering test coverage, defects, and data quality concerns
Requirements
At least 3 years of experience working in Data Quality Engineering or an adjacent field
Strong grasp of data engineering principles and automated testing frameworks
Hands-on expertise with Databricks, PySpark, Scala, Spark SQL, Terraform, and Azure Event Hubs
Background in connecting testing frameworks to Azure DevOps and handling source control through Git
Understanding of pipeline configuration and deployment using Terraform
Working knowledge of Jira
Excellent analytical skills with a demonstrated history of diagnosing and resolving complex testing and data issues
English proficiency at a B2+ level
Nice to have
Exposure to Power BI, Azure Data Lake, Spark Streaming, and other data visualization and processing tools
Previous work in a data-heavy industry or within large-scale data environments
Skill in delivering solutions via Infrastructure as Code (IaC)