We are looking for a skilled Senior Data Platform Engineer with a strong Python background to join our team. This role is ideal for a data engineering professional experienced in delivering robust data platforms and implementing data fabric solutions. As a Senior Data Platform Engineer, you will be instrumental in building and optimizing the data infrastructure on AWS for our customers, ensuring high performance, data integrity, and secure access across data products.
Req: 997715695
Responsibilities
- Design and build scalable, secure data platforms on AWS using services such as EMR, Glue Data Catalog and Lake Formation
- Develop and maintain efficient data pipelines using Python, SQL and Spark, with a focus on performance and reliability
- Implement open table formats like Hudi or Iceberg to support modern data lake architectures
- Automate infrastructure and deployment using Terraform, ensuring consistent and repeatable environments
Requirements
- Extensive experience with AWS data ecosystem services such as Lake Formation, EMR, Glue, EC2 and CloudWatch, including configuring and optimizing these services for data processing, with a strong focus on security, scalability and logging
- Solid understanding and hands-on experience with open table formats such as Hudi or Iceberg
- Proven track record of software development within high-performing data engineering teams, with a focus on delivering data products at scale
- Demonstrated experience with cloud automation and deployment using tools like Terraform
- Strong SQL and Spark skills for effective data transformation and processing
- Ability to enforce data quality through validation, cleansing and governance best practices, ensuring data accuracy and consistency
Nice to have
- Experience in the financial industry, particularly in investment banking
- Knowledge of additional data processing libraries and tools
- Expertise in a major real-time data processing framework such as Apache Flink or Kafka Streams
- Experience in building event-driven and/or streaming data services