We are actively building a Data Warehouse, a key part of the product. We work with cutting-edge technologies (GCP, AWS, Airflow, Kafka, K8s) and make infrastructure and architectural decisions based on data. We are building large-scale data infrastructure for analytics, machine learning, and real-time recommendations.
Our tech stack
Languages: Python, SQL
Frameworks: Spark, Apache Beam
Storage and analytics: BigQuery, GCS, S3, Trino, other GCP and AWS stack components
Integration: Apache Kafka, Google Pub/Sub, Debezium
ETL: Airflow 2
Infrastructure: Kubernetes, Terraform
Development: GitHub, GitHub Actions, Jira
- Gather and clarify requirements from diverse stakeholders across the company.
- Design and evolve DWH Architecture (ODS and Data Mart layers) with a focus on scalability, performance, and data security standards.
- Build robust and efficient incremental pipelines; develop and optimize data marts in BigQuery (Dataform/SQL/DBT) and Airflow.
- Participate in testing, data validation, and release processes.
- Design and implement data quality checks; investigate data quality issues and consistency discrepancies across various pipelines.
- Perform deep-dive analysis of source systems to build efficient data flows from source to consumption.
- Maintain architectural and technical documentation to ensure data transparency and compliance.
- 3+ years of experience as an Analytics Engineer / DWH Engineer / Data Analyst working with DWH
- Hands-on experience with data warehouses
- Strong understanding of DWH architecture and data layers (ODS, Data Marts)
- Understanding of incremental loads, historical data handling, and deduplication
- Strong SQL skills
- Experience designing and optimizing data marts
- Experience with BigQuery (partitioning, clustering, cost-aware querying)
- Experience with Airflow or similar orchestration tools
- Python for data processing and ETL tasks
- Ability to work with stakeholders and translate business needs into data requirements
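As a hypothetical sketch of the deduplication logic mentioned above (field names `id` and `updated_at` are illustrative, not taken from the role description), the core idea of keeping only the latest record per business key during an incremental load can be expressed as:

```python
from typing import Iterable


def dedup_latest(rows: Iterable[dict], key: str = "id", ts: str = "updated_at") -> list[dict]:
    """Keep only the most recent record per business key,
    as is typical when merging an incremental batch into a target table."""
    latest: dict = {}
    for row in rows:
        k = row[key]
        # Replace the stored record only if this row is newer.
        if k not in latest or row[ts] > latest[k][ts]:
            latest[k] = row
    return list(latest.values())
```

In BigQuery the same pattern is usually implemented in SQL, e.g. with `ROW_NUMBER() OVER (PARTITION BY key ORDER BY ts DESC) = 1` or a `MERGE` statement.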
Must be familiar with:
- Languages: SQL (strong knowledge), Python (basic knowledge)
- Orchestration: Airflow or similar orchestration tool
- Version Control: Git
- Experience with Data Quality / Data Governance / SLAs
Nice to be familiar with:
- Cloud & Storage: Google Cloud Platform (BigQuery, Cloud Storage, Dataform, DBT)
- Stable salary, official employment;
- Health insurance;
- Hybrid work mode and flexible schedule;
- Relocation package offered for candidates from other regions;
- Access to professional counseling services including psychological, financial, and legal support;
- Discount club membership;
- Diverse internal training programs;
- Partially or fully paid additional training courses;
- All necessary work equipment.