Senior AI Engineer with Databricks

EPAM·Kazakhstan, Georgia·Remote·yesterday

We are looking for a Senior AI Engineer with Databricks expertise to design, deploy and maintain scalable machine learning pipelines using the Databricks platform. In this role, you will deliver production-ready ML pipelines, automated training and retraining workflows, deployed models, monitoring dashboards and CI/CD pipelines for ML systems.

Responsibilities

Design, implement and maintain end-to-end ML pipelines on Databricks
Build workflows for data ingestion, preprocessing, feature engineering, training and inference
Leverage PySpark, Spark ML and Databricks notebooks/jobs
Manage model versioning, experiment tracking and reproducibility using MLflow
Package and deploy models for batch and real-time inference
Monitor model performance, drift and retraining cycles
Develop scalable ETL/ELT pipelines using Databricks Delta Lake
Optimize data storage and access patterns through partitioning, Z-ordering and caching
Integrate with data sources such as Azure Data Lake, S3, APIs and databases
Implement CI/CD pipelines for ML workflows using Azure DevOps, GitHub Actions, Databricks Repos and the Databricks Jobs API
Configure clusters and autoscaling, optimize costs and apply Infrastructure as Code with Terraform, ARM templates and Bicep
Implement logging, alerting and observability to ensure high availability and fault tolerance of ML systems

Requirements

3+ years of experience in machine learning engineering or related roles
Expertise in the Databricks platform including workspaces, jobs and clusters
Proficiency in Apache Spark, PySpark and Python with pandas and scikit-learn
Skills in MLflow for tracking, registry and deployment
Competency in CI/CD pipelines, Docker containerization and REST APIs for model serving
Familiarity with version control using Git
Background in Azure including Azure Databricks, Azure Data Lake Storage (ADLS), Azure Container Registry (ACR) and Azure Machine Learning (AML)
Knowledge of data preprocessing, feature engineering and model training and evaluation
Understanding of libraries such as XGBoost, LightGBM and CatBoost
English proficiency at B2 level or higher

Nice to have

Familiarity with AWS including S3, EMR and SageMaker
Skills in streaming pipelines with Spark Structured Streaming and in Databricks Feature Store
Knowledge of Kubernetes
Competency in monitoring tools such as Prometheus and Grafana
Experience with large-scale production systems