Senior Machine Learning Engineer (LLM)

EPAM · Remote · Remote, Office · 2 months ago

EPAM Vietnam is seeking a Senior Machine Learning Engineer who will join our growing engineering team and work on cutting-edge ML solutions that impact users globally. You'll be at the forefront of building intelligent recommendation systems and production-scale machine learning infrastructure, while collaborating with talented engineers across international projects.

This is a high-impact role where you'll own the entire ML lifecycle, from architecting data pipelines to deploying models at scale. Beyond building exceptional technology, you'll have the chance to shape our technical direction and grow the next generation of ML engineers through mentorship.

Responsibilities

Design and build production-grade machine learning systems with a specialization in recommendation engines that serve millions of users
Develop and optimize high-performance data and model pipelines using Spark/PySpark to process massive datasets efficiently
Build Flask-based REST APIs that reliably serve models in production, ensuring low latency and high availability
Monitor model performance in real-world conditions and implement data-driven optimizations to enhance accuracy and efficiency
Identify and execute performance improvements across code, databases, and compute infrastructure
Work with cross-functional teams, including product, data science, and platform engineering, to deliver integrated solutions
Contribute to sprint planning, technical design reviews, and architectural decisions that move the team forward
Provide hands-on technical mentorship to 1–3 engineers and support their professional growth
Design and deploy Large Language Model applications leveraging vector databases (Pinecone, Faiss, PgVector) for intelligent search and retrieval

Requirements

Advanced proficiency in Python with a deep understanding of best practices for production ML code
Proven experience building scalable solutions with Spark/PySpark and handling large-scale data processing challenges
Solid grasp of machine learning principles, model development, and the complete ML lifecycle from experimentation to production
Hands-on experience building RESTful services with Flask or similar frameworks
Working knowledge of cloud platforms (Azure, AWS, or GCP) and cloud-native architectures
Strong software engineering practices, including Docker containerization, Git version control, and CI/CD automation
Demonstrated ability to tackle complex technical challenges, take ownership of outcomes, and thrive in collaborative team environments
Proficient in spoken and written English