We are seeking a skilled Generative AI Platforms Architect to lead and define the architecture for our enterprise GenAI platform.
The role involves creating reference architectures, guardrails, and roadmaps to deliver scalable and secure AI solutions—including LLMs, agentic applications, and tool integrations—across multiple clouds. By collaborating with engineering, security, data governance, and product teams, the Architect will translate business goals into platform designs and actionable delivery plans, while mentoring engineers and advocating for best practices in LLMOps/ModelOps.
Responsibilities
- Design enterprise generative AI reference architectures, blueprints, and reusable patterns
- Define multi-cloud platform foundations addressing networking, identity, and secrets management
- Lead efforts in LLMOps/ModelOps, focusing on model evaluation, safety, observability, and rollout strategies
- Create frameworks and governance for agentic systems, including tool governance and the Model Context Protocol (MCP)
- Ensure systems comply with security, risk, and compliance standards, applying Responsible AI principles and PII controls
- Establish strategies for cost efficiency, reliability, and performance using capacity planning and FinOps techniques
- Improve developer experience through CI/CD pipelines, golden paths, templates, and Infrastructure as Code (IaC) techniques
- Collaborate with cross-functional teams to align system requirements with strategic objectives
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience
- Proven experience in architecture roles involving cloud, data, or production Generative AI/LLM systems
- Strong knowledge of at least one major cloud platform (Azure, AWS, or GCP) and Infrastructure as Code tools such as Terraform, Bicep, CDK, or CloudFormation
- Proficiency with containerization, orchestration, and API management (Docker, Kubernetes, API gateways/service meshes)
- Experience with CI/CD and release management for ML/LLM workloads (Jenkins, GitHub Actions, GitLab CI, Azure DevOps)
- Comprehensive expertise with LLM-based solutions, including deploying and operating LLM inference (e.g., vLLM, Triton, TGI, Ray Serve, KServe/Seldon), LLM/app tracing and metrics (e.g., OpenTelemetry, Langfuse, Arize Phoenix, WhyLabs), and building evaluation pipelines (offline/online, regression suites)
- Ability to design and oversee data and retrieval pipelines (embedding generation, indexing/refresh strategies, vector DBs such as Pinecone, Weaviate, Milvus, FAISS, and relevance monitoring)
- Experience architecting secure, scalable agentic systems, including multi-agent workflows (e.g., LangGraph, CrewAI, AutoGen), state management, retries, rate limits, tool-failure handling, step-level auditing, and integration with external tools via the Model Context Protocol (MCP) or similar standards
- Proven track record of implementing security and compliance guardrails: secrets isolation, tool/API permissions, prompt-injection defenses, data leakage prevention, PII redaction, policy enforcement, and operation of tool registries
- Advanced proficiency in English (B2+/C1)