We are seeking a skilled Generative AI Platforms Architect to lead and define the architecture for our enterprise GenAI platform.
The role involves creating reference architectures, guardrails, and roadmaps to deliver scalable and secure AI solutions—including LLMs, agentic applications, and tool integrations—across multiple clouds. By collaborating with engineering, security, data governance, and product teams, the Architect will translate business goals into platform designs and actionable delivery plans, while mentoring engineers and advocating for best practices in LLMOps/ModelOps.
Responsibilities
- Design enterprise generative AI reference architectures, blueprints, and reusable patterns
- Define multi-cloud platform foundations addressing networking, identity, and secrets management
- Lead efforts in LLMOps/ModelOps, focusing on model evaluation, safety, observability, and rollout strategies
- Create frameworks and governance for agentic systems, including tool governance and the Model Context Protocol (MCP)
- Ensure systems comply with security, risk, and compliance standards, applying Responsible AI principles and PII controls
- Establish strategies for cost efficiency, reliability, and performance using capacity planning and FinOps techniques
- Improve developer experience through CI/CD pipelines, golden paths, templates, and Infrastructure as Code (IaC) techniques
- Collaborate with cross-functional teams to align system requirements with strategic objectives
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience
- Proven experience in architecture roles involving cloud, data, or production Generative AI/LLM systems
- Strong knowledge of at least one major cloud platform (Azure, AWS, or GCP) and Infrastructure as Code tools such as Terraform, Bicep, CDK, or CloudFormation
- Proficiency with containerization, orchestration, and API management (Docker, Kubernetes, API gateways/service meshes)
- Experience with CI/CD and release management for ML/LLM workloads (Jenkins, GitHub Actions, GitLab CI, Azure DevOps)
- Comprehensive expertise with LLM-based solutions, including deploying and operating LLM inference (e.g., vLLM, Triton, TGI, Ray Serve, KServe/Seldon), LLM/app tracing and metrics (e.g., OpenTelemetry, Langfuse, Arize Phoenix, WhyLabs), and building evaluation pipelines (offline/online, regression suites)
- Ability to design and oversee data and retrieval pipelines (embedding generation, indexing/refresh strategies, vector DBs such as Pinecone, Weaviate, Milvus, FAISS, and relevance monitoring)
- Experience architecting secure, scalable agentic systems, including multi-agent workflows (e.g., LangGraph, CrewAI, AutoGen), state management, retries, rate limits, tool-failure handling, step-level auditing, and integration with external tools via the Model Context Protocol (MCP) or similar standards
- Proven track record of implementing security and compliance guardrails: secrets isolation, tool/API permissions, prompt-injection defenses, data leakage prevention, PII redaction, policy enforcement, and operation of tool registries
- Advanced proficiency in English (B2+/C1)