Project description
Our partner, a top-tier hedge fund, keeps expanding its AI workflows. We are looking for a platform infrastructure engineer to design, build, and operate cloud-native infrastructure for deploying AI workloads safely and reliably at scale.
Responsibilities
- Design container architectures for security-sensitive and multi-tenant environments
- Implement network policies and access controls for workload isolation
- Build CI/CD pipelines for reliable deployment
- Establish infrastructure-as-code practices for reproducibility and auditability
- Work closely with AI engineers while keeping the primary focus on the platform itself: making it secure, scalable, and operationally excellent
Skills
Must have
- Deep hands-on experience with Kubernetes (EKS preferred) — cluster design, networking (CNI, service mesh), RBAC, and security policies
- Strong AWS expertise — VPC architecture, IAM, ECS/EKS, networking, and security groups
- Proficiency with Infrastructure-as-Code tools such as Terraform, CloudFormation, or Pulumi
- Experience designing container architectures for multi-tenant or security-sensitive workloads
- Familiarity with GitOps workflows and CI/CD platforms (ArgoCD, GitHub Actions, Jenkins)
- BS plus 5 years, or MS plus 3 years, of experience in a platform/infrastructure/DevOps engineering role
- Python and/or Go for tooling and automation
- Up to date with recent advances in cloud-native infrastructure, container security, and AI workload deployment
Nice to have
- Familiarity with AI infrastructure: agent orchestration frameworks, MCP
- Experience in the financial domain: regulated environments, compliance-aware infrastructure
- Experience with container orchestration platforms for autonomous or long-running AI workloads
- Prior experience securing production AI agent workloads at scale