You will build the foundations every other AI engineer at the company depends on: the model gateway, the evaluation harness, the observability stack, the safety guardrails, and the internal accelerators that turn what would be a six-week setup into a one-day setup.
This is the role most firms forget to invest in, and they quietly pay for that omission in slow engagements, inconsistent quality, and engineers reinventing the same plumbing on every project. We are not making that mistake.
Responsibilities
- Design, build, and operate the shared platform that our AI engineering teams use across every customer engagement. This includes: a model gateway (multi-provider routing, fallback, cost and rate-limit controls), an evaluation harness, prompt and dataset versioning, observability and tracing for AI workloads, safety and policy guardrails, and a small set of internal accelerators (templates, libraries, scaffolding)
- Make the platform a force multiplier. Your work is successful when an engineer joining a new engagement can be in production with a credible AI feature in days, not weeks
- Own the platform as a product. Set roadmaps, gather feedback from internal users, write good docs, run office hours, and deprecate things that have stopped pulling their weight
- Set the standards for how we evaluate, monitor, and operate AI systems — and make those standards easy to follow because the platform makes the right thing the default
- Partner closely with the Learning & Evaluations function and with security, legal, and data-governance teams. Make the boring-but-essential things — audit trails, data lineage, tenancy, secret handling — work the same way every time
- Help set and uphold the hiring bar for platform engineers across the practice
Requirements
- Substantial experience building and operating platforms that other engineers depend on — internal developer platforms, ML platforms, data platforms, or similar
- Hands-on familiarity with the AI infrastructure problem space: model providers, vector databases, embedding models, agent frameworks, evaluation tools, prompt management, observability for LLM workloads
- Strong systems engineering: distributed systems, latency/cost tradeoffs, reliable service design, infrastructure-as-code, CI/CD
- A product mindset. You measure success by adoption and by the time-to-first-value of internal users — not by how clever the architecture is
- Excellent written communication. You will write the docs, the design proposals, and the post-mortems that the rest of the practice depends on
- Pragmatism. You build the simplest thing that will work, and you design it so that it can be replaced later
Nice to have
- Experience with one or more of: Kubernetes, Terraform, Pulumi, Temporal, observability stacks (OpenTelemetry, Datadog, Honeycomb), policy engines (OPA), secrets management
- Prior experience as the founding platform engineer on a fast-growing AI team
- Background in security, compliance, or data governance for AI systems
- Open-source contributions, especially to AI infrastructure projects