Project description
You will join a core engineering team at one of the leading U.S. video-content providers, supporting hundreds of distributed software engineering teams and operating across large-scale infrastructure that spans internal data centers, broadcast facilities, and the public cloud (AWS). The project focuses on building agentic retrieval workflows, data pipelines, and ML/LLM-driven scoring systems that improve engineering productivity and code-quality metrics. You will design retrieval agents, optimize LLM prompting strategies, build robust orchestration, and integrate backend services that process PR, commit, and code-quality metadata at scale.
Responsibilities
SKILLS
Must have
- 5-10+ years as a software/data/ML engineer.
- Strong Python expertise and familiarity with modern data/ML tooling.
- Proven delivery of production-grade orchestration systems (Dagster/Airflow) on AWS.
- Practical experience with prompt/LLM optimization (DSPy preferred).
- Comfortable working with Git/PR metadata, code diffs, backfills, and performance-critical data pipelines.
Nice to have
- Experience with code-scoring/effort or quality metrics; SonarQube integration.
- Graph-style linking across artifacts (Jira/PR/commit/branch) for epic/work-type classification.
- Experience tuning performance/cost for LLM calls (OSS LLM evals, batching, caching).
- Observability for pipelines (metrics/traces/logs) and data-quality alerting.