We are seeking a Python Software Engineer to join our Production Data & Model Services team. In this role, you will build and operate production-grade Python applications, transform data science prototypes into deployable services and collaborate with platform teams to deliver robust data pipelines and APIs.
Responsibilities
- Build and run production-grade Python applications (APIs and batch jobs) with strong SDLC practices including code reviews, testing, CI/CD, observability and documentation
- Develop robust data pipelines (batch and near-real-time) that read from and write to governed storage using Parquet or other columnar formats and approved access patterns
- Transform quant and data science prototypes into deployable packages/services (typed, modular, versioned)
- Expose scoring and analytics via APIs or scheduled jobs rather than notebook-only deliverables
- Collaborate with platform teams on Databricks/Spark connectivity
- Optimize PySpark workloads when needed
- Ensure release discipline through Git workflows, automated tests and code reviews
Requirements
- 3+ years of strong Python engineering experience including packaging (wheels/pyproject), typing and clean architecture
- Proficiency in error handling and performance-oriented development
- Proven production SDLC background with Git workflows, automated tests and CI/CD
- Expertise using pandas and NumPy in production pipelines
- Familiarity with data formats like Parquet and governed data access patterns
- Experience building and operating APIs/services using FastAPI, Flask or similar frameworks
- Competency working in governed platform environments such as Databricks or containerized dev platforms
Nice to have
- Experience using scikit-learn for production feature and scoring pipelines, including reproducible transforms and model packaging/versioning
- Background in PySpark and distributed processing
- Knowledge of IDE-to-Databricks workflows such as Databricks Connect