We are looking for a capable Senior/Lead Machine Learning Engineer to join our remote team. The chosen candidate will play a key role in designing, developing, and managing our ML pipeline in line with industry best practices.
In this position, you will build, deploy, maintain, troubleshoot, and improve components of the ML pipeline, and lead the design and deployment of ML prediction endpoints. Partnering with System Engineers to establish the ML lifecycle management framework and advancing coding practices will also be critical.
We welcome innovative individuals to become part of our dynamic team!
Responsibilities
- Contribute to the design, development, and management of an ML pipeline aligned with best practices
- Develop, deploy, maintain, troubleshoot, and improve ML pipeline components
- Lead the design and deployment of ML prediction endpoints
- Collaborate with System Engineers to create the ML lifecycle management framework
- Write specifications, documentation, and user guides for applications
- Improve coding practices and organize repositories within the scientific workflow
- Set up pipelines for different projects
- Identify technical risks and inconsistencies, proposing mitigation strategies
- Work with data scientists to operationalize predictive models, ensuring a clear understanding of each model's objectives, and develop scalable data preparation pipelines
Requirements
- 5+ years of programming experience, with a focus on Python and strong SQL knowledge
- Proficiency in MLOps tools and frameworks (e.g., Amazon SageMaker, Google Vertex AI, Azure ML)
- Background in Data Science, Data Engineering, and DevOps Engineering at an intermediate level
- Evidence of delivering at least one project in an MLE capacity
- Expertise in engineering best practices
- Skills in utilizing the Apache Spark Ecosystem (Spark SQL, MLlib/SparkML) or equivalent technologies for Data Products
- Familiarity with Big Data technologies (e.g., Hadoop, Spark, Kafka, Cassandra, GCP BigQuery, AWS Redshift, Apache Beam, etc.)
- Proficiency with automated data pipeline and workflow management tools such as Airflow or Argo Workflows
- Understanding of various data processing paradigms, including batch, micro-batch, and streaming
- Experience with at least one major Cloud Provider, such as AWS, GCP, or Azure
- Production familiarity with integrating ML models into complex, data-intensive systems
- Knowledge of data science technologies such as TensorFlow, PyTorch, XGBoost, NumPy, SciPy, scikit-learn, Pandas, Keras, spaCy, and Hugging Face Transformers
- Competency in working with multiple database types, including Relational, NoSQL, Graph, Document, Columnar, Time Series, etc.
Nice to have
- Background in Databricks and MLOps-related tools and technologies, including MLflow, Kubeflow, and TensorFlow Extended (TFX)
- Proficiency in performance testing tools, such as JMeter or LoadRunner
- Understanding of containerization technologies like Docker