We are seeking a talented, proactive Senior AI Platform Backend Engineer (LLM) to join our team in Spain and lead the design, optimization, and deployment of machine learning pipelines using MLOps practices in cloud environments. In this role, you will implement LLM-based solutions for chatbots and Retrieval-Augmented Generation (RAG) systems and build robust DevOps/MLOps pipelines for production.
This position offers a flexible work setup: fully remote, or a hybrid arrangement with occasional office visits.
Responsibilities
- Maintain and enhance CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, Jenkins or ArgoCD
- Design and develop backend architecture for AI Verification and ChatGPT services utilizing Python and FastAPI
- Build, optimize and scale classifiers and tools leveraging machine learning, encoders and rule-based models
- Architect and implement solutions following Domain-Driven Design (DDD) and Test-Driven Development (TDD) best practices
- Design, develop, and maintain a production-grade LLM-as-a-judge service that verifies AI-generated content against source documents, leveraging frameworks and tools such as Hugging Face Transformers, spaCy, NLTK, and BM25
- Build and maintain high-throughput Retrieval-Augmented Generation (RAG) services, including ingestion pipelines and message brokers
- Perform prompt engineering, including techniques such as Chain-of-Thought and Few-Shot prompting, across LLMs from various providers (OpenAI, Anthropic, Google, etc.)
- Provide hands-on expertise and support for one or more leading AI frameworks and models (TensorFlow, Keras, PyTorch, BERT, etc.)
- Demonstrate technical leadership in at least one AI specialization, such as graph recommendation systems, deep learning or natural language processing
Requirements
- Strong proficiency in Python and experience developing backend services with FastAPI
- Experience building and maintaining CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, Jenkins or ArgoCD
- Hands-on experience designing and optimizing scalable machine learning models, including classifiers, encoders and rule-based systems
- Experience working with large language models (LLMs) and with frameworks and tools such as Hugging Face Transformers, spaCy, NLTK, and BM25
- Proven ability to design, develop, and maintain high-throughput Retrieval-Augmented Generation (RAG) services, including ingestion pipelines and message brokers