About the Position
We are looking for a Mid-level to Senior Data Engineer with a strong background in building scalable data solutions and enabling AI-driven capabilities. This role combines deep expertise in data engineering with practical experience in integrating and supporting AI technologies.
The estimated salary for this position is up to USD 150,000 per year.
The position requires onsite attendance three days per week at the client’s location in New York City (NY), Iselin (NJ), or Charlotte (NC).
About the Project
The project involves migrating production data pipelines from legacy environments into a modern, cloud‑native data platform. The new platform enables domain‑oriented data products, scalable analytics, and embedded governance, with AI‑based tools supporting data quality, anomaly detection, privacy, and compliance.
Responsibilities
- Design and develop scalable ETL and ELT data pipelines
- Build and maintain data orchestration workflows using Apache Airflow or similar tools
- Collaborate with AI engineers to integrate LLMs into data‑driven applications
- Develop RAG pipelines using embeddings and vector‑based search
- Optimize Snowflake data models for performance and cost efficiency
- Contribute to cloud‑native application design and deployment
- Support integration or development of MCP servers where applicable
- Collaborate closely with product, data, and platform teams
Requirements
- 4+ years (mid-level) or 10+ years (senior) of professional experience in software or data engineering
- Strong experience building production-grade data pipelines
- Hands-on experience with Apache Airflow or similar orchestration tools
- Solid experience with Snowflake, including data modeling and performance tuning
- Advanced SQL skills and working knowledge of NoSQL databases
- Strong Python development experience
- Experience working in cloud environments (Azure, AWS, or GCP)
Nice to Have
- Hands-on experience with large language models (LLMs)
- Experience with retrieval-augmented generation (RAG) patterns
- Experience with embeddings and vector databases
- Experience using Streamlit or similar tools for GenAI interfaces
- Exposure to MCP server development or integration
Technologies
Python, SQL, NoSQL, Snowflake, Airflow, LLM/RAG, Flask, Streamlit, Azure/AWS/GCP.