About the Position
We are looking for a Senior Data Architect who will design and govern a unified relational, graph, and vector data ecosystem. You will work closely with the client’s data team to ensure data is structured, available, and optimized for advanced AI and search use cases. This role focuses on building scalable ingestion pipelines, enabling real-time data synchronization, and maintaining high data quality across the full platform.
About the Project
This project focuses on building a next-generation data foundation that combines relational storage, graph relationships, and vector embeddings to enable intelligent search and AI-driven insights. The platform ensures that incoming expert content is instantly processed and made query-ready within strict service-level expectations.
Responsibilities
- Design a unified data architecture that combines relational, graph, and vector data models
- Define and implement data governance frameworks including data lineage, data quality, and versioning
- Design and establish scalable ingestion pipelines to support near-real-time data processing and synchronization
- Develop change data capture strategies and define triggers for re-vectorization and graph updates
- Ensure data consistency and integrity across relational, graph, and vector components
- Translate business and analytical requirements into data models, mappings, and processing logic
- Configure identity and access management and enforce encryption standards for secure data handling
- Collaborate with engineering teams to ensure scalability, reliability, and observability of data pipelines
- Support monitoring and validation processes to ensure production-ready data outputs
Requirements
- Experience designing data platforms, not only data pipelines
- Strong background in data governance, data lineage, and data quality practices
- Experience with real-time data processing and synchronization, including change data capture
- Experience working with graph architectures and/or vector-based data systems
- Proficiency in SQL and Python for data processing and pipeline development
- Experience with Google Cloud Platform services such as BigQuery, Cloud Spanner, Pub/Sub, and Dataflow
- Understanding of data security practices including IAM configuration and encryption standards
- Ability to translate business requirements into technical data architecture and solutions
- Experience collaborating with cross-functional technical teams and supporting active development environments
Nice to Have
- Experience with vector databases and embedding lifecycle management
- Familiarity with graph data modeling and graph query languages
- Experience with Apache Beam or similar distributed processing frameworks
- Exposure to data platforms driven by AI or machine learning
- Experience with observability tools for monitoring pipeline performance and reliability
Technologies
Google Cloud Platform, BigQuery, Cloud Spanner, Pub/Sub, Dataflow, Vertex AI, vector databases, graph databases, Python, SQL, Apache Beam, IAM, encryption standards