We are seeking a Senior Site Reliability Engineer (SRE) to join our team, which operates a core identity and profile management platform that enables personalized experiences across digital touchpoints such as web, mobile, and marketing channels.
Responsibilities
- Ensure system reliability, availability, and performance
- Automate infrastructure, deployment, and operations
- Implement monitoring, logging, and observability tools
- Define and track SLOs, SLIs, and error budgets
- Lead incident response, root cause analysis, and post-mortems
- Drive scalability and capacity planning initiatives
- Manage CI/CD pipelines for efficient software releases
- Optimize operational costs while maintaining performance
- Ensure security and compliance standards are met
- Document processes and provide training and mentorship
Requirements
- 3+ years of experience in SRE, DevOps, or related engineering roles
- Expertise in AWS, including serverless services, IAM, DynamoDB, and networking
- Hands-on experience with Terraform and AWS CDK (TypeScript) for infrastructure automation
- Proficiency with CI/CD tools such as Jenkins and GitHub Actions
- Understanding of observability tools including Prometheus, Grafana, and OpenSearch
- Familiarity with Kubernetes deployments and Helm charts
- Experience supporting 24/7 on-call operations
Nice to have
- Experience creating CI/CD pipelines with GitHub Actions