As a Site Reliability Engineer, you will work closely with the engineering teams to improve the product's observability, reliability, and scalability.
Key Responsibilities:
- Work closely with the engineering and architecture teams to identify resilience gaps, then build and execute a roadmap to resolve them
- Develop a fully automated GitOps-based CI/CD process and onboard new microservices to it
- Implement SLIs/SLOs for Kubernetes microservices and build the process for tracking them. Identify observability gaps and execute a roadmap to close them
- Respond to production issues as an on-call engineer, participate in the RCA process, and write runbooks and automation to mitigate possible issues in the future
- Develop, test, execute & support disaster recovery plans for mission-critical services and sub-systems
- Plan capacity and optimize cloud infrastructure costs
- Implement security & compliance requirements
Requirements:
- 3+ years of technical experience in the same or similar role supporting large-scale and high-load production systems
- Experience in the development and support of public cloud infrastructure
- Hands-on experience running HA applications and developing CI/CD processes in Kubernetes
- Proven programming skills in Python, Go, or a similar language
- Good knowledge of the Linux environment, TCP/IP, network routing, and DNS
- Familiarity with SRE principles, DevOps practices, and the modern cloud-native landscape
- Accuracy, attention to detail, and the ability to follow processes
- Good communication skills, with English at an intermediate level or above
Pluses:
- Experience working with contact centers and VoIP solutions
- Ability to read and troubleshoot Java code if needed
- Experience with SQL/NoSQL databases, or a willingness to develop skills in this field
We offer:
- Well-coordinated professional team
- Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and for professional and career growth
- Additional health and life insurance package
- Employee Assistance Program
- 25 vacation days